Collective Tweet Wikification based on Semi-supervised Graph Regularization

  • Hongzhao Huang
  • Yunbo Cao
  • Xiaojiang Huang
  • Heng Ji

The 52nd Annual Meeting of the Association for Computational Linguistics

Published by Association for Computational Linguistics | Organized by Association for Computational Linguistics

Wikification for tweets aims to automatically identify each concept mention in a tweet and link it to a concept referent in a knowledge base (e.g., Wikipedia). Because tweets are short, a collective inference model that incorporates global evidence from multiple mentions and concepts is more appropriate than a non-collective approach that links one mention at a time. In addition, it is challenging to generate sufficient high-quality labeled data for supervised models at low cost. To tackle these challenges, we propose a novel semi-supervised graph regularization model that incorporates both local and global evidence from multiple tweets through three fine-grained relations. To identify semantically related mentions for collective inference, we detect meta path-based semantic relations through social networks. Compared to the state-of-the-art supervised model trained on 100% of the labeled data, our proposed approach achieves comparable performance with 31% of the labeled data and obtains a 5% absolute F1 gain with 50% of the labeled data.
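The general idea behind semi-supervised graph regularization can be illustrated with a minimal sketch. This is not the model proposed in the paper: it is a generic clamped label-propagation scheme, and the graph format, the function name `propagate_labels`, the node names, and the edge weights are all hypothetical choices made for illustration. Each node stands for a candidate (mention, concept) pair, a few nodes carry known scores, and unlabeled nodes repeatedly take the weighted average of their neighbors' scores.

```python
def propagate_labels(weights, labels, n_iter=200):
    """Smooth relevance scores over a weighted graph (generic sketch).

    weights: dict mapping node -> {neighbor: edge weight}; assumed symmetric.
    labels:  dict mapping labeled node -> score in [0, 1]; kept fixed (clamped).
    Returns a dict with a score for every node in `weights`.
    """
    # Unlabeled nodes start at a neutral score of 0.5 (an arbitrary choice).
    scores = {node: labels.get(node, 0.5) for node in weights}
    for _ in range(n_iter):
        updated = {}
        for node, neighbors in weights.items():
            if node in labels:
                # Labeled nodes keep their known score.
                updated[node] = labels[node]
                continue
            total = sum(neighbors.values())
            if total == 0:
                # Isolated nodes have no evidence to absorb.
                updated[node] = scores[node]
                continue
            # Weighted average of the neighbors' current scores.
            updated[node] = sum(w * scores[n] for n, w in neighbors.items()) / total
        scores = updated
    return scores


# Hypothetical three-node chain: "pair_b" is unlabeled and is tied more
# strongly to the positive example "pair_a" than to the negative "pair_c".
graph = {
    "pair_a": {"pair_b": 3.0},
    "pair_b": {"pair_a": 3.0, "pair_c": 1.0},
    "pair_c": {"pair_b": 1.0},
}
known = {"pair_a": 1.0, "pair_c": 0.0}
scores = propagate_labels(graph, known)
print(scores["pair_b"])  # 0.75
```

In this toy graph the unlabeled node settles at (3 × 1.0 + 1 × 0.0) / 4 = 0.75, which shows how a small amount of labeled data can influence related unlabeled pairs through the graph structure.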