Embedding Imputation with Grounded Language Information

  • Ziyi Yang,
  • Chenguang Zhu,
  • Vin Sachidananda,
  • Eric Darve

The 57th Annual Meeting of the Association for Computational Linguistics (ACL)

Due to the ubiquitous use of embeddings as input representations for a wide range of natural language tasks, imputation of embeddings for rare and unseen words is a critical problem in language processing. Embedding imputation involves learning representations for words that are rare or unseen during the training of an embedding model, often in a post-hoc manner. In this paper, we propose an approach for embedding imputation which uses grounded information in the form of a knowledge graph. This is in contrast to existing approaches, which typically make use of vector space properties or subword information. We propose an online method to construct a graph from grounded information and design an algorithm to map from the resulting graphical structure to the space of the pre-trained embeddings. Finally, we evaluate our approach on a range of rare and unseen word tasks across various domains and show that our model can learn better representations. For example, it improves Pearson's and Spearman's correlation coefficients on the Card-660 task by 7.7% and 6.7%, respectively.
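To make the core idea concrete, below is a minimal, hypothetical sketch of graph-based embedding imputation: an out-of-vocabulary word's vector is estimated by averaging the pre-trained embeddings of its knowledge-graph neighbors. This is a deliberate simplification for illustration only, not the authors' actual algorithm (the graph construction and the learned mapping from graph structure to embedding space described in the paper are more involved); the toy vectors, the `kg_neighbors` table, and the `impute` helper are all assumptions introduced here.

```python
import numpy as np

# Toy pre-trained embeddings (in practice these would come from a model
# such as GloVe or fastText); values are made up for illustration.
pretrained = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.1]),
    "pet": np.array([0.7, 0.3, 0.2]),
}

# Hypothetical knowledge-graph neighbors of an out-of-vocabulary word,
# e.g. edges harvested from a grounded resource such as ConceptNet.
kg_neighbors = {
    "puppy": ["dog", "pet"],
}

def impute(word, pretrained, kg_neighbors):
    """Impute an embedding for `word` by averaging the pre-trained
    vectors of its in-vocabulary knowledge-graph neighbors."""
    vecs = [pretrained[n] for n in kg_neighbors.get(word, []) if n in pretrained]
    if not vecs:
        raise KeyError(f"no in-vocabulary neighbors for {word!r}")
    return np.mean(vecs, axis=0)

print(impute("puppy", pretrained, kg_neighbors))
# -> [0.75 0.25 0.15]
```

A plain neighbor average like this ignores edge weights and multi-hop structure; the appeal of the grounded-graph approach is precisely that a learned mapping over the graph can exploit that richer structure rather than relying on vector-space heuristics or subword statistics alone.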