Embedding Imputation with Grounded Language Information
- Ziyi Yang,
- Chenguang Zhu,
- Vin Sachidananda,
- Eric Darve
2019 Meeting of the Association for Computational Linguistics
Published by the Association for Computational Linguistics
Due to the ubiquitous use of embeddings as input representations for a wide range of natural language tasks, imputation of embeddings for rare and unseen words is a critical problem in language processing. Embedding imputation involves learning representations for rare or unseen words, typically in a post-hoc manner after the embedding model has been trained. In this paper, we propose an approach to embedding imputation that uses grounded information in the form of a knowledge graph. This is in contrast to existing approaches, which typically rely on vector space properties or subword information. We propose an online method to construct a graph from grounded information and design an algorithm to map from the resulting graph structure to the space of the pre-trained embeddings. Finally, we evaluate our approach on a range of rare and unseen word tasks across various domains and show that our model learns better representations. For example, on the Card-660 task our method improves upon the state-of-the-art Pearson's and Spearman's correlation coefficients by 11% and 17.8%, respectively, using GloVe embeddings.
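To make the core idea concrete, below is a minimal sketch of knowledge-graph-based embedding imputation. It is not the authors' algorithm (the paper learns a mapping from graph structure to the embedding space); this toy version simply averages the pre-trained vectors of an out-of-vocabulary word's knowledge-graph neighbors. The vocabulary, graph, and word `ocelot` are illustrative assumptions.

```python
import numpy as np

# Hypothetical pre-trained embedding table (e.g., GloVe vectors, here 3-d toys).
embeddings = {
    "cat":    np.array([0.20, 0.80, 0.10]),
    "feline": np.array([0.25, 0.75, 0.15]),
    "pet":    np.array([0.30, 0.60, 0.20]),
}

# Hypothetical knowledge graph as an adjacency list: word -> related words.
knowledge_graph = {
    # "ocelot" is out-of-vocabulary; its graph neighbors are in-vocabulary.
    "ocelot": ["cat", "feline", "pet"],
}

def impute_embedding(word, embeddings, knowledge_graph):
    """Impute an OOV word's vector as the mean of its in-vocabulary
    knowledge-graph neighbors -- a simple stand-in for a learned
    graph-to-embedding-space mapping."""
    if word in embeddings:
        return embeddings[word]
    neighbors = [w for w in knowledge_graph.get(word, []) if w in embeddings]
    if not neighbors:
        raise KeyError(f"no in-vocabulary neighbors for {word!r}")
    return np.mean([embeddings[w] for w in neighbors], axis=0)

print(impute_embedding("ocelot", embeddings, knowledge_graph))
```

A learned mapping, as described in the abstract, would replace the unweighted mean with a trainable function of the graph structure, but the input/output contract is the same: graph context in, a vector in the pre-trained embedding space out.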