{"id":641181,"date":"2020-03-05T14:24:43","date_gmt":"2020-03-05T22:24:43","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-blog-post&p=641181"},"modified":"2020-03-11T08:24:35","modified_gmt":"2020-03-11T15:24:35","slug":"multi-sense-network-representation-learning-in-microsoft-academic-graph","status":"publish","type":"msr-blog-post","link":"https:\/\/www.microsoft.com\/en-us\/research\/articles\/multi-sense-network-representation-learning-in-microsoft-academic-graph\/","title":{"rendered":"Multi-Sense Network Representation Learning in Microsoft Academic Graph"},"content":{"rendered":"
Over the past few years, deep representation learning has revolutionized development across a range of domains, including natural language processing (NLP), computer vision, and speech. In the NLP domain, for example, representation learning aims to learn contextual embeddings for tokens/words such that "words that occur in the same contexts tend to have similar meanings", the distributional hypothesis first proposed by Harris in 1954. The representation learning idea has also been extended to networks, where vertices that share the same structural contexts tend to be similar.

Existing representation learning techniques use only one embedding vector per token/node, even though a token or node may have different meanings in different contexts. This fundamental issue creates the need for more complicated models, such as ELMo and Transformers, that try to recapture the contextual information for each customized context, because a single vector is not enough to capture the contextual differences in either natural language or network structures. The issue gets worse when the network structures are organized in a heterogeneous way, which is the nature of the Microsoft Academic Graph (MAG): its structural contexts are naturally diverse across different types of entities and their relationships.

For additional context, it is important to review how representation learning has shaped network mining and to demonstrate why one embedding vector is not enough to model the different structural contexts in MAG. The traditional paradigm of mining and learning with networks usually begins with the discovery of the networks' structural properties. With these structural properties extracted as features, machine learning algorithms can be applied to a variety of applications. Often, however, characterizing these features requires domain knowledge and expensive computation. The emergence of representation learning on networks offers a new perspective on this issue: discrete, structural symbols are translated into continuous representations, such as low-dimensional vectors, that computers can "understand" and process algebraically.

The Microsoft Academic Graph (MAG) is a prime example of a network that can benefit from these recent advances in network representation learning. To illustrate, imagine two scholars who both work extensively on machine learning. One publishes all of their papers at the ICML conference, and the other publishes exclusively at the NeurIPS conference. Intuitively, these two scholars are very similar in light of the strong similarity between ICML and NeurIPS. In the discrete space, however, they have never published at the same venue, so their similarity is zero, which is quite counter-intuitive. This issue can be addressed by computing the similarity between their representations in the latent continuous space.
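To make the intuition concrete, here is a minimal sketch of learning such latent representations and comparing the two scholars in them. It assumes a DeepWalk-style approach (truncated random walks fed to a skip-gram model via gensim's Word2Vec); the toy graph and all entity names (scholar_a, icml, machine_learning, and so on) are invented for illustration and are not MAG identifiers or NSP code.

```python
import random

from gensim.models import Word2Vec

# Toy untyped graph loosely mimicking MAG's structure: scholars write papers,
# papers appear at venues, and both venues share a field of study. The two
# scholars never publish at the same venue. (Illustrative names only.)
graph = {
    "scholar_a": ["paper_a1", "paper_a2"],
    "scholar_b": ["paper_b1", "paper_b2"],
    "paper_a1": ["scholar_a", "icml"],
    "paper_a2": ["scholar_a", "icml"],
    "paper_b1": ["scholar_b", "neurips"],
    "paper_b2": ["scholar_b", "neurips"],
    "icml": ["paper_a1", "paper_a2", "machine_learning"],
    "neurips": ["paper_b1", "paper_b2", "machine_learning"],
    "machine_learning": ["icml", "neurips"],
}

def random_walks(graph, walks_per_node=100, walk_length=10, seed=7):
    """Generate DeepWalk-style truncated random walks over the graph."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in graph:
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks

# Treat each walk as a "sentence" and learn skip-gram embeddings, just as
# word embeddings are learned from word co-occurrence in text.
model = Word2Vec(sentences=random_walks(graph), vector_size=16, window=3,
                 min_count=1, sg=1, epochs=10, seed=7)

# Cosine similarity in the latent continuous space: high, even though the
# scholars' discrete venue sets do not overlap at all.
print(model.wv.similarity("scholar_a", "scholar_b"))
```

Because the random walks place both scholars in overlapping structural contexts (their venues are tied together through the shared field of study), their learned vectors end up close, recovering the intuitive similarity that the discrete representation misses.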
Learning representations for MAG is more complex because MAG is a heterogeneous network: it consists of different types of entities (publications, authors, venues, affiliations, and fields of study) connected by various types of relationships (the publication relation between papers and authors, the citation relation between papers, and so on). The heterogeneous network of MAG is illustrated on the left of the figure below, and its five types of meta relations are introduced on the right:

[Figure: the heterogeneous network structure of MAG (left) and its five types of meta relations (right)]

The premise of network representation learning is to map network structures into a latent continuous space such that the structural relations between entities can be embedded. In heterogeneous networks, there exist various structural relations corresponding to different semantic similarities. For example, the two scholars mentioned earlier are similar to each other in the sense of their publication venues. Their similarity can also be measured through other senses, such as scientific collaborations and research topics, and, because MAG is strongly connected, through combinations of all other senses.

The core question we must answer here is how to define and encode these different senses of similarity in MAG. To address it, we produce multi-sense network similarities for MAG, each of which corresponds to one semantic sense in the academic domain. The general idea is to project the heterogeneous structure of MAG into homogeneous structures according to different semantic senses and to learn entity representations for each of them.

We are happy to announce that users can now access these multi-sense MAG network embeddings and similarity computation functions through the Network Similarity Package (NSP), an optional utility available as part of the larger MAG package. Note that NSP is not included in the basic MAG distribution and must be specifically requested when signing up to receive MAG.

The senses of entity embeddings that are currently available in NSP include: