{"id":641181,"date":"2020-03-05T14:24:43","date_gmt":"2020-03-05T22:24:43","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-blog-post&p=641181"},"modified":"2020-03-11T08:24:35","modified_gmt":"2020-03-11T15:24:35","slug":"multi-sense-network-representation-learning-in-microsoft-academic-graph","status":"publish","type":"msr-blog-post","link":"https:\/\/www.microsoft.com\/en-us\/research\/articles\/multi-sense-network-representation-learning-in-microsoft-academic-graph\/","title":{"rendered":"Multi-Sense Network Representation Learning in Microsoft Academic Graph"},"content":{"rendered":"
Over the past few years, deep representation learning has revolutionized development across a range of domains, including natural language processing (NLP), computer vision, and speech. In the NLP domain, for example, representation learning aims to learn contextual embeddings for tokens/words such that "words that occur in the same contexts tend to have similar meanings", the distributional hypothesis first proposed by Harris in 1954. The representation learning idea has also been extended to networks, where vertices that share the same structural contexts tend to be similar.

Existing representation learning techniques use only one embedding vector per token/node, even though a token or node may have different meanings in different contexts. This fundamental issue creates the need for more complicated models, such as ELMo and Transformers, that try to recapture the contextual information for each customized context, because a single vector is not enough to capture the contextual differences in either natural language or network structures. The issue gets worse when the network structures are organized in a heterogeneous way, which is the nature of the Microsoft Academic Graph (MAG): its structural contexts are naturally diverse across different types of entities and their relationships.

For additional context, it is important to review how representation learning has shaped network mining and to demonstrate why one embedding vector is not enough to model the different structural contexts in MAG. The traditional paradigm of mining and learning with networks usually begins with the discovery of the networks' structural properties. With these structural properties extracted as features, machine learning algorithms can be applied to a variety of applications. Often, however, characterizing these features requires domain knowledge and expensive computation. The emergence of representation learning on networks offers a new perspective on this issue: discrete, structural symbols are translated into continuous representations, such as low-dimensional vectors, that computers can "understand" and process algebraically.

The Microsoft Academic Graph (MAG) is a prime example of a network that can benefit from these recent advances in network representation learning. To illustrate, imagine two scholars who both work extensively on machine learning. One publishes all of their papers at the ICML conference, and the other publishes exclusively at the NeurIPS conference. Intuitively, these two scholars are very similar in light of the strong similarity between ICML and NeurIPS. In the discrete space, however, they have never published at the same venue, so their similarity is zero, which is quite counter-intuitive. This issue can be addressed by computing the similarity between their representations in the latent continuous space.
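To make the intuition concrete, here is a minimal sketch of learning such latent representations and comparing the two scholars in them. It assumes a DeepWalk-style approach (truncated random walks fed to a skip-gram model via gensim's Word2Vec); the toy graph and all entity names (scholar_a, icml, machine_learning, and so on) are invented for illustration and are not MAG identifiers or NSP code.

```python
import random

from gensim.models import Word2Vec

# Toy untyped graph loosely mimicking MAG's structure: scholars write papers,
# papers appear at venues, and both venues share a field of study. The two
# scholars never publish at the same venue. (Illustrative names only.)
graph = {
    "scholar_a": ["paper_a1", "paper_a2"],
    "scholar_b": ["paper_b1", "paper_b2"],
    "paper_a1": ["scholar_a", "icml"],
    "paper_a2": ["scholar_a", "icml"],
    "paper_b1": ["scholar_b", "neurips"],
    "paper_b2": ["scholar_b", "neurips"],
    "icml": ["paper_a1", "paper_a2", "machine_learning"],
    "neurips": ["paper_b1", "paper_b2", "machine_learning"],
    "machine_learning": ["icml", "neurips"],
}

def random_walks(graph, walks_per_node=100, walk_length=10, seed=7):
    """Generate DeepWalk-style truncated random walks over the graph."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in graph:
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks

# Treat each walk as a "sentence" and learn skip-gram embeddings, just as
# word embeddings are learned from word co-occurrence in text.
model = Word2Vec(sentences=random_walks(graph), vector_size=16, window=3,
                 min_count=1, sg=1, epochs=10, seed=7)

# Cosine similarity in the latent continuous space: high, even though the
# scholars' discrete venue sets do not overlap at all.
print(model.wv.similarity("scholar_a", "scholar_b"))
```

Because the random walks place both scholars in overlapping structural contexts (their venues are tied together through the shared field of study), their learned vectors end up close, recovering the intuitive similarity that the discrete representation misses.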
Learning representations for MAG is more complex because MAG is a heterogeneous network: it consists of different types of entities (publications, authors, venues, affiliations, and fields of study) connected by various types of relationships (the publication relation between papers and authors, the citation relation between papers, and so on). The heterogeneous network of MAG is illustrated on the left of the figure below, and its five types of meta relations are introduced on the right:

[Figure: the heterogeneous network structure of MAG (left) and its five types of meta relations (right)]

The premise of network representation learning is to map network structures into a latent continuous space such that the structural relations between entities can be embedded. In heterogeneous networks, there exist various structural relations corresponding to different semantic similarities. For example, the two scholars mentioned earlier are similar to each other in the sense of their publication venues. Their similarity can also be measured through other senses, such as scientific collaborations and research topics, and, because MAG is strongly connected, through combinations of all other senses.

The core question we must answer here is how to define and encode these different senses of similarity in MAG. To address it, we produce multi-sense network similarities for MAG, each of which corresponds to one semantic sense in the academic domain. The general idea is to project the heterogeneous structure of MAG into homogeneous structures according to different semantic senses and to learn entity representations for each of them.

We are happy to announce that users can now access these multi-sense MAG network embeddings and similarity computation functions through the Network Similarity Package (NSP), an optional utility available as part of the larger MAG package. Note that NSP is not included in the basic MAG distribution and must be specifically requested when signing up to receive MAG.

The senses of entity embeddings that are currently available in NSP include: