{"id":171259,"date":"2014-01-07T17:35:54","date_gmt":"2014-01-07T17:35:54","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/project\/synonym-mining\/"},"modified":"2018-07-19T11:58:46","modified_gmt":"2018-07-19T18:58:46","slug":"synonym-mining","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/synonym-mining\/","title":{"rendered":"Synonym Mining"},"content":{"rendered":"

The same entity is often referred to in a variety of ways. For example, the camera Canon 600d is also known as “canon rebel t3i”, the celebrity Jennifer Lopez as “jlo”, and Seattle-Tacoma International Airport as “sea tac”. These alternative names are known as synonyms. Without knowledge of synonyms, many applications such as e-commerce search will fail to return relevant results. We leverage the data assets amassed by Bing to mine such synonyms automatically.<\/p>\n

\n

One of our main insights is that Bing’s query log can be used to mine synonyms. However, simple techniques such as co-click frequencies are not adequate, so we developed new features such as pseudo-document similarity and context similarity. We also leverage other data sources, including web lists, web tables and certain text patterns, to mine synonyms.<\/p>\n

Impact<\/h2>\n

Our synonym research has had a tremendous impact on several Microsoft products and services over the years:<\/p>\n