{"id":795461,"date":"2021-11-16T08:00:29","date_gmt":"2021-11-16T16:00:29","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=795461"},"modified":"2021-11-16T12:53:26","modified_gmt":"2021-11-16T20:53:26","slug":"research-talk-deepxml-a-deep-extreme-classification-framework-for-recommending-millions-of-items","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/research-talk-deepxml-a-deep-extreme-classification-framework-for-recommending-millions-of-items\/","title":{"rendered":"Research talk: DeepXML: A deep extreme classification framework for recommending hundreds of millions of items"},"content":{"rendered":"
Extreme classification provides a formulation for large-scale ranking and recommendation problems by treating each item to be ranked or recommended as a separate label in a multi-label classification problem. Scalability and accuracy are well-recognized challenges in deep extreme classification where the objective is to train feature architectures like BERT, GPT-3 jointly with the classifiers for items. This talk will introduce the DeepXML framework that addresses these challenges by decomposing the deep extreme multi-label learning task into four simpler sub-tasks, each of which can be trained accurately and efficiently. Choosing different components for the four sub-tasks allows DeepXML to generate a family of accurate and scalable algorithms geared towards different scenarios. In particular, algorithms derived from the DeepXML framework can be 10\u201330 percent more accurate and up to 3\u201397x faster to train than leading deep extreme classifiers on publicly available datasets. These algorithms can be efficiently trained on Bing datasets containing hundreds of millions of labels while making predictions for billions of users and data points per day on commodity hardware. This allowed DeepXML to yield significant gains in click-through rates, coverage, revenue, and other online metrics over state-of-the-art techniques currently in production for several Microsoft internal applications, ranging from matching user query to advertiser bid phrases to recommending retail products to showing personalized ads.<\/p>\n