{"id":584920,"date":"2019-05-08T09:59:17","date_gmt":"2019-05-08T16:59:17","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=584920"},"modified":"2019-06-26T14:16:42","modified_gmt":"2019-06-26T21:16:42","slug":"spacefusion-structuring-the-unstructured-latent-space-for-conversational-ai","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/spacefusion-structuring-the-unstructured-latent-space-for-conversational-ai\/","title":{"rendered":"SpaceFusion: Structuring the unstructured latent space for conversational AI"},"content":{"rendered":"

A palette makes it easy for painters to arrange and mix paints of different colors as they create art on the canvas before them. Having a similar tool that could allow AI to jointly learn from diverse data sources such as those for conversations, narratives, images, and knowledge could open doors for researchers and scientists to develop AI systems capable of more general intelligence.<\/p>\n


A palette allows a painter to arrange and mix paints of different colors. SpaceFusion seeks to help AI scientists do similar things for different models trained on different datasets.<\/p><\/div>\n

In today's deep learning models, each dataset is typically encoded by its own neural network into vectors in a separate latent space. In the paper \u201cJointly Optimizing Diversity and Relevance in Neural Response Generation,\u201d my co-authors and I propose SpaceFusion, a learning paradigm to align these different latent spaces, arranging and mixing them smoothly like the paint on a palette, so AI can leverage the patterns and knowledge embedded in each of them. This work, which we\u2019re presenting at the 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), is part of the Data-Driven Conversation project, and an implementation of it is available on GitHub.<\/p>\n

Capturing the color of human conversation<\/h3>\n

As a first attempt, we applied this technique to neural conversational AI. In our setup, a neural model is expected to generate relevant and interesting responses given a conversation history, or context. While promising advances in neural conversation models have been made, these models tend to play it safe, producing generic and dull responses. Approaches have been developed to diversify these responses and better capture the color of human conversation, but there is often a tradeoff, with relevance declining as diversity increases.<\/p>\n


Figure 1: Just as a palette allows for the easy combination of paints, SpaceFusion aligns, or mixes, the latent spaces learned from a sequence-to-sequence (S2S, red dots) model and an autoencoder (AE, blue dots) so that the two models can be used jointly and more efficiently.<\/p><\/div>\n

SpaceFusion tackles this problem by aligning the latent spaces learned from two models (Figure 1):<\/p>\n