{"id":624426,"date":"2019-12-04T08:59:15","date_gmt":"2019-12-04T16:59:15","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=624426"},"modified":"2019-12-04T09:06:13","modified_gmt":"2019-12-04T17:06:13","slug":"metalearned-neural-memory-teaching-neural-networks-how-to-remember","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/metalearned-neural-memory-teaching-neural-networks-how-to-remember\/","title":{"rendered":"Metalearned Neural Memory: Teaching neural networks how to remember"},"content":{"rendered":"
(opens in new tab)<\/span><\/a>Memory is an important part of human intelligence and the human experience. It grounds us in the current moment, helping us understand where we are and, consequently, what we should do next. Consider the simple example of reading a book. The ultimate goal is to understand the story, and memory is the reason we\u2019re able to do so. Memory allows us to efficiently store the information we encounter and later recall the details we\u2019ve previously read, whether that be moments earlier or weeks, to piece together the full narrative. Memory is equally important in deep learning, especially when the goal is to create models with advanced capabilities. In the fields of natural language understanding and processing, for example, memory is crucial for modeling long-term dependencies and building representations of partially observable states.<\/p>\n In a paper (opens in new tab)<\/span><\/a> published at the 33rd Conference on Neural Information Processing Systems (NeurIPS) (opens in new tab)<\/span><\/a>, we demonstrate how to use a deep neural network itself as a memory mechanism. We propose a new model, Metalearned Neural Memory (MNM), in which we store data in the parameters of a deep network and use the function defined by that network to recall the data.<\/p>\n Deep networks\u2014powerful and flexible function approximators capable of generalizing from training data or memorizing it\u2014have seen limited use as memory modules, as writing information into network parameters is slow. Deep networks require abundant data and many steps of gradient descent to learn. Fortunately, recent progress in few-shot learning and metalearning has shown how we might overcome this challenge. Methods from these fields can discover update procedures that optimize neural parameters from many fewer examples than standard stochastic gradient descent. It\u2019s through metalearning techniques that MNM learns to remember: It learns how to read from and write to memory, as opposed to using hard-coded read\/write operations like most existing computational memory mechanisms.<\/span><\/p>\n