{"id":552894,"date":"2018-11-30T07:56:26","date_gmt":"2018-11-30T15:56:26","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=552894"},"modified":"2018-11-30T07:56:26","modified_gmt":"2018-11-30T15:56:26","slug":"discovering-the-best-neural-architectures-in-the-continuous-space","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/discovering-the-best-neural-architectures-in-the-continuous-space\/","title":{"rendered":"Discovering the best neural architectures in the continuous space"},"content":{"rendered":"

\"Discovering<\/p>\n

If you're a deep learning practitioner, you may find yourself facing the same critical question on a regular basis: Which neural network architecture should I choose for my current task? The decision depends on a variety of factors and on the answers to a number of other questions. Which operation should I choose for this layer: convolution, depthwise separable convolution, or max pooling? What kernel size should the convolution use, 3×3 or 1×1? And which previous node should serve as the input to the current recurrent neural network (RNN) cell? Such decisions are crucial to the architecture's success. If you're a domain expert in both neural network modeling and the specific task at hand, it might be easy for you to make the right calls. But what if your experience with either is limited?
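To make these decisions concrete, here is a minimal sketch (assuming PyTorch; the function and operation names are illustrative, not part of NAO) of how such per-layer choices typically appear in code as a set of candidate operations:

```python
import torch.nn as nn

def candidate_ops(channels):
    """Hypothetical candidate set of operations for one layer in a searched architecture."""
    return {
        # Plain convolutions with the two kernel sizes mentioned above.
        "conv_3x3": nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        "conv_1x1": nn.Conv2d(channels, channels, kernel_size=1),
        # Depthwise separable convolution: a depthwise conv followed by a pointwise conv.
        "sep_conv_3x3": nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, groups=channels),
            nn.Conv2d(channels, channels, kernel_size=1),
        ),
        # A non-parametric alternative to convolution.
        "max_pool_3x3": nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
    }
```

Picking one such operation for every layer, plus the connections between layers, is exactly the decision space an architecture search has to navigate.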

In that case, you might turn to neural architecture search (NAS), an automated process in which an additional machine learning algorithm guides the creation of better neural architectures based on the architectures observed so far and their performance. Thanks to NAS, we can pinpoint neural network architectures that achieve the best results on widely used benchmark datasets, such as ImageNet, without any human intervention.
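The feedback loop at the heart of NAS can be illustrated with a toy sketch. The stand-in "controller" below simply biases sampling toward operations seen in the best architecture so far, and the evaluation step is a placeholder; real NAS systems use reinforcement learning or evolutionary algorithms and actually train each candidate:

```python
import random

OPS = ["conv_3x3", "conv_1x1", "sep_conv_3x3", "max_pool_3x3"]

def evaluate(arch):
    """Placeholder: a real system would train the candidate and return validation accuracy."""
    return random.random()

history = []  # (architecture, performance) pairs observed so far
for step in range(20):
    if history:
        # Use the history to guide the next proposal: mostly reuse the best architecture's
        # operations, with occasional random exploration.
        best_arch, _ = max(history, key=lambda pair: pair[1])
        arch = [op if random.random() < 0.7 else random.choice(OPS) for op in best_arch]
    else:
        arch = [random.choice(OPS) for _ in range(4)]  # four decisions per architecture
    history.append((arch, evaluate(arch)))

print(max(history, key=lambda pair: pair[1]))
```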

Existing methods for automatic neural architecture design, typically based on reinforcement learning or evolutionary algorithms, have generally conducted their searches in an exponentially large discrete space. My collaborators and I in the Machine Learning group at Microsoft Research Asia have designed a simpler, more efficient method that instead optimizes in a continuous space. With our new approach, called neural architecture optimization (NAO), we leverage the power of gradient-based methods to conduct the optimization in this more compact space. The work is part of the program at this year's Conference on Neural Information Processing Systems (NeurIPS).
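The core idea of searching in a continuous rather than discrete space can be sketched as follows. This is only an illustration under two assumptions, not NAO's actual interfaces: an architecture can be mapped to a continuous embedding, and a differentiable surrogate model predicts performance from that embedding, so the embedding can be improved by simple gradient ascent before being decoded back into a discrete architecture.

```python
import torch
import torch.nn as nn

embedding_dim = 16

# Differentiable surrogate that predicts an architecture's performance from its embedding
# (names and sizes here are illustrative assumptions).
predictor = nn.Sequential(
    nn.Linear(embedding_dim, 32), nn.ReLU(), nn.Linear(32, 1)
)

# Start from the continuous representation of an existing architecture ...
embedding = torch.randn(embedding_dim, requires_grad=True)

# ... and move it along the gradient direction that increases predicted performance.
for _ in range(10):
    predicted_score = predictor(embedding)
    grad, = torch.autograd.grad(predicted_score, embedding)
    embedding = (embedding + 0.1 * grad).detach().requires_grad_(True)

# The improved embedding would then be decoded back into a discrete architecture.
```

Because every step is a gradient update rather than a sample from an exponentially large discrete set, the search over the compact continuous space can be far more efficient.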

\"Figure

Figure 1: The workflow of NAO<\/p><\/div>\n

The key components of NAO

Driving NAO's ability to perform gradient-based optimization in the continuous space are three components (see Figure 1):