{"id":641841,"date":"2020-03-17T08:56:42","date_gmt":"2020-03-17T15:56:42","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=641841"},"modified":"2020-03-17T09:10:52","modified_gmt":"2020-03-17T16:10:52","slug":"training-deep-control-policies-for-the-real-world","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/training-deep-control-policies-for-the-real-world\/","title":{"rendered":"Training deep control policies for the real world"},"content":{"rendered":"

\"drone<\/p>\n

Humans subconsciously use perception-action loops to do just about everything, from walking down a crowded sidewalk to scoring a goal in a community soccer league. Perception-action loops, which use sensory input to decide on an appropriate action in a continuous, real-time loop, are at the heart of autonomous systems. Although this technology has advanced dramatically in its ability to use sensors and cameras to reason about control actions, the current generation of autonomous systems is still nowhere near human skill at making those decisions directly from visual data. Here, we share how we built machine learning systems that reason out the correct actions to take directly from camera images. The system is trained via simulation and learns to independently navigate challenging environments and conditions in the real world, including previously unseen situations.
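To make the idea of a perception-action loop concrete, here is a minimal sketch in Python. The callables `read_camera`, `policy`, and `send_command` are hypothetical placeholders standing in for a camera driver, a learned model, and a vehicle interface; they are not part of the released code.

```python
# Minimal perception-action loop sketch (illustrative only; `read_camera`,
# `policy`, and `send_command` are hypothetical placeholders).
import time

def run_perception_action_loop(policy, read_camera, send_command, hz=30):
    """Continuously map sensory input to control actions at a fixed rate."""
    period = 1.0 / hz
    while True:
        start = time.time()
        image = read_camera()      # perception: grab the latest camera frame
        action = policy(image)     # decision: map the observation to a control action
        send_command(action)       # actuation: apply the action to the vehicle
        # keep the loop running at a (roughly) fixed real-time rate
        time.sleep(max(0.0, period - (time.time() - start)))
```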

Read the Paper | Download the Code | Watch the Video

We wanted to push current technology closer to a human's ability to interpret environmental cues, adapt to difficult conditions, and operate autonomously. For example, in first-person view (FPV) drone racing, expert pilots can plan and control a quadrotor with high agility using a noisy monocular camera feed, without compromising safety. We were interested in exploring what it would take to build autonomous systems that achieve similar performance levels. We trained deep neural networks on simulated data and deployed the learned models in real-world environments. Our framework explicitly separates the perception components (making sense of what you see) from the control policy (deciding what to do based on what you see). This two-stage approach helps researchers interpret and debug the deep neural models, which is hard to do with full end-to-end learning.
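To illustrate the two-stage split described above, the sketch below separates a perception encoder, which compresses a camera image into a compact representation, from a control policy that maps that representation to an action. The module names, layer sizes, and action layout are illustrative assumptions, not the exact architecture from the paper.

```python
# Sketch of a two-stage perception/control split (assumed architecture, for
# illustration only). The intermediate latent vector is the interface that
# can be inspected and debugged independently of the policy.
import torch
import torch.nn as nn

class PerceptionEncoder(nn.Module):
    """Encodes an RGB image into a low-dimensional latent vector."""
    def __init__(self, latent_dim=10):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fc = nn.Linear(32, latent_dim)

    def forward(self, image):
        return self.fc(self.conv(image))

class ControlPolicy(nn.Module):
    """Maps the latent representation to a control command."""
    def __init__(self, latent_dim=10, action_dim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim),
        )

    def forward(self, latent):
        return self.net(latent)

# The two modules can be trained and swapped independently, which is what
# makes this easier to interpret than a single end-to-end network.
encoder, policy = PerceptionEncoder(), ControlPolicy()
image = torch.zeros(1, 3, 64, 64)   # dummy camera frame (batch, C, H, W)
action = policy(encoder(image))     # e.g. a velocity/yaw-rate command
```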

The ability to efficiently solve such perception-action loops with deep neural networks can have a significant impact on real-world systems. Examples include our collaboration with researchers at Carnegie Mellon University and Oregon State University, collectively named Team Explorer, on the DARPA Subterranean (SubT) Challenge. The DARPA challenge centers on assisting first responders and those who lead search and rescue missions, especially in hazardous physical environments, to more quickly identify people in need of help.