{"id":580783,"date":"2019-05-07T08:08:45","date_gmt":"2019-05-07T15:08:45","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=580783"},"modified":"2019-07-08T09:46:33","modified_gmt":"2019-07-08T16:46:33","slug":"autonomous-soaring-ai-on-the-fly","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/autonomous-soaring-ai-on-the-fly\/","title":{"rendered":"Autonomous soaring \u2013 AI on the fly"},"content":{"rendered":"

\"\"<\/a><\/p>\n

The past few years have seen tremendous progress in reinforcement learning (RL). From complex games to robotic object manipulation, RL has qualitatively advanced the state of the art. However, modern RL techniques require a lot for success: a largely deterministic, stationary environment; an accurate, resettable simulator in which mistakes – and especially their consequences – are confined to the virtual sphere; powerful computers; and plenty of energy to run them. At Microsoft Research, we are working toward automatic decision-making approaches that bring us closer to the vision of AI agents capable of learning and acting autonomously in changeable, open-world conditions using only limited onboard compute. Project Frigatebird is our ambitious quest in this space, aimed at building intelligence that enables small fixed-wing uninhabited aerial vehicles (sUAVs) to stay aloft purely by extracting energy from moving air.

Let's talk hardware

Snipe 2, our latest sUAV, pictured above, exemplifies Project Frigatebird's hardware platforms. It is a small version of a special type of human-piloted aircraft known as sailplanes, or gliders. Like many sailplanes, Snipe 2 doesn't have a motor; even the sailplanes that do carry only enough power to run it for a minute or two. Snipe 2 is hand-tossed into the air to an altitude of approximately 60 meters and then slowly descends to the ground, unless it finds a rising air current called a thermal (see Figure 2) and exploits it to soar higher. For human pilots in full-scale sailplanes, travelling hundreds of miles powered solely by these naturally occurring sources of lift is a popular sport. For certain birds, such as albatrosses and frigatebirds, covering great distances this way with nary a wing flap is a natural-born skill, and one we would very much like to bestow on Snipe 2's AI.

\"Figure<\/a>

Figure 1: the layout of hardware for autonomous soaring in Snipe 2’s narrow fuselage.<\/p><\/div>\n

Snipe 2's 1.5-meter-wingspan airframe weighs a mere 163 grams, and its slender fuselage is only 35 mm wide at its widest point. Yet it carries an off-the-shelf Pixhawk 4 Mini flight controller and all the peripherals required for fully autonomous flight (see Figure 1). This "brain" has more than enough punch to run our Bayesian reinforcement learning-based soaring algorithm, POMDSoar. It can also receive a strategic, more computationally heavy navigation policy over the radio from a laptop on the ground, further enhancing the sUAV's ability to find columns of rising air. Alternatively, Snipe 2 can house more powerful but still sufficiently compact hardware, such as a Raspberry Pi Zero, to compute this policy onboard. Our larger sailplane drones, like the 5-meter-wingspan Thermik XXXL, can carry even more sophisticated equipment, including cameras and a computational platform for processing their data in real time for hours on end. Indeed, the only barrier that currently prevents winged drones from staying aloft that long on atmospheric energy alone in favorable weather is the lack of sufficient AI capabilities.
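To make the onboard/offboard split concrete, here is a minimal sketch of how a ground-station policy and an onboard controller might exchange a strategic waypoint over a telemetry link. Everything in it (the message format, class names, and thresholds) is a hypothetical illustration we made up for this post, not Project Frigatebird's actual interface or the POMDSoar algorithm.

```python
# Hypothetical sketch of the ground/onboard split described above. None of
# these names, messages, or thresholds come from Project Frigatebird or
# POMDSoar; they are placeholders for illustration only.
import json
import math
from dataclasses import dataclass

@dataclass
class ThermalEstimate:
    x: float           # meters east of the launch point
    y: float           # meters north of the launch point
    strength: float    # estimated core lift, m/s
    confidence: float  # 0..1, how sure the ground station is

def ground_station_policy(estimates, home=(0.0, 0.0)):
    """Runs on the ground laptop: pick the most promising thermal and
    encode a strategic waypoint as a small JSON message for the radio link."""
    if not estimates:
        target = home  # nothing promising in sight: head back toward home
    else:
        best = max(estimates, key=lambda t: t.strength * t.confidence)
        target = (best.x, best.y)
    return json.dumps({"type": "goto", "x": target[0], "y": target[1]})

def onboard_step(position, vario_mps, radio_msg, climb_threshold=0.5):
    """Runs onboard: follow the strategic waypoint from the ground station
    unless the variometer suggests switching to local thermalling."""
    if vario_mps > climb_threshold:
        return {"mode": "thermal"}  # hand control to the soaring controller
    goal = json.loads(radio_msg)
    bearing = math.atan2(goal["y"] - position[1], goal["x"] - position[0])
    return {"mode": "cruise", "heading_rad": bearing}

if __name__ == "__main__":
    msg = ground_station_policy([ThermalEstimate(400.0, 250.0, 1.8, 0.7)])
    print(onboard_step(position=(0.0, 0.0), vario_mps=0.2, radio_msg=msg))
```

The same kind of message could just as easily be consumed by a companion computer such as the Raspberry Pi Zero mentioned above; the point of the sketch is only the division of labor between a heavyweight strategic policy and a lightweight onboard loop.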

Reaching higher

Why is building this intelligence hard? Precisely because of the factors that limit modern RL's applicability. Autopilots of conventional aircraft are built on fairly simple control-based approaches. This strategy works because an aircraft's motors, in combination with its wings, deliver a stable source of lift, allowing it to "overpower" most of the variable factors affecting its flight, such as wind. Sailplanes, on the other hand, are "underactuated" and must make use of – not overpower – highly uncertain and non-stationary atmospheric phenomena to stay aloft. Thermals, the columns of upward-moving air in which hawks and other birds are often seen gracefully circling, are an example of these stochastic phenomena. A thermal can disappear minutes after appearing, and the amount of lift it provides varies across its lifecycle, with altitude, and with distance from the thermal's center.

Finding thermals is a difficult problem in itself. They cannot be seen directly; a sailplane can infer their size and location only approximately. Human pilots rely on local knowledge, ground features, the behavior of birds and other sailplanes, and other cues, in addition to instrument readings, to guess where thermals are. Interpreting some of these cues involves simple-sounding but nontrivial computer vision problems, such as estimating the distance to objects seen against a featureless sky. Decision-making based on these observations is even more complicated. It requires integrating diverse sensor data on hardware far less capable than a human brain and accounting for large amounts of uncertainty over long planning horizons. Accurately inferring the consequences of various decisions by simulating them, a common approach in modern RL, is thwarted under these conditions by the lack of onboard compute and the energy to run it.
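To give a flavor of what inferring a thermal "only approximately" involves, below is a minimal sketch that assumes the Gaussian thermal lift model commonly used in the autonomous-soaring literature and estimates a thermal's center and strength from noisy variometer readings with a tiny particle filter. It illustrates the estimation problem only; it is not POMDSoar's decision-making logic, and every numeric value in it is an assumed placeholder.

```python
# A minimal sketch, not POMDSoar: it assumes the common Gaussian thermal lift
# model w(r) = W * exp(-r^2 / R^2) and uses a small particle filter to infer a
# thermal's center and strength from noisy variometer readings. All numbers
# below are illustrative assumptions.
import math
import random

def lift(px, py, cx, cy, W, R):
    """Vertical air speed (m/s) at (px, py) for a thermal centered at (cx, cy)."""
    r2 = (px - cx) ** 2 + (py - cy) ** 2
    return W * math.exp(-r2 / R ** 2)

def make_particles(n=500):
    # Each particle is one guess (cx, cy, W, R) of the thermal's parameters.
    return [(random.uniform(-100, 100), random.uniform(-100, 100),
             random.uniform(0.5, 4.0), random.uniform(20, 80)) for _ in range(n)]

def update(particles, px, py, vario, sigma=0.3):
    """Reweight and resample particles given one netto variometer reading."""
    weights = []
    for cx, cy, W, R in particles:
        err = vario - lift(px, py, cx, cy, W, R)
        weights.append(math.exp(-0.5 * (err / sigma) ** 2))
    total = sum(weights)
    if total == 0.0:  # every particle wildly inconsistent: keep them unchanged
        return particles
    weights = [w / total for w in weights]
    # Resampling concentrates particles on parameters consistent with the data;
    # a real filter would also add jitter to avoid degeneracy.
    return random.choices(particles, weights=weights, k=len(particles))

if __name__ == "__main__":
    true_thermal = (30.0, -20.0, 2.5, 45.0)   # hidden thermal the sUAV is probing
    particles = make_particles()
    for step in range(50):                    # fly a straight probing leg
        px, py = step * 4.0, 0.0
        vario = lift(px, py, *true_thermal) + random.gauss(0, 0.3)
        particles = update(particles, px, py, vario)
    cx = sum(p[0] for p in particles) / len(particles)
    cy = sum(p[1] for p in particles) / len(particles)
    print(f"estimated center: ({cx:.1f}, {cy:.1f}) vs. true (30.0, -20.0)")
```

Even in this toy setting the estimate stays uncertain until the aircraft has sampled lift at several points, which hints at why planning where to probe, and when to commit to circling, is the hard part of the problem.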

\"Figure<\/a>

Figure 3: (Left) A schematic depiction of air movement within thermals and a sailplane’s trajectory. (Right) A visualization of an actual thermal soaring trajectory from one of our sUAVs\u2019 flights.<\/p><\/div>\n

Our first steps have focused on using thermals to gain altitude: