{"id":580783,"date":"2019-05-07T08:08:45","date_gmt":"2019-05-07T15:08:45","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=580783"},"modified":"2019-07-08T09:46:33","modified_gmt":"2019-07-08T16:46:33","slug":"autonomous-soaring-ai-on-the-fly","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/autonomous-soaring-ai-on-the-fly\/","title":{"rendered":"Autonomous soaring \u2013 AI on the fly"},"content":{"rendered":"

\"\"<\/a><\/p>\n

The past few years have seen tremendous progress in reinforcement learning (RL). From complex games to robotic object manipulation, RL has qualitatively advanced the state of the art. However, modern RL techniques require a lot for success: a largely deterministic, stationary environment; an accurate, resettable simulator in which mistakes – and especially their consequences – are confined to the virtual sphere; powerful computers; and plenty of energy to run them. At Microsoft Research, we are working toward automatic decision-making approaches that bring us closer to the vision of AI agents capable of learning and acting autonomously in changeable, open-world conditions using only limited onboard compute. Project Frigatebird is our ambitious quest in this space, aimed at building intelligence that enables small fixed-wing uninhabited aerial vehicles (sUAVs) to stay aloft purely by extracting energy from moving air.

Let's talk hardware

Snipe 2, our latest sUAV, pictured above, exemplifies Project Frigatebird's hardware platforms. It is a small version of a special type of human-piloted aircraft known as sailplanes, or gliders. Like many sailplanes, Snipe 2 doesn't have a motor; even the sailplanes that do carry only enough power to run it for a minute or two. Snipe 2 is hand-tossed into the air to an altitude of approximately 60 meters and then slowly descends to the ground, unless it finds a rising air current called a thermal (see Figure 2) and exploits it to soar higher. For human pilots in full-scale sailplanes, travelling hundreds of miles powered solely by these naturally occurring sources of lift is a popular sport. For certain birds, such as albatrosses and frigatebirds, covering great distances this way with nary a wing flap is a natural-born skill, and one we would very much like to bestow on Snipe 2's AI.

\"Figure<\/a>

Figure 1: the layout of hardware for autonomous soaring in Snipe 2’s narrow fuselage.<\/p><\/div>\n

Snipe 2's 1.5-meter-wingspan airframe weighs a mere 163 grams, and its slender fuselage is only 35 mm wide at its widest point. Yet it carries an off-the-shelf Pixhawk 4 Mini flight controller and all the peripherals required for fully autonomous flight (see Figure 1). This "brain" has more than enough punch to run our Bayesian reinforcement learning-based soaring algorithm, POMDSoar. It can also receive a strategic, more computationally heavy navigation policy over the radio from a laptop on the ground, further enhancing the sUAV's ability to find columns of rising air. Alternatively, Snipe 2 can house more powerful but still sufficiently compact hardware, such as a Raspberry Pi Zero, to compute this policy onboard. Our larger sailplane drones, like the 5-meter-wingspan Thermik XXXL, can carry even more sophisticated equipment, including cameras and a computational platform for processing their data in real time for hours on end. Indeed, the only barrier that currently prevents winged drones from staying aloft that long on atmospheric energy alone in favorable weather is the lack of sufficient AI capabilities.
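To make the onboard/offboard split concrete, here is a minimal sketch of how a ground-station policy and an onboard controller might exchange a strategic waypoint over a telemetry link. Everything in it (the message format, class names, and thresholds) is a hypothetical illustration we made up for this post, not Project Frigatebird's actual interface or the POMDSoar algorithm.

```python
# Hypothetical sketch of the ground/onboard split described above. None of
# these names, messages, or thresholds come from Project Frigatebird or
# POMDSoar; they are placeholders for illustration only.
import json
import math
from dataclasses import dataclass

@dataclass
class ThermalEstimate:
    x: float           # meters east of the launch point
    y: float           # meters north of the launch point
    strength: float    # estimated core lift, m/s
    confidence: float  # 0..1, how sure the ground station is

def ground_station_policy(estimates, home=(0.0, 0.0)):
    """Runs on the ground laptop: pick the most promising thermal and
    encode a strategic waypoint as a small JSON message for the radio link."""
    if not estimates:
        target = home  # nothing promising in sight: head back toward home
    else:
        best = max(estimates, key=lambda t: t.strength * t.confidence)
        target = (best.x, best.y)
    return json.dumps({"type": "goto", "x": target[0], "y": target[1]})

def onboard_step(position, vario_mps, radio_msg, climb_threshold=0.5):
    """Runs onboard: follow the strategic waypoint from the ground station
    unless the variometer suggests switching to local thermalling."""
    if vario_mps > climb_threshold:
        return {"mode": "thermal"}  # hand control to the soaring controller
    goal = json.loads(radio_msg)
    bearing = math.atan2(goal["y"] - position[1], goal["x"] - position[0])
    return {"mode": "cruise", "heading_rad": bearing}

if __name__ == "__main__":
    msg = ground_station_policy([ThermalEstimate(400.0, 250.0, 1.8, 0.7)])
    print(onboard_step(position=(0.0, 0.0), vario_mps=0.2, radio_msg=msg))
```

The same kind of message could just as easily be consumed by a companion computer such as the Raspberry Pi Zero mentioned above; the point of the sketch is only the division of labor between a heavyweight strategic policy and a lightweight onboard loop.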

Reaching higher

Why is building this intelligence hard? Precisely because of the factors that limit modern RL's applicability. Autopilots of conventional aircraft are built on fairly simple control-based approaches. This strategy works because an aircraft's motors, in combination with its wings, deliver a stable source of lift, allowing it to "overpower" most of the variable factors affecting its flight, such as wind. Sailplanes, on the other hand, are "underactuated" and must make use of – not overpower – highly uncertain and non-stationary atmospheric phenomena to stay aloft. Thermals, the columns of upward-moving air in which hawks and other birds are often seen gracefully circling, are an example of these stochastic phenomena. A thermal can disappear minutes after appearing, and the amount of lift it provides varies across its lifecycle, with altitude, and with distance from the thermal's center.

Finding thermals is a difficult problem in itself. They cannot be seen directly; a sailplane can infer their size and location only approximately. Human pilots rely on local knowledge, ground features, the behavior of birds and other sailplanes, and other cues, in addition to instrument readings, to guess where thermals are. Interpreting some of these cues involves simple-sounding but nontrivial computer vision problems, such as estimating the distance to objects seen against a featureless sky. Decision-making based on these observations is even more complicated. It requires integrating diverse sensor data on hardware far less capable than a human brain and accounting for large amounts of uncertainty over long planning horizons. Accurately inferring the consequences of various decisions by simulating them, a common approach in modern RL, is thwarted under these conditions by the lack of onboard compute and the energy to run it.
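To give a flavor of what inferring a thermal "only approximately" involves, below is a minimal sketch that assumes the Gaussian thermal lift model commonly used in the autonomous-soaring literature and estimates a thermal's center and strength from noisy variometer readings with a tiny particle filter. It illustrates the estimation problem only; it is not POMDSoar's decision-making logic, and every numeric value in it is an assumed placeholder.

```python
# A minimal sketch, not POMDSoar: it assumes the common Gaussian thermal lift
# model w(r) = W * exp(-r^2 / R^2) and uses a small particle filter to infer a
# thermal's center and strength from noisy variometer readings. All numbers
# below are illustrative assumptions.
import math
import random

def lift(px, py, cx, cy, W, R):
    """Vertical air speed (m/s) at (px, py) for a thermal centered at (cx, cy)."""
    r2 = (px - cx) ** 2 + (py - cy) ** 2
    return W * math.exp(-r2 / R ** 2)

def make_particles(n=500):
    # Each particle is one guess (cx, cy, W, R) of the thermal's parameters.
    return [(random.uniform(-100, 100), random.uniform(-100, 100),
             random.uniform(0.5, 4.0), random.uniform(20, 80)) for _ in range(n)]

def update(particles, px, py, vario, sigma=0.3):
    """Reweight and resample particles given one netto variometer reading."""
    weights = []
    for cx, cy, W, R in particles:
        err = vario - lift(px, py, cx, cy, W, R)
        weights.append(math.exp(-0.5 * (err / sigma) ** 2))
    total = sum(weights)
    if total == 0.0:  # every particle wildly inconsistent: keep them unchanged
        return particles
    weights = [w / total for w in weights]
    # Resampling concentrates particles on parameters consistent with the data;
    # a real filter would also add jitter to avoid degeneracy.
    return random.choices(particles, weights=weights, k=len(particles))

if __name__ == "__main__":
    true_thermal = (30.0, -20.0, 2.5, 45.0)   # hidden thermal the sUAV is probing
    particles = make_particles()
    for step in range(50):                    # fly a straight probing leg
        px, py = step * 4.0, 0.0
        vario = lift(px, py, *true_thermal) + random.gauss(0, 0.3)
        particles = update(particles, px, py, vario)
    cx = sum(p[0] for p in particles) / len(particles)
    cy = sum(p[1] for p in particles) / len(particles)
    print(f"estimated center: ({cx:.1f}, {cy:.1f}) vs. true (30.0, -20.0)")
```

Even in this toy setting the estimate stays uncertain until the aircraft has sampled lift at several points, which hints at why planning where to probe, and when to commit to circling, is the hard part of the problem.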

\"Figure<\/a>

Figure 3: (Left) A schematic depiction of air movement within thermals and a sailplane’s trajectory. (Right) A visualization of an actual thermal soaring trajectory from one of our sUAVs\u2019 flights.<\/p><\/div>\n

Our first steps have focused on using thermals to gain altitude: