
Research Forum Brief | September 2024

Project Aurora: The first large-scale foundation model of the atmosphere


“If we look at Aurora’s ability to predict pollutants such as nitrogen dioxide that are strongly related to emissions from human activity, we can see that the model has learned to make these predictions with no emissions data provided. It’s learned the implicit patterns that cause the gas concentrations, which is very impressive.”

Megan Stanley, Senior Researcher, Microsoft Research AI for Science

Transcript: Lightning Talk

Project Aurora: The first large-scale foundation model of the atmosphere

Megan Stanley, Senior Researcher, Microsoft Research AI for Science

This talk discusses Aurora, a cutting-edge foundation model that offers a new approach to weather forecasting that could transform our ability to predict and mitigate the impacts of extreme events, air pollution, and the changing climate.

Microsoft Research Forum, September 3, 2024

MEGAN STANLEY: Hi. My name is Megan Stanley, and I’m a senior researcher in Microsoft AI for Science, and I’d like to tell you all about Aurora, our foundation model of the atmosphere.

Now, weather forecasting is critical in our societies. Whether that’s for disaster management, planning supply chains and logistics, forecasting crop yields, or even just knowing whether we should take a jacket when we leave the house in the morning, it has day-to-day significance for all of us and is very important to the functioning of our civilization. In addition, in the face of a changing climate, we need more than ever to predict how the patterns of our weather will change on an everyday basis as the Earth system we all inhabit undergoes a shift.

Traditionally, the atmosphere and its interactions with the Earth’s surface and oceans, as well as the incoming energy from the sun, are modeled using very large systems of coupled differential equations. In practice, to make a forecast or simulate the atmosphere, these equations are numerically integrated on very large supercomputers. They also have to assimilate observations of the current state of the weather to obtain correct initial conditions. Putting all of this together means that making a single weather forecast is computationally extremely expensive and slow, and the simulation must be rerun for every new forecast. At the same time, the set of equations used cannot completely capture all of the atmospheric dynamics, and this ultimately limits the accuracy that can be obtained.
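The numerical-integration idea described above can be sketched with a deliberately tiny toy problem. This is not any operational model, just a one-dimensional advection equation, du/dt = -c · du/dx, stepped forward with a first-order upwind finite-difference scheme; operational systems integrate vastly larger coupled equation sets, but the loop structure (start from initial conditions, advance the state one small step at a time, rerun the whole loop for every new forecast) is the same.

```python
def step(u, c, dx, dt):
    """Advance the field u one time step with a first-order upwind scheme
    on a periodic domain (index -1 wraps around)."""
    n = len(u)
    return [u[i] - c * dt / dx * (u[i] - u[(i - 1) % n]) for i in range(n)]

def integrate(u0, c=1.0, dx=0.1, dt=0.05, n_steps=100):
    """Repeatedly apply the update rule; every new forecast must rerun
    this whole loop from fresh initial conditions."""
    u = list(u0)
    for _ in range(n_steps):
        u = step(u, c, dx, dt)
    return u

# A bump of "weather" advected around the periodic domain.
u0 = [1.0 if 10 <= i < 20 else 0.0 for i in range(100)]
u = integrate(u0)
```

Even this toy version shows why cost scales badly: halving the grid spacing roughly doubles the work per step and (for stability) also forces smaller time steps, and real models do this in three dimensions for many coupled variables.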

With Aurora, we aim to demonstrate state-of-the-art medium-range weather forecasting—that is, for time periods out to a couple of weeks—and to do so with a model that learns a good general representation of the atmosphere that can be tuned to many downstream tasks. It is our bet that, similar to the breakthroughs in natural language processing and image generation, we can make significant advances by training a large deep learning model on the vast quantity of Earth system data available to us.

Aurora represents huge progress. We demonstrate that it can be fine-tuned to state-of-the-art performance on operational weather forecasting, as well as on atmospheric pollution prediction, an area previously unexplored in deep learning. It’s able to do all of this roughly 5,000 times faster than current traditional weather forecasting techniques. In addition, if we compare to the current state of the art in AI weather forecasting, the GraphCast model, we’re able to outperform it on 94 percent of targets, and we do so at a higher spatial resolution, in line with the current traditional state of the art.

Aurora achieves this by training a larger model on more, and more diverse, data. We also demonstrate that, as a foundation model, it can be fine-tuned on a wide range of very important downstream tasks. As a foundation model, Aurora operates using the pretrain–fine-tune paradigm. It’s initially trained on a large quantity of traditional weather forecasting and climate simulation data. This pretraining phase is designed to result in a model that should carry within it a useful representation of the general behavior of the atmosphere, so that we can then fine-tune it to operate in scenarios where there is much less data, or lower-quality data.
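The pretrain–fine-tune mechanics can be illustrated with a minimal, hypothetical sketch (this is not the Aurora training code): a one-parameter model is first fit on an abundant "pretraining" dataset, and the learned weight is then reused as the starting point for a few gradient steps, at a smaller learning rate, on a scarce downstream dataset.

```python
def sgd_fit(w, data, lr, epochs):
    """Fit y ~ w * x by stochastic gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

# Abundant pretraining data: y = 2x (a stand-in for large-scale archives).
pretrain = [(x / 100, 2 * x / 100) for x in range(1, 101)]
w = sgd_fit(w=0.0, data=pretrain, lr=0.05, epochs=5)

# Scarce downstream data: a closely related task with only two examples.
downstream = [(1.0, 2.1), (2.0, 4.2)]

# Fine-tune: start from the pretrained weight, few steps, smaller learning rate,
# rather than learning from scratch on too little data.
w_ft = sgd_fit(w=w, data=downstream, lr=0.01, epochs=3)
```

The point of the sketch is only the transfer of the learned parameter: the downstream fit starts near a good solution, so a handful of examples is enough to adapt it, which mirrors how a pretrained atmospheric representation is adapted to data-scarce tasks such as pollution forecasting.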

So, examples of the scarce-data scenario? Well, weather forecasting at the resolution of the current gold standard of traditional methods, the IFS system, operating at 0.1 degrees resolution, or approximately 10 kilometers. Another good example is prediction of atmospheric pollution, including gases and particulates, where the current gold standard is an additional, very computationally expensive model applied to the IFS from the Copernicus Atmosphere Monitoring Service, or CAMS. This problem is generally very challenging for traditional forecasting systems, but it’s of critical importance.

We’re able to show that Aurora outperforms IFS on 92 percent of the operational targets, and it does this particularly well in comparison at forecasting times longer than 12 hours while being approximately 5,000 times faster. When we look at the ability of Aurora to predict weather station observations, including wind speed and temperature, it’s better in general than traditional forecasting systems. It really is able to make accurate predictions of the weather as we experience it on Earth.

On the atmospheric pollution task, Aurora is able to match or outperform CAMS in 74 percent of cases, and it does so without needing any emissions data as an input. This task has never before been approached with an AI model. If we look at Aurora’s ability to predict pollutants such as nitrogen dioxide that are strongly related to emissions from human activity, we can see that the model has learned to make these predictions with no emissions data provided. It’s learned the implicit patterns that cause the gas concentrations, which is very impressive. It’s also, very impressively, managed to learn atmospheric chemistry behavior. You can see this here: as the gas is exposed to sunlight, photochemical reactions drive the differences between nighttime and daytime concentrations of nitrogen dioxide.

Aurora is also capable of forecasting extreme events as well as the state-of-the-art traditional techniques. Here it is seen correctly predicting the path of Storm Ciarán, which hit northwestern Europe in early November 2023, causing record-breaking damage and destruction. In particular, Aurora was the only AI model that could correctly predict the maximum wind speed during the storm as it picked up when it made landfall.

In conclusion, Aurora is a foundation model that really is the state of the art in AI and in general weather forecasting in terms of its ability to produce correct operational forecasts. It does so 5,000 times faster than traditional weather forecasting techniques. Moreover, because it’s a foundation model, it unlocks new capabilities: it can be fine-tuned on downstream tasks where there’s scarce data or that haven’t been approached before. We believe that Aurora represents an incredibly exciting new paradigm in weather forecasting. This is much like the progress we’ve seen across the sciences, where the ability to train AI models at massive scale with vast quantities of accurate data has unlocked completely unforeseen capabilities.

If you want to learn more about how my colleagues and I at AI for Science achieve this, please refer to our publication. Thank you.