{"id":1040643,"date":"2024-06-03T09:00:00","date_gmt":"2024-06-03T16:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=1040643"},"modified":"2024-06-14T08:09:03","modified_gmt":"2024-06-14T15:09:03","slug":"introducing-aurora-the-first-large-scale-foundation-model-of-the-atmosphere","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/introducing-aurora-the-first-large-scale-foundation-model-of-the-atmosphere\/","title":{"rendered":"Introducing Aurora: The first large-scale foundation model of the atmosphere"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"788\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1.jpg\" alt=\"satellite image of Storm Ciar\u00e1n\" class=\"wp-image-1041021\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1.jpg 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-240x135.jpg 240w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-1280x720.jpg 1280w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><\/figure>\n\n\n\n<p>When Storm Ciar\u00e1n battered northwestern Europe in November 2023, it left a trail of destruction. The low-pressure system associated with Storm Ciar\u00e1n set new records for England, marking it as an exceptionally rare meteorological event. The storm&#8217;s intensity caught many off guard, exposing the limitations of current weather-prediction models and highlighting the need for more accurate forecasting in the face of climate change. As communities grappled with the aftermath, the urgent question arose: How can we better anticipate and prepare for such extreme weather events?&nbsp;<\/p>\n\n\n\n<p>A recent study by Charlton-Perez et al. (2024) underscored the challenges faced by even the most advanced AI weather-prediction models in capturing the rapid intensification and peak wind speeds of Storm Ciar\u00e1n. To help address those challenges, a team of Microsoft researchers developed <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/aurora-a-foundation-model-of-the-atmosphere\/\">Aurora, a cutting-edge AI foundation model that can extract valuable insights from vast amounts of atmospheric data<\/a>. 
Aurora presents a new approach to weather forecasting that could transform our ability to predict and mitigate the impacts of extreme events\u2014including being able to anticipate the dramatic escalation of an event like Storm Ciar\u00e1n.&nbsp;&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"a-flexible-3d-foundation-model-of-the-atmosphere\">A flexible 3D foundation model of the atmosphere<\/h2>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"4705\" height=\"2144\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/FIG1_Aurora.png\" alt=\"Aurora is a 1.3 billion parameter foundation model for high-resolution  forecasting of weather and atmospheric processes. Aurora is a flexible 3D Swin Transformer with 3D Perceiver-based encoders and decoders. At pretraining time, Aurora is optimised to minimise a loss on multiple heterogeneous datasets with different resolutions, variables, and pressure levels. The model is then fine-tuned in two stages: (1) short-lead time fine-tuning of the pretrained weights (2) long-lead time (rollout) fine-tuning using Low Rank Adaptation (LoRA). The fine-tuned models are then deployed to tackle a diverse collection of operational forecasting scenarios at different resolutions. 
\" class=\"wp-image-1042092\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/FIG1_Aurora.png 4705w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/FIG1_Aurora-300x137.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/FIG1_Aurora-1024x467.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/FIG1_Aurora-768x350.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/FIG1_Aurora-1536x700.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/FIG1_Aurora-2048x933.png 2048w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/FIG1_Aurora-240x109.png 240w\" sizes=\"auto, (max-width: 4705px) 100vw, 4705px\" \/><figcaption class=\"wp-element-caption\">Figure 1: Aurora is a 1.3 billion parameter foundation model for high-resolution&nbsp;forecasting of weather and atmospheric processes. Aurora is a flexible 3D Swin Transformer with 3D Perceiver-based encoders and decoders. At pretraining time, Aurora is optimized to minimize a loss on multiple heterogeneous datasets with different resolutions, variables, and pressure levels. The model is then fine-tuned in two stages: (1) short-lead time fine-tuning of the pretrained weights and (2) long-lead time (rollout) fine-tuning using Low Rank Adaptation (LoRA). The fine-tuned models are then deployed to tackle a diverse collection of operational forecasting scenarios at different resolutions.<\/figcaption><\/figure>\n\n\n\n<p>Aurora&#8217;s effectiveness lies in its training on more than a million hours of diverse weather and climate simulations, which enables it to develop a comprehensive understanding of atmospheric dynamics. This allows the model to excel at a wide range of prediction tasks, even in data-sparse regions or extreme weather scenarios. 
By operating at a high spatial resolution of 0.1\u00b0 (roughly 11 km at the equator), Aurora captures intricate details of atmospheric processes, providing more accurate operational forecasts than ever before\u2014and at a fraction of the computational cost of traditional numerical weather-prediction systems. We estimate that the computational speed-up that Aurora can bring over the state-of-the-art numerical forecasting system Integrated Forecasting System (IFS) is ~5,000x.&nbsp;<\/p>\n\n\n\n<p>Beyond its impressive accuracy and efficiency, Aurora stands out for its versatility. The model can forecast a broad range of atmospheric variables, from temperature and wind speed to air-pollution levels and concentrations of greenhouse gases. Aurora&#8217;s architecture is designed to handle heterogeneous gold standard inputs and generate predictions at different resolutions and levels of fidelity. The model consists of a flexible 3D Swin Transformer with Perceiver-based encoders and decoders, enabling it to process and predict a range of atmospheric variables across space and pressure levels. By pretraining on a vast corpus of diverse data and fine-tuning on specific tasks, Aurora learns to capture intricate patterns and structures in the atmosphere, allowing it to excel even with limited training data when it is being fine-tuned for a specific task.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"fast-prediction-of-atmospheric-chemistry-and-air-pollution\">Fast prediction of atmospheric chemistry and air pollution<\/h2>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"642\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/cams_prediction_tcno2.png\" alt=\"Sample predictions for total column nitrogen dioxide by Aurora compared to CAMS analysis. Aurora was initialised with CAMS analysis at 1 Sep 2022 00 UTC. 
Predicting atmospheric gases correctly is extremely challenging due to their spatially heterogeneous nature. In particular, nitrogen dioxide, like most variables in CAMS, is skewed towards high values in areas with large anthropogenic emissions such as densely populated areas in East Asia. In addition, it exhibits a strong diurnal cycle; e.g., sunlight reduces background levels through a process called photolysis. Aurora accurately captures both the extremes and background levels. \" class=\"wp-image-1042197\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/cams_prediction_tcno2.png 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/cams_prediction_tcno2-300x138.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/cams_prediction_tcno2-1024x470.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/cams_prediction_tcno2-768x352.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/cams_prediction_tcno2-240x110.png 240w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"945\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/scorecard_cams_full.png\" alt=\"Latitude-weighted root mean square error (RMSE) of Aurora relative to CAMS, where negative values (blue) mean that Aurora is better. The RMSEs are computed over the period Jun 2022 to Nov 2022 inclusive. Aurora matches or outperforms CAMS on 74% of the targets. 
\" class=\"wp-image-1042203\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/scorecard_cams_full.png 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/scorecard_cams_full-300x203.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/scorecard_cams_full-1024x691.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/scorecard_cams_full-768x518.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/scorecard_cams_full-240x162.png 240w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><figcaption class=\"wp-element-caption\">Figure 2: Aurora outperforms operational CAMS across many targets. (a) Sample predictions for total column nitrogen dioxide by Aurora compared to CAMS analysis. Aurora was initialized with CAMS analysis at 1 Sep 2022 00 UTC. Predicting atmospheric gases correctly is extremely challenging due to their spatially heterogeneous nature. In particular, nitrogen dioxide, like most variables in CAMS, is skewed toward high values in areas with large anthropogenic emissions, such as densely populated areas in East Asia. In addition, it exhibits a strong diurnal cycle; e.g., sunlight reduces background levels via a process called photolysis. Aurora accurately captures both the extremes and background levels. (b) Latitude-weighted root mean square error (RMSE) of Aurora relative to CAMS, where negative values (blue) mean that Aurora is better. The RMSEs are computed over the period Jun 2022 to Nov 2022 inclusive. 
Aurora matches or outperforms CAMS on 74% of the targets.<\/figcaption><\/figure>\n\n\n\n<p>A prime example of Aurora&#8217;s versatility is its ability to forecast air-pollution levels using data from the Copernicus Atmosphere Monitoring Service (CAMS), a notoriously difficult task due to the complex interplay of atmospheric chemistry, weather patterns, and human activities, as well as the highly heterogeneous nature of CAMS data. By leveraging its flexible encoder-decoder architecture and attention mechanisms, Aurora effectively processes and learns from this challenging data, capturing the unique characteristics of air pollutants and their relationships with meteorological variables. This enables Aurora to produce accurate five-day global air-pollution forecasts at 0.4\u00b0 spatial resolution, outperforming state-of-the-art atmospheric chemistry simulations on 74% of all targets, demonstrating its remarkable adaptability and potential to tackle a wide range of environmental prediction problems, even in data-sparse or highly complex scenarios.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"data-diversity-and-model-scaling-improve-atmospheric-forecasting\">Data diversity and model scaling improve atmospheric forecasting<\/h2>\n\n\n\n<p>One of the key findings of this study is that pretraining on diverse datasets significantly improves Aurora&#8217;s performance compared to training on a single dataset. By incorporating data from climate simulations, reanalysis products, and operational forecasts, Aurora learns a more robust and generalizable representation of atmospheric dynamics. 
It is thanks to its scale and diverse pretraining data corpus that Aurora is able to outperform state-of-the-art numerical weather-prediction models and specialized deep-learning approaches across a wide range of tasks and resolutions.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"445\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/data_scaling_era5_main.png\" alt=\"Performance versus ERA5 2021 at 6h lead time for models pretrained on different dataset configurations (i.e., no fine-tuning) labeled by C1-C4. The root mean square errors (RMSEs) are normalized by the performance of the ERA5-pretrained model (C1). Adding low-fidelity simulation data from CMIP6 (i.e., CMCC and IFS-HR) improves performance almost uniformly (C2). Adding even more simulation data improves performance further on most surface variables and for the atmospheric levels present in this newly added data (C3). Finally, configuration C4, which contains good coverage of the entire atmosphere and also contains analysis data from GFS achieves the best overall performance with improvements across the board. 
\" class=\"wp-image-1042209\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/data_scaling_era5_main.png 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/data_scaling_era5_main-300x95.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/data_scaling_era5_main-1024x325.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/data_scaling_era5_main-768x244.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/data_scaling_era5_main-240x76.png 240w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"391\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/norm_method_A_B_C_2_two_sided_CI_fulllines_6h.png\" alt=\"Pretraining on many diverse data sources improves the forecasting of extreme values at 6h lead time across all surface variables of IFS-HRES 2022. Additionally, the results also hold on wind speed, which is a nonlinear function of 10U and 10V. 
\" class=\"wp-image-1042572\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/norm_method_A_B_C_2_two_sided_CI_fulllines_6h.png 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/norm_method_A_B_C_2_two_sided_CI_fulllines_6h-300x84.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/norm_method_A_B_C_2_two_sided_CI_fulllines_6h-1024x286.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/norm_method_A_B_C_2_two_sided_CI_fulllines_6h-768x214.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/norm_method_A_B_C_2_two_sided_CI_fulllines_6h-240x67.png 240w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"385\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/model_scaling_val.png\" alt=\"Bigger models obtain lower validation loss for the same amount of GPU hours. We fit a power law that indicates a 5% reduction in the validation loss for every doubling of the model size. 
\" class=\"wp-image-1042215\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/model_scaling_val.png 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/model_scaling_val-300x83.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/model_scaling_val-1024x282.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/model_scaling_val-768x211.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/model_scaling_val-240x66.png 240w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><figcaption class=\"wp-element-caption\">Figure 3: Pretraining on diverse data and increasing model size improves performance. (a) Performance versus ERA5 2021 at 6h lead time for models pretrained on different dataset configurations (i.e., no fine-tuning) labeled by C1-C4. The root mean square errors (RMSEs) are normalized by the performance of the ERA5-pretrained model (C1). Adding low-fidelity simulation data from CMIP6 (i.e., CMCC and IFS-HR) improves performance almost uniformly (C2). Adding even more simulation data improves performance further on most surface variables and for the atmospheric levels present in this newly added data (C3). Finally, configuration C4, which contains good coverage of the entire atmosphere and also contains analysis data from GFS achieves the best overall performance with improvements across the board. (b) Pretraining on many diverse data sources improves the forecasting of extreme values at 6h lead time across all surface variables of IFS-HRES 2022. Additionally, the results also hold on wind speed, which is a nonlinear function of 10U and 10V. (c) Bigger models obtain lower validation loss for the same amount of GPU hours. 
We fit a power law that roughly translates into a 5% reduction in the validation loss for every doubling of the model size.<\/figcaption><\/figure>\n\n\n\n<p>A direct consequence of Aurora&#8217;s scale, both in terms of architecture design and training data corpus, as well as its pretraining and fine-tuning protocols, is its superior performance over the best specialized deep learning models. As an additional validation of the benefits of fine-tuning a large model pretrained on many datasets, we compare Aurora against GraphCast &#8212; pretrained only on ERA5 and currently considered the most skillful AI model at 0.25-degree resolution and lead times up to five days. Additionally, we include IFS HRES in this comparison, the gold standard in numerical weather prediction. We show that Aurora outperforms both when measured against analysis, weather station observations, and extreme values.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"394\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/scorecard_gc_0.25_A.png\" alt=\"Scorecard versus GraphCast at 0.25-degrees resolution. Aurora matches or outperforms GraphCast on 94% of targets. Aurora obtains the biggest gains (40%) over GraphCast in the upper atmosphere, where GraphCast performance is known to be poor. Large improvements up to 10-15% are observed at short and long lead times. The two models are closest to each other in the lower atmosphere at the 2-3 day lead time, which corresponds to the lead time GraphCast was rollout-finetuned on. At the same time, GraphCast shows slightly better performance up to five days and at most levels on specific humidity (Q). 
\" class=\"wp-image-1042371\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/scorecard_gc_0.25_A.png 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/scorecard_gc_0.25_A-300x84.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/scorecard_gc_0.25_A-1024x288.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/scorecard_gc_0.25_A-768x216.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/scorecard_gc_0.25_A-240x68.png 240w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1393\" height=\"479\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/weather_station_obs_B.png\" alt=\"Root mean square error (RMSE) for Aurora, GraphCast, and IFS-HRES as measured by global weather stations during 2022 for wind speed and surface temperature.\" class=\"wp-image-1042377\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/weather_station_obs_B.png 1393w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/weather_station_obs_B-300x103.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/weather_station_obs_B-1024x352.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/weather_station_obs_B-768x264.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/weather_station_obs_B-240x83.png 240w\" sizes=\"auto, (max-width: 1393px) 100vw, 1393px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1407\" height=\"627\" 
src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/new_norm_method_2_two_sided_CI_fulllines_C.png\" alt=\"Thresholded RMSE for Aurora, GraphCast and IFS-HRES normalized by IFS-HRES performance. Aurora demonstrates improved prediction for the extreme values, or tails, of the surface variable distributions. In each plot values to the right of the centre line are cumulative RMSEs for targets found to sit above the threshold, and those to the left represent target values sitting below the threshold. \" class=\"wp-image-1042383\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/new_norm_method_2_two_sided_CI_fulllines_C.png 1407w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/new_norm_method_2_two_sided_CI_fulllines_C-300x134.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/new_norm_method_2_two_sided_CI_fulllines_C-1024x456.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/new_norm_method_2_two_sided_CI_fulllines_C-768x342.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/new_norm_method_2_two_sided_CI_fulllines_C-240x107.png 240w\" sizes=\"auto, (max-width: 1407px) 100vw, 1407px\" \/><figcaption class=\"wp-element-caption\">Figure 4: Aurora outperforms operational GraphCast across the vast majority of targets. (a) Scorecard versus GraphCast at 0.25-degrees resolution. Aurora matches or outperforms GraphCast on 94% of targets. Aurora obtains the biggest gains (40%) over GraphCast in the upper atmosphere, where GraphCast performance is known to be poor. Large improvements up to 10%-15% are observed at short and long lead times. The two models are closest to each other in the lower atmosphere at the 2-3 day lead time, which corresponds to the lead time GraphCast was rollout-finetuned on. 
At the same time, GraphCast shows slightly better performance up to five days and at most levels on specific humidity (Q). (b) Root mean square error (RMSE) and mean absolute error (MAE) for Aurora, GraphCast, and IFS-HRES as measured by global weather stations during 2022 for wind speed (left two panels) and surface temperature (right two panels). (c) Thresholded RMSE for Aurora, GraphCast and IFS-HRES normalized by IFS-HRES performance. Aurora demonstrates improved prediction for the extreme values, or tails, of the surface variable distributions. In each plot values to the right of the center line are cumulative RMSEs for targets found to sit above the threshold, and those to the left represent target values sitting below the threshold.<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"a-paradigm-shift-in-earth-system-modeling\">A paradigm shift in Earth system modeling&nbsp;<\/h2>\n\n\n\n<p>The implications of Aurora extend far beyond atmospheric forecasting. By demonstrating the power of foundation models in the Earth sciences, this research paves the way for the development of comprehensive models that encompass the entire Earth system. The ability of foundation models to excel at downstream tasks with scarce data could democratize access to accurate weather and climate information in data-sparse regions, such as the developing world and polar regions. This could have far-reaching impacts on sectors like agriculture, transportation, energy harvesting, and disaster preparedness, enabling communities to better adapt to the challenges posed by climate change.&nbsp;<\/p>\n\n\n\n<p>As the field of AI-based environmental prediction evolves, we hope Aurora will serve as a blueprint for future research and development. The study highlights the importance of diverse pretraining data, model scaling, and flexible architectures in building powerful foundation models for the Earth sciences. 
With continued advancements in computational resources and data availability, we can envision a future where foundation models like Aurora become the backbone of operational weather and climate prediction systems, providing timely, accurate, and actionable insights to decision-makers and the public worldwide.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"acknowledgements\">Acknowledgements<\/h2>\n\n\n\n<p>We are grateful for the contributions of Cristian Bodnar, a core contributor to this project.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Aurora, a new AI foundation model from Microsoft Research, can transform our ability to predict and mitigate extreme weather events and the effects of climate change by enabling faster and more accurate weather forecasts than ever before. <\/p>\n","protected":false},"author":37583,"featured_media":1041021,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"categories":[1],"tags":[],"research-area":[13556],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[243984],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-1040643","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-post-option-blog-homepage-featured"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[851467],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[1045410],"related-events":[],"related-researchers":[{"type":"user_nicename","value":"Wessel 
Bruinsma","user_id":42339,"display_name":"Wessel Bruinsma","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/wbruinsma\/\" aria-label=\"Visit the profile page for Wessel Bruinsma\">Wessel Bruinsma<\/a>","is_active":false,"last_first":"Bruinsma, Wessel","people_section":0,"alias":"wbruinsma"},{"type":"user_nicename","value":"Megan Stanley","user_id":41482,"display_name":"Megan Stanley","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/meganstanley\/\" aria-label=\"Visit the profile page for Megan Stanley\">Megan Stanley<\/a>","is_active":false,"last_first":"Stanley, Megan","people_section":0,"alias":"meganstanley"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-960x540.jpg\" class=\"img-object-cover\" alt=\"satellite image of Storm Ciar\u00e1n\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-240x135.jpg 240w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/NEW_Aurora-2024-BlogHeroFeature-1400x788-1.jpg 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"","formattedDate":"June 3, 2024","formattedExcerpt":"Aurora, a new AI foundation model from Microsoft Research, can transform our ability to predict and mitigate extreme weather events and the effects of climate change by enabling faster and more accurate weather forecasts than ever before.","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1040643","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/37583"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=1040643"}],"version-history":[{"count":43,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1040643\/revisions"}],"predecessor-version":[{"id":1047720,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1040643\/revisions\/1047720"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1041021"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1040643"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/
\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=1040643"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=1040643"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1040643"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=1040643"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=1040643"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1040643"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1040643"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1040643"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=1040643"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=1040643"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}