Research Forum Brief | September 2024

Project Aurora: The first large-scale foundation model of the atmosphere

Tue, 03 Sep 2024

The post Project Aurora: The first large-scale foundation model of the atmosphere appeared first on Microsoft Research.

Presented by Megan Stanley at Microsoft Research Forum, September 2024

Megan Stanley

“If we look at Aurora’s ability to predict pollutants such as nitrogen dioxide that are strongly related to emissions from human activity, we can see that the model has learned to make these predictions with no emissions data provided. It’s learned the implicit patterns that cause the gas concentrations, which is very impressive.”

Megan Stanley, Senior Researcher, Microsoft Research AI for Science

Transcript: Lightning Talk

Project Aurora: The first large-scale foundation model of the atmosphere

Megan Stanley, Senior Researcher, Microsoft Research AI for Science

This talk discusses Aurora, a cutting-edge foundation model that offers a new approach to weather forecasting that could transform our ability to predict and mitigate the impacts of extreme events, air pollution, and the changing climate.

Microsoft Research Forum, September 3, 2024

MEGAN STANLEY: Hi. My name is Megan Stanley, and I’m a senior researcher in Microsoft AI for Science, and I’d like to tell you all about Aurora, our foundation model of the atmosphere.

Now, weather forecasting is critical in our societies. Whether that’s for disaster management, planning supply chains and logistics, forecasting crop yields, or even just knowing whether we should take a jacket out when we leave the house in the morning, it has day-to-day significance for all of us and is very important to the functioning of our civilization. In addition, in the face of a changing climate, we need more than ever to predict how the patterns of our weather will change on an everyday basis as the earth system we all inhabit undergoes a shift.

Traditionally, the atmosphere and its interactions with the Earth’s surface and oceans, as well as the incoming energy from the sun, are modeled using very large systems of coupled differential equations. In practice, to make a forecast or simulate the atmosphere, these equations are numerically integrated on very large supercomputers. They also have to assimilate observations from the current state of the weather in order to have correct initial conditions. Putting all of this together means that making a single weather forecast is computationally extremely expensive and slow, and the simulation must be rerun for every new forecast. At the same time, the set of equations used cannot completely capture all of the atmospheric dynamics, and this ultimately limits the accuracy that can be obtained.
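As a toy illustration of what numerically integrating such equations involves (vastly simplified relative to the coupled systems that operational models solve, and not taken from Aurora or the IFS), here is a one-dimensional linear advection equation marched forward in time with a first-order upwind scheme:

```python
import numpy as np

# Toy sketch of the core operation of traditional forecasting: numerically
# integrate a PDE, here du/dt + c * du/dx = 0 on a periodic domain, with a
# first-order upwind scheme. Grid size, time step, and wave speed are
# illustrative only.

def advect(u, c=1.0, dx=0.1, dt=0.05, steps=100):
    """March the field u forward `steps` time steps (periodic domain)."""
    u = u.copy()
    for _ in range(steps):
        # Upwind difference for c > 0; stable because c*dt/dx = 0.5 < 1.
        u = u - c * dt / dx * (u - np.roll(u, 1))
    return u

x = np.linspace(0, 2 * np.pi, 64, endpoint=False)
u0 = np.sin(x)          # initial condition (the "assimilated" state)
u100 = advect(u0)       # the "forecast" after 100 steps
```

Even this toy solver must be rerun from fresh initial conditions for every new forecast; scaling the same idea to millions of grid points and dozens of coupled variables is what makes traditional forecasting so expensive.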

With Aurora, we aim to demonstrate state-of-the-art medium-range weather forecasting—that is, for time periods out to a couple of weeks—and to do so with a model that learns a good general representation of the atmosphere that can be tuned to many downstream tasks. It is our bet that, similar to the breakthroughs in natural language processing and image generation, we can make significant advances by training a large deep learning model on the vast quantity of Earth system data available to us.

Aurora represents huge progress. We demonstrate that it can be fine-tuned to state-of-the-art performance on operational weather forecasting, as well as previously unexplored areas in deep learning of atmospheric pollution prediction. It’s able to do all of this roughly 5,000 times faster than current traditional weather forecasting techniques. In addition, if we compare to the current state of the art in AI weather forecasting, the GraphCast model, we’re able to outperform it on 94 percent of targets, and we do so at a higher spatial resolution in line with the current traditional state of the art.

Aurora achieves this by training a larger model on a greater quantity, and greater diversity, of data. We also demonstrate that, as a foundation model, it can be fine-tuned on a wide range of very important downstream tasks. As a foundation model, Aurora operates using the pretrain–fine-tune paradigm. It’s initially trained on a large quantity of traditional weather forecasting and climate simulation data. This pretraining phase is designed to produce a model that carries within it a useful representation of the general behavior of the atmosphere, so that we can then fine-tune it to operate in scenarios where there is much less data, or lower-quality data.

So what are examples of the scarce-data scenario? One is weather forecasting at the resolution of the current gold standard among traditional methods, the IFS system, which operates at 0.1 degrees resolution, or approximately 10 kilometers. Another good example is the prediction of atmospheric pollution, including gases and particulates, where the current gold standard is an additional, very computationally expensive model applied to the IFS by the Copernicus Atmosphere Monitoring Service, or CAMS. This problem is generally very challenging for traditional forecasting systems, but it’s of critical importance.

We’re able to show that Aurora outperforms IFS on 92 percent of the operational targets, and it performs particularly well at forecast lead times longer than 12 hours, while being approximately 5,000 times faster. When we look at Aurora’s ability to predict weather station observations, including wind speed and temperature, it’s better in general than traditional forecasting systems. It really is able to make accurate predictions of the weather as we experience it on Earth.

On the atmospheric pollution task, Aurora is able to match or outperform CAMS in 74 percent of cases, and it does so without needing any emissions data as input. This task has never before been approached with an AI model. If we look at Aurora’s ability to predict pollutants such as nitrogen dioxide that are strongly related to emissions from human activity, we can see that the model has learned to make these predictions with no emissions data provided. It’s learned the implicit patterns that cause the gas concentrations, which is very impressive. It has also, very impressively, managed to learn atmospheric chemistry: you can see this here, where exposure to sunlight drives the difference between nighttime and daytime concentrations of nitrogen dioxide.

Aurora is also capable of forecasting extreme events as well as state-of-the-art traditional techniques can. Here it is seen correctly predicting the path of Storm Ciarán, which hit northwestern Europe in early November 2023, causing record-breaking damage and destruction. In particular, Aurora was the only AI model that could correctly predict the maximum wind speed during the storm as it picked up on making landfall.

In conclusion, Aurora is a foundation model that really is the state of the art in AI weather forecasting, and in weather forecasting in general, in terms of its ability to produce accurate operational forecasts. It does so 5,000 times faster than traditional weather forecasting techniques. Moreover, because it’s a foundation model, it unlocks new capabilities: it can be fine-tuned on downstream tasks where data is scarce or that haven’t been approached before. We believe that Aurora represents an incredibly exciting new paradigm in weather forecasting, much like the progress we’ve seen across the sciences, where the ability to train AI models at massive scale with vast quantities of accurate data has unlocked completely unforeseen capabilities.

If you want to learn more about how my colleagues and I at AI for Science achieve this, please refer to our publication. Thank you.

MatterGen: A Generative Model for Materials Design

Tue, 04 Jun 2024

The post MatterGen: A Generative Model for Materials Design appeared first on Microsoft Research.

Presented by Tian Xie at Microsoft Research Forum, June 2024

Tian Xie

“Materials design is the cornerstone of modern technology. Many of the challenges our society is facing today are bottlenecked by finding a good material. … If we can find a novel material that conducts lithium very well, it will be a key component for our next-generation battery technology. The same applies to many other domains.”

Tian Xie, Principal Research Manager, Microsoft Research AI for Science

Transcript: Lightning Talk

MatterGen: A Generative Model for Materials Design

Tian Xie, Principal Research Manager, Microsoft Research AI for Science

Tian Xie introduces MatterGen, a generative model that creates new inorganic materials based on a broad range of property conditions required by the application, aiming to shift the traditional paradigm of materials design with generative AI.

Microsoft Research Forum, June 4, 2024

TIAN XIE: Hello, everyone. My name is Tian, and I’m from Microsoft Research AI for Science. I’m excited to be here to share with you MatterGen, our latest model that brings generative AI to materials design.

Materials design is the cornerstone of modern technology. Many of the challenges our society is facing today are bottlenecked by finding a good material. For example, if we can find a novel material that conducts lithium very well, it will be a key component for our next-generation battery technology. The same applies to many other domains, like finding a novel material for solar cells, carbon capture, and quantum computers. Traditionally, materials design is conducted by search-based methods. We search through a list of candidates and gradually filter them using a list of design criteria for the application. Like for batteries, we need the materials to contain lithium, to be stable, to have a high lithium-ion conductivity, and each filtering step can be conducted using simulation-based methods or AI emulators. At the end, we get five to 10 candidates that we’re sending to the lab for experimental synthesis.

In MatterGen, we hope to rethink this process with generative AI. We’re aiming to directly generate materials given the design requirements for the target application, bypassing the process of searching through candidates. You can think of it as using text-to-image generative models like DALL-E to generate the images given a prompt rather than needing to search through the entire internet for images via a search engine. The core of MatterGen is a diffusion model specifically designed for materials. A material can be represented by its unit cell, the smallest repeating unit of the infinite periodic structure. It has three components: atom types, atom positions, and periodic lattice. We designed the forward process to corrupt all three components towards a random structure and then have a model to reverse this process to generate a novel material. Conceptually, it is similar to using a diffusion model for images, but we build a lot of inductive bias like equivariance and periodicity into the model because we’re operating on a sparse data region as in most scientific domains.

Given this diffusion architecture, we train the base model of MatterGen using the structure of all known stable materials. Once trained, we can generate novel, stable materials by sampling from the base model unconditionally. To generate the material given desired conditions, we further fine-tune this base model by adding conditions to each layer of the network using a ControlNet-style parameter-efficient fine-tuning approach. The condition can be anything like a specific chemistry, symmetry, or any target property. Once fine-tuned, the model can directly generate the materials given desired conditions. Since we use fine-tuning, we only need a small labeled dataset to generate the materials given the corresponding condition, which is actually very useful for the users because it’s usually computationally expensive to generate a property-labeled dataset for materials.

Here’s an example of how MatterGen generates novel materials in the strontium–vanadium–oxygen chemical system. It generates candidates with lower energy than two other competing methods: random structure search and substitution. The resulting structures look very reasonable and are proven to be stable using computational methods. MatterGen also generates materials given desired magnetic, electronic, and mechanical properties. The most impressive result here is that we can shift the distribution of generated materials towards extreme values compared with the training property distribution. This is very significant because most materials design problems involve finding materials with extreme properties, like finding superhard materials or magnets with high magnetism, which is difficult to do with traditional search-based methods and is the key advantage of generative models.

Our major next step is to bring these generative AI–designed materials into real life, making real-world impact in a variety of domains like battery design, solar cell design, and carbon capture. One limitation is that we have only validated these AI-generated materials using computation. We’re working with experimental partners to synthesize them in the wet lab. It is a nontrivial process, but we keep improving our model, getting feedback from the experimentalists, and we are looking forward to a future where generative AI–designed materials can make real-world impact in a broad range of domains. Here’s a link to our paper in case you want to learn more about the details. We look forward to any comments and feedback that you might have. Thank you very much.


MatterGen: Designing materials with generative AI 

By Tian Xie

MatterGen, a model developed by Microsoft Research AI for Science, applies generative AI to materials design.

Why is this important?

Materials design is the cornerstone of modern technology. Many of the challenges our society is facing today are bottlenecked by scientists’ inability to find good materials that can unlock solutions. If we can find a novel material that conducts lithium ions extremely well, for example, it will be a key component of next-generation battery technology. The same applies to many other domains, like finding novel materials for solar cells, carbon capture, and quantum computers. 

Traditionally, materials design is conducted by search-based methods. We search through a list of candidates and gradually filter them down with a list of design requirements for the application. With batteries, for example, we need the material to contain lithium, to be stable, to have high lithium-ion conductivity, and so on. Each filtering step can be conducted using quantum mechanical simulations or AI emulators. Finally, we end up with 5-10 candidates that can be sent to the lab for experimental synthesis.
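The search-and-filter workflow described above can be sketched as a chain of predicate filters; the candidate entries and the conductivity threshold below are invented for illustration, not real screening values:

```python
# Hedged sketch of search-based materials screening: successively filter a
# candidate list with per-application design criteria. The materials data
# here are illustrative placeholders.
candidates = [
    {"formula": "LiFePO4", "contains_li": True,  "stable": True, "li_conductivity": 1e-4},
    {"formula": "NaCl",    "contains_li": False, "stable": True, "li_conductivity": 0.0},
    {"formula": "Li3N",    "contains_li": True,  "stable": True, "li_conductivity": 1e-3},
]

filters = [
    lambda m: m["contains_li"],             # must contain lithium
    lambda m: m["stable"],                  # must be thermodynamically stable
    lambda m: m["li_conductivity"] > 1e-4,  # must conduct lithium ions well
]

for keep in filters:
    candidates = [m for m in candidates if keep(m)]

shortlist = [m["formula"] for m in candidates]  # -> ["Li3N"]
```

In practice, each predicate would be an expensive quantum mechanical simulation or an AI emulator rather than a dictionary lookup, which is exactly why narrowing a huge candidate pool this way is slow.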

In MatterGen, we hope to rethink this process using generative AI. We aim to directly generate materials, given the design requirements for the target application, bypassing the tedious process of searching through a large number of candidates. You can think of it as using text-to-image generative models like DALL-E to generate images given a detailed prompt, rather than using a search engine to scour the entire Internet for specific images.

The core of MatterGen is a diffusion model specifically designed for materials. A material can be represented by its unit cell, the smallest repeating unit of the infinite periodic structure. It has three components: atom types; atom positions; and the periodic lattice. We design the forward process to corrupt all three components toward a random material, and then train a model to reverse the corruption process to generate novel materials. Conceptually, it is similar to a diffusion model for images, but we build a lot of inductive bias, like equivariance and periodicity, into the model because we operate on the sparse data region as in most scientific domains.
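As a minimal sketch of the forward (corruption) half of such a diffusion process, the snippet below noises the three components of a unit cell; the noise schedules and the categorical corruption rule are placeholder choices, not MatterGen’s actual parameterization:

```python
import numpy as np

# Hedged sketch of the forward (corruption) process on the three parts of a
# crystal's unit cell: atom types, fractional atom positions, and lattice.
# As t goes from 0 to 1, the structure is pushed toward pure noise.

rng = np.random.default_rng(0)

def corrupt(atom_types, positions, lattice, t, n_species=100):
    """Corrupt a structure toward a random one; t=0 leaves it unchanged."""
    # Atom types (categorical): with probability t, resample uniformly.
    resample = rng.random(atom_types.shape) < t
    random_types = rng.integers(0, n_species, atom_types.shape)
    noisy_types = np.where(resample, random_types, atom_types)

    # Fractional coordinates: add wrapped Gaussian noise (stay in the cell).
    noisy_pos = (positions + t * rng.normal(size=positions.shape)) % 1.0

    # Lattice matrix: interpolate toward a random lattice.
    random_lattice = rng.normal(size=lattice.shape)
    noisy_lattice = (1 - t) * lattice + t * random_lattice
    return noisy_types, noisy_pos, noisy_lattice

types = np.array([38, 23, 8, 8, 8])   # e.g. Sr, V, O, O, O
pos = rng.random((5, 3))              # fractional coordinates
lat = np.eye(3) * 4.0                 # illustrative cubic cell
noisy = corrupt(types, pos, lat, t=0.5)
```

The generative model is then trained to reverse this corruption, so that sampling starts from pure noise and denoises step by step into a plausible structure.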

Given this diffusion architecture, we train the base model of MatterGen using the structure of all known stable materials. Once the model is trained, we can generate novel, stable materials by sampling from the base model unconditionally. 

To generate materials given the desired conditions, we further fine-tune this base model by adding conditions to each layer of the network, using a ControlNet-style parameter-efficient fine-tuning approach. The conditions can be anything, like a specific chemistry, symmetry, or any target property. Once fine-tuned, the model can directly generate materials given desired conditions. Since we use fine-tuning, we only need a small labeled material dataset to generate materials with the corresponding condition, which is very useful for users, because it is often computationally expensive to generate property labels for materials.
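A single-layer sketch of this conditioning scheme follows; the shapes and the zero-initialization rule are illustrative assumptions, not MatterGen’s actual adapter architecture:

```python
import numpy as np

# Hedged sketch of ControlNet-style parameter-efficient fine-tuning: the
# pretrained weight stays frozen while a new, zero-initialised adapter
# injects the condition into the layer's output.

rng = np.random.default_rng(0)
d_hidden, d_cond = 8, 3

W_base = rng.normal(size=(d_hidden, d_hidden))  # frozen pretrained weight
W_cond = np.zeros((d_hidden, d_cond))           # trainable adapter, zero-init

def layer(x, cond):
    # Because W_cond starts at zero, the fine-tuned layer initially
    # reproduces the pretrained layer exactly, so adding the condition
    # cannot damage the pretrained representation at step 0.
    return W_base @ x + W_cond @ cond

x = rng.normal(size=d_hidden)   # a hidden activation
c = rng.normal(size=d_cond)     # e.g. an embedded target property
out = layer(x, c)               # identical to the base layer before training
```

During fine-tuning, only the adapter weights (here W_cond, and their analogues in every layer) would receive gradients, which is one reason a small labeled dataset suffices.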

Here is an example of how MatterGen generates novel materials in the Sr-V-O (Strontium-Vanadium-Oxygen) chemical system. It generates candidates with lower energy than two other competing methods: random structure search and substitution. The resulting structures look quite reasonable and are proven to be stable using computational methods. 

MatterGen can also generate materials given desired magnetic, electronic, and mechanical properties. The most impressive result here is that we can shift the distribution of generated materials toward extreme values compared with the training property distribution. This is very significant, because most materials design problems involve finding materials with extreme properties, such as superhard materials or magnets with high magnetism, which is difficult with traditional search-based methods. 

Our next major step is to use these generative AI–designed materials to make real-world impact in a variety of domains, such as battery design, solar-cell design, and carbon capture. One limitation is that we have only validated these AI-generated materials with computation. We are working with experimental partners to synthesize them in the lab. This is not a trivial process, but we will keep improving our models with feedback from the experimentalists. We look forward to a future where generative AI can disrupt the current materials design process and find revolutionary materials that can positively change everyone’s life.


Introducing CliffordLayers: Neural network layers inspired by Clifford / Geometric Algebras

Thu, 09 Mar 2023

The post Introducing CliffordLayers: Neural Network layers inspired by Clifford / Geometric Algebras.  appeared first on Microsoft Research.


We are open sourcing CliffordLayers, a repo for building neural network layers inspired by Clifford / Geometric Algebras. This repo contains the source code of our ICLR 2023 paper Clifford Neural Layers for PDE Modeling and will be populated with the new geometric layers introduced in the paper Geometric Clifford Algebra Networks.

Partial differential equations (PDEs) are used widely across the sciences and engineering to describe physical processes in which scalar and vector fields interact and coevolve over time. 

Below, we show an example of a scalar field – the pressure field – on the left, and a vector field – the wind velocity field – on the right. Those fields interact with each other. When pressure gradients occur, i.e., when the pressure field changes, the velocity field changes too. Consequently, net air movement from high to low pressure regions arises, which in turn affects the pressure field. Those two fields are intrinsically coupled, and it intuitively makes sense to treat them as one object instead of three different channel dimensions.   

Exemplary scalar and vector fields. Vector components of the wind velocities (right) are strongly related, i.e. they form a vector field. Additionally, the wind vector field and the scalar pressure field (left) are related since the gradient of the pressure field causes air movement and subsequently influences the wind components.

Most current methods do not explicitly take into account the relationship between different fields and their internal components, which are often correlated. In this work, we propose to view the time evolution of such correlated fields through the lens of multivector fields. Multivector fields consist of scalar and vector components, as well as higher-order components such as bivectors and trivectors. Their algebraic operations, such as multiplication, addition, and other arithmetic, can be described by Clifford algebras. 
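As a concrete, minimal example (not taken from the CliffordLayers codebase), here is the geometric product in the two-dimensional algebra Cl(2,0), whose multivectors combine a scalar, two vector components, and one bivector component:

```python
import numpy as np

# Geometric (Clifford) product in Cl(2,0): basis {1, e1, e2, e12} with
# e1*e1 = e2*e2 = 1, e1*e2 = -e2*e1 = e12, and e12*e12 = -1. A multivector
# is stored as the coefficient array [scalar, e1, e2, e12].

def geometric_product(a, b):
    """Product of multivectors a, b given as [scalar, e1, e2, e12]."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([
        a0 * b0 + a1 * b1 + a2 * b2 - a3 * b3,  # scalar part
        a0 * b1 + a1 * b0 - a2 * b3 + a3 * b2,  # e1 part
        a0 * b2 + a1 * b3 + a2 * b0 - a3 * b1,  # e2 part
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,  # e12 (bivector) part
    ])

e1 = np.array([0.0, 1.0, 0.0, 0.0])
e2 = np.array([0.0, 0.0, 1.0, 0.0])
e12 = geometric_product(e1, e2)   # the bivector e1 e2
```

Clifford convolution layers can then be built by replacing the ordinary scalar multiply-accumulate inside a convolution with this product between multivector-valued kernels and fields.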

Multivector components of 3-dimensional Clifford algebras. In contrast to standard vector algebra, higher order objects such as bivectors and trivectors exist. All spatial primitives can be combined into one multivector. 

Clifford algebras allow us to extend operations such as convolutions and Fourier transforms over multivector fields. Consequently, we can build Clifford convolution and Clifford Fourier transform layers. The resulting Clifford neural layers are universally applicable and will find direct use in the areas of fluid dynamics, weather forecasting, and the modeling of physical systems in general. 

Sketch of 2-dimensional Clifford convolution (left) and Clifford Fourier transform layers (right). For Clifford convolution, multivector input fields are convolved with multivector kernels. The Clifford Fourier transform uses the dual structure of multivectors and performs Fast Fourier Transforms (FFTs) over the dual parts.

We empirically evaluate the benefit of these layers by replacing convolution and Fourier operations in common neural PDE surrogates with their Clifford counterparts on two-dimensional Navier-Stokes and weather modeling tasks, as well as the three-dimensional Maxwell equations. Clifford neural layers consistently improve the generalization capabilities of the tested neural PDE surrogates. Probably the most notable performance improvement is observed when modeling Maxwell’s equations in three dimensions, which describe the evolution of the electric and magnetic fields over time. We can treat these six field components (three for each field) as one multivector field, and thus operate directly on it via 3-dimensional Clifford convolutions and Clifford Fourier transforms. 

Results on the Maxwell equations obtained by Fourier based architectures, i.e., Fourier Neural Operators (FNO) and Clifford Fourier Neural Operators (CFNO). Results are shown for next frame predictions, rollout loss, and displacement field D and magnetization field H. Models are trained on four training sets with increasing number of trajectories. 

This work is being undertaken by members of Microsoft Research AI4Science and Microsoft Autonomous Systems and Robotics Research: Johannes Brandstetter, Rianne van den Berg, Max Welling, and Jayesh K. Gupta. 
