
Research Forum Brief | June 2024

MatterGen: A Generative Model for Materials Design


“Materials design is the cornerstone of modern technology. Many of the challenges our society is facing today are bottlenecked by finding a good material. … If we can find a novel material that conducts lithium very well, it will be a key component for our next-generation battery technology. The same applies to many other domains.”

Tian Xie, Principal Research Manager, Microsoft Research AI for Science

Transcript: Lightning Talk

Tian Xie introduces MatterGen, a generative model that creates new inorganic materials based on a broad range of property conditions required by the application, aiming to shift the traditional paradigm of materials design with generative AI.

Microsoft Research Forum, June 4, 2024

TIAN XIE: Hello, everyone. My name is Tian, and I’m from Microsoft Research AI for Science. I’m excited to be here to share with you MatterGen, our latest model that brings generative AI to materials design.

Materials design is the cornerstone of modern technology. Many of the challenges our society is facing today are bottlenecked by finding a good material. For example, if we can find a novel material that conducts lithium very well, it will be a key component for our next-generation battery technology. The same applies to many other domains, like finding a novel material for solar cells, carbon capture, and quantum computers. Traditionally, materials design is conducted by search-based methods. We search through a list of candidates and gradually filter them using a list of design criteria for the application. Like for batteries, we need the materials to contain lithium, to be stable, to have a high lithium-ion conductivity, and each filtering step can be conducted using simulation-based methods or AI emulators. At the end, we get five to 10 candidates that we’re sending to the lab for experimental synthesis.

In MatterGen, we hope to rethink this process with generative AI. We’re aiming to directly generate materials given the design requirements for the target application, bypassing the process of searching through candidates. You can think of it as using text-to-image generative models like DALL-E to generate the images given a prompt rather than needing to search through the entire internet for images via a search engine. The core of MatterGen is a diffusion model specifically designed for materials. A material can be represented by its unit cell, the smallest repeating unit of the infinite periodic structure. It has three components: atom types, atom positions, and periodic lattice. We designed the forward process to corrupt all three components towards a random structure and then have a model to reverse this process to generate a novel material. Conceptually, it is similar to using a diffusion model for images, but we build a lot of inductive bias like equivariance and periodicity into the model because we’re operating on a sparse data region as in most scientific domains.

Given this diffusion architecture, we train the base model of MatterGen using the structure of all known stable materials. Once trained, we can generate novel, stable materials by sampling from the base model unconditionally. To generate the material given desired conditions, we further fine-tune this base model by adding conditions to each layer of the network using a ControlNet-style parameter-efficient fine-tuning approach. The condition can be anything like a specific chemistry, symmetry, or any target property. Once fine-tuned, the model can directly generate the materials given desired conditions. Since we use fine-tuning, we only need a small labeled dataset to generate the materials given the corresponding condition, which is actually very useful for the users because it’s usually computationally expensive to generate a property-labeled dataset for materials.

Here’s an example of how MatterGen generates novel materials in the strontium-vanadium-oxygen chemical system. It generates candidates with lower energy than two other competing methods: random structure search and substitution. The resulting structure looks very reasonable and is proven to be stable using computational methods. MatterGen also generates materials given desired magnetic, electronic, and mechanical properties. The most impressive result here is that we can shift the distribution of generated materials towards extreme values compared with the training property distribution. This is very significant because most materials design problems involve finding materials with extreme properties, like finding superhard materials or magnets with high magnetism, which is difficult to do with traditional search-based methods and is the key advantage of generative models.

Our major next step is to bring these generative AI–designed materials into real life, making real-world impact in a variety of domains like battery design, solar cell design, and carbon capture. One limitation is that we have only validated these AI-generated materials using computation. We’re working with experimental partners to synthesize them in the wet lab. It is a nontrivial process, but we keep improving our model, getting feedback from the experimentalists, and we are looking forward to a future where generative AI–designed materials can make real-world impact in a broad range of domains. Here’s a link to our paper in case you want to learn more about the details. We look forward to any comments and feedback that you might have. Thank you very much.


MatterGen: Designing materials with generative AI 

By Tian Xie

MatterGen, a model developed by Microsoft Research AI for Science, applies generative AI to materials design.

Why is this important?

Materials design is the cornerstone of modern technology. Many of the challenges our society is facing today are bottlenecked by scientists’ inability to find good materials that can unlock solutions. If we can find a novel material that conducts lithium ions extremely well, for example, it will be a key component of next-generation battery technology. The same applies to many other domains, like finding novel materials for solar cells, carbon capture, and quantum computers.

Traditionally, materials design is conducted by search-based methods. We search through a list of candidates and gradually filter them down with a list of design requirements for the application. With batteries, for example, we need the material to contain lithium, to be stable, to have high lithium-ion conductivity, and so on. Each filtering step can be conducted using quantum mechanical simulations or AI emulators. Finally, we end up with 5-10 candidates that can be sent to the lab for experimental synthesis.
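The screening funnel described above can be sketched in a few lines of Python. This is only an illustration of the search-and-filter idea: the `Candidate` class, the thresholds, and every number in `POOL` are invented placeholders, not real materials data, and a real pipeline would obtain each property from simulations or AI emulators rather than from stored values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    label: str                 # placeholder name, not a real compound
    elements: frozenset        # chemical elements in the candidate
    energy_above_hull: float   # stability proxy in eV/atom (made-up value)
    li_conductivity: float     # Li-ion conductivity in S/cm (made-up value)


def screen(candidates, max_e_hull=0.1, min_conductivity=1e-4):
    """Apply each design requirement in turn and record how many survive."""
    filters = [
        ("contains Li", lambda c: "Li" in c.elements),
        ("stable", lambda c: c.energy_above_hull <= max_e_hull),
        ("conductive", lambda c: c.li_conductivity >= min_conductivity),
    ]
    survivors = list(candidates)
    funnel = [("start", len(survivors))]
    for name, keep in filters:
        survivors = [c for c in survivors if keep(c)]
        funnel.append((name, len(survivors)))
    return survivors, funnel


# Hypothetical candidate pool; all values are invented for illustration.
POOL = [
    Candidate("A", frozenset({"Li", "P", "S"}), 0.02, 1e-3),
    Candidate("B", frozenset({"Na", "Cl"}), 0.00, 1e-8),
    Candidate("C", frozenset({"Li", "O"}), 0.30, 1e-5),
    Candidate("D", frozenset({"Li", "La", "Zr", "O"}), 0.05, 1e-6),
]

if __name__ == "__main__":
    survivors, funnel = screen(POOL)
    for step, count in funnel:
        print(f"{step:12s} {count}")
    print([c.label for c in survivors])
```

The point of the sketch is the shape of the process: the search can only return what is already in the pool, which is the constraint a generative approach aims to remove.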

In MatterGen, we hope to rethink this process using generative AI. We aim to directly generate materials, given the design requirements for the target application, bypassing the tedious process of searching through a large number of candidates. You can think of it as using text-to-image generative models like DALL-E to generate images given a detailed prompt, rather than using a search engine to scour the entire Internet for specific images.

The core of MatterGen is a diffusion model specifically designed for materials. A material can be represented by its unit cell, the smallest repeating unit of the infinite periodic structure. It has three components: atom types, atom positions, and the periodic lattice. We design the forward process to corrupt all three components toward a random material, and then train a model to reverse the corruption process to generate novel materials. Conceptually, it is similar to a diffusion model for images, but we build a lot of inductive bias, like equivariance and periodicity, into the model because, as in most scientific domains, we operate in a sparse-data regime.
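To make the three-component representation concrete, here is a minimal sketch of a unit cell and a toy forward (corruption) step. It is not MatterGen's actual noise process: the `UnitCell` class, the `corrupt` function, the noise scales, and the uniform replacement of atom types are all assumptions chosen for readability. The one idea it does try to show faithfully is periodicity: noised fractional coordinates are wrapped back into the cell.

```python
import random
from dataclasses import dataclass


@dataclass
class UnitCell:
    atom_types: list    # atomic numbers, one per atom
    frac_coords: list   # fractional coordinates in [0, 1), one [x, y, z] per atom
    lattice: list       # 3x3 matrix whose rows are lattice vectors (angstroms)


def wrap(x):
    """Map a coordinate back into [0, 1), guarding against float round-up to 1.0."""
    x = x % 1.0
    return 0.0 if x >= 1.0 else x


def corrupt(cell, t, rng, num_elements=100, coord_sigma=0.5, lattice_sigma=2.0):
    """Toy forward step: noise level t in [0, 1] corrupts all three components."""
    # Atom types: replaced by a uniformly random element with probability t.
    types = [rng.randint(1, num_elements) if rng.random() < t else z
             for z in cell.atom_types]
    # Positions: Gaussian noise, then wrapped to respect periodic boundaries.
    coords = [[wrap(x + rng.gauss(0.0, coord_sigma * t)) for x in atom]
              for atom in cell.frac_coords]
    # Lattice: Gaussian noise on each matrix entry.
    lattice = [[v + rng.gauss(0.0, lattice_sigma * t) for v in row]
               for row in cell.lattice]
    return UnitCell(types, coords, lattice)


# Arbitrary three-atom cubic cell (Sr = 38, V = 23, O = 8); geometry is made up.
toy_cell = UnitCell(
    atom_types=[38, 23, 8],
    frac_coords=[[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [0.5, 0.5, 0.0]],
    lattice=[[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]],
)

if __name__ == "__main__":
    rng = random.Random(0)
    for t in (0.0, 0.5, 1.0):
        print(t, corrupt(toy_cell, t, rng).atom_types)
```

At `t = 0` the cell is returned unchanged, and at `t = 1` little of the original structure survives. A generative model would be trained to undo these steps one at a time, starting from the fully corrupted end.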

Given this diffusion architecture, we train the base model of MatterGen using the structure of all known stable materials. Once the model is trained, we can generate novel, stable materials by sampling from the base model unconditionally. 

To generate materials given the desired conditions, we further fine-tune this base model by adding conditions to each layer of the network, using a ControlNet-style parameter-efficient fine-tuning approach. The conditions can be anything, like a specific chemistry, symmetry, or any target property. Once fine-tuned, the model can directly generate materials given desired conditions. Since we use fine-tuning, we only need a small labeled material dataset to generate materials with the corresponding condition, which is very useful for users, because it is often computationally expensive to generate property labels for materials.
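The general idea behind this kind of parameter-efficient fine-tuning can be sketched as follows, under heavy simplification. The classes `FrozenLayer` and `ConditionedLayer` and the function `fine_tune_step` are hypothetical names for this illustration and do not reflect MatterGen's real architecture. The sketch assumes a single scalar condition, a per-feature linear layer, and a squared-error objective. It shows two properties commonly associated with ControlNet-style adapters: the adapter is zero-initialized, so the fine-tuned model starts out identical to the base model, and only the adapter parameters are updated.

```python
class FrozenLayer:
    """Stand-in for one pretrained base-model layer: y = w * x + b per feature."""

    def __init__(self, weights, bias):
        self.weights = weights   # kept fixed during fine-tuning
        self.bias = bias

    def __call__(self, hidden):
        return [w * h + b for w, h, b in zip(self.weights, hidden, self.bias)]


class ConditionedLayer:
    """Frozen base layer plus a small trainable adapter that injects a condition."""

    def __init__(self, base, width):
        self.base = base
        # Zero-initialized: the conditioned output initially equals the base output.
        self.adapter = [0.0] * width   # the only trainable parameters

    def __call__(self, hidden, condition=None):
        out = self.base(hidden)
        if condition is None:          # unconditional path, same as the base model
            return out
        return [o + a * condition for o, a in zip(out, self.adapter)]


def loss(layer, hidden, condition, target):
    pred = layer(hidden, condition)
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def fine_tune_step(layer, hidden, condition, target, lr=0.1):
    """One gradient-descent step on squared error, touching only the adapter."""
    pred = layer(hidden, condition)
    for i, (p, t) in enumerate(zip(pred, target)):
        grad = 2.0 * (p - t) * condition   # d(loss)/d(adapter[i])
        layer.adapter[i] -= lr * grad


if __name__ == "__main__":
    base = FrozenLayer(weights=[1.0, -2.0], bias=[0.0, 0.5])
    layer = ConditionedLayer(base, width=2)
    hidden, condition, target = [0.5, 1.0], 2.0, [1.5, -1.0]
    print("before:", loss(layer, hidden, condition, target))
    for _ in range(20):
        fine_tune_step(layer, hidden, condition, target)
    print("after: ", loss(layer, hidden, condition, target))
```

Because the base weights never change, the number of trainable parameters stays small, which is one reason a small labeled dataset can be enough.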

Here is an example of how MatterGen generates novel materials in the Sr-V-O (strontium-vanadium-oxygen) chemical system. It generates candidates with lower energy than two other competing methods: random structure search and substitution. The resulting structures look quite reasonable and are proven to be stable using computational methods.

MatterGen can also generate materials given desired magnetic, electronic, and mechanical properties. The most impressive result here is that we can shift the distribution of generated materials toward extreme values compared with the training property distribution. This is very significant, because most materials design problems involve finding materials with extreme properties, such as superhard materials or magnets with high magnetism, which is difficult with traditional search-based methods.

Our next major step is to use these generative AI-designed materials to make a real-world impact in a variety of domains, such as battery design, solar-cell design, and carbon capture. One limitation is that we have only validated these AI-generated materials with computation. We are working with experimental partners to synthesize them in the lab. This is not a trivial process, but we will keep improving our models with feedback from the experimentalists. We look forward to a future where generative AI can disrupt the current materials design process and find revolutionary materials that can positively change everyone’s life.