Research Forum Brief | June 2024

MatterGen: A Generative Model for Materials Design

Tian Xie

“Materials design is the cornerstone of modern technology. Many of the challenges our society is facing today are bottlenecked by finding a good material. … If we can find a novel material that conducts lithium very well, it will be a key component for our next-generation battery technology. The same applies to many other domains.”

Tian Xie, Principal Research Manager, Microsoft Research AI for Science

Transcript: Lightning Talk

Tian Xie introduces MatterGen, a generative model that creates new inorganic materials based on a broad range of property conditions required by the application, aiming to shift the traditional paradigm of materials design with generative AI.

Microsoft Research Forum, June 4, 2024

TIAN XIE: Hello, everyone. My name is Tian, and I’m from Microsoft Research AI for Science. I’m excited to be here to share with you MatterGen, our latest model that brings generative AI to materials design.

Materials design is the cornerstone of modern technology. Many of the challenges our society is facing today are bottlenecked by finding a good material. For example, if we can find a novel material that conducts lithium very well, it will be a key component for our next-generation battery technology. The same applies to many other domains, like finding a novel material for solar cells, carbon capture, and quantum computers. Traditionally, materials design is conducted by search-based methods. We search through a list of candidates and gradually filter them using a list of design criteria for the application. For batteries, for example, we need the materials to contain lithium, to be stable, and to have a high lithium-ion conductivity. Each filtering step can be conducted using simulation-based methods or AI emulators. At the end, we get five to 10 candidates that we send to the lab for experimental synthesis.
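The search-based funnel described above can be sketched in a few lines. This is a minimal illustration, not the workflow used in the talk: the `Candidate` fields, the threshold values, and the four entries in `pool` are all hypothetical, and in practice each property would come from a simulation or an AI emulator rather than a hard-coded number.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    elements: frozenset
    energy_above_hull: float  # eV/atom; lower means more stable (toy value)
    li_conductivity: float    # S/cm (toy value)


def screen(candidates, max_e_hull=0.1, min_conductivity=1e-4):
    """Apply the three battery criteria in order, narrowing the list each time.

    Both thresholds are assumptions chosen for illustration.
    """
    has_li = [c for c in candidates if "Li" in c.elements]
    stable = [c for c in has_li if c.energy_above_hull <= max_e_hull]
    conductive = [c for c in stable if c.li_conductivity >= min_conductivity]
    return conductive


# Made-up entries for illustration only; these are not real measurements.
pool = [
    Candidate("toy-A", frozenset({"Li", "P", "S"}), 0.02, 1e-3),
    Candidate("toy-B", frozenset({"Na", "Cl"}), 0.00, 1e-6),   # no lithium
    Candidate("toy-C", frozenset({"Li", "O"}), 0.50, 1e-2),    # unstable
    Candidate("toy-D", frozenset({"Li", "Cl"}), 0.01, 1e-8),   # poor conductor
]

shortlist = screen(pool)
print([c.name for c in shortlist])
```

The point of the sketch is that the output can only ever be a subset of the starting list, which is the limitation the generative approach is meant to remove.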

In MatterGen, we hope to rethink this process with generative AI. We’re aiming to directly generate materials given the design requirements for the target application, bypassing the process of searching through candidates. You can think of it as using text-to-image generative models like DALL-E to generate images from a prompt rather than needing to search through the entire internet for images via a search engine. The core of MatterGen is a diffusion model specifically designed for materials. A material can be represented by its unit cell, the smallest repeating unit of the infinite periodic structure. It has three components: atom types, atom positions, and the periodic lattice. We designed the forward process to corrupt all three components towards a random structure and then train a model to reverse this process to generate a novel material. Conceptually, it is similar to using a diffusion model for images, but we build a lot of inductive bias like equivariance and periodicity into the model because we’re operating in a sparse-data regime, as in most scientific domains.
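To make the three-component representation concrete, here is a rough sketch of a unit cell and a single corruption step. It assumes simple noise choices (random resampling for atom types, wrapped Gaussian noise for fractional positions, Gaussian noise on lattice entries) purely for illustration; the talk does not specify the noise distributions MatterGen uses, and the names `UnitCell` and `corrupt` are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class UnitCell:
    atom_types: list   # atomic numbers, one per atom
    frac_coords: list  # fractional positions in [0, 1), one [x, y, z] per atom
    lattice: list      # 3x3 matrix whose rows are the lattice vectors


def corrupt(cell, noise_level, rng, num_elements=100):
    """One illustrative noising step; noise_level is between 0 and 1."""
    # Atom types: resample each atom's element with probability noise_level.
    types = [
        rng.randint(1, num_elements) if rng.random() < noise_level else z
        for z in cell.atom_types
    ]
    # Positions: add Gaussian noise, then wrap back into the cell.
    # The modulo is what encodes periodicity in this sketch.
    coords = [
        [(x + rng.gauss(0.0, noise_level)) % 1.0 for x in xyz]
        for xyz in cell.frac_coords
    ]
    # Lattice: add Gaussian noise to every matrix entry.
    lattice = [
        [v + rng.gauss(0.0, noise_level) for v in row] for row in cell.lattice
    ]
    return UnitCell(types, coords, lattice)


# Toy two-atom cubic cell; the numbers are placeholders, not a real structure.
cell = UnitCell(
    atom_types=[11, 17],
    frac_coords=[[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]],
    lattice=[[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]],
)

noisy = corrupt(cell, noise_level=0.3, rng=random.Random(0))
print(noisy)
```

A generative model would be trained to undo steps like this one, starting from a fully random structure and ending at a plausible material.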

Given this diffusion architecture, we train the base model of MatterGen using the structures of all known stable materials. Once trained, we can generate novel, stable materials by sampling from the base model unconditionally. To generate materials given desired conditions, we further fine-tune this base model by adding conditions to each layer of the network using a ControlNet-style parameter-efficient fine-tuning approach. The condition can be anything like a specific chemistry, symmetry, or any target property. Once fine-tuned, the model can directly generate materials given the desired conditions. Since we use fine-tuning, we only need a small labeled dataset to generate materials for the corresponding condition, which is very useful for users because it’s usually computationally expensive to generate a property-labeled dataset for materials.
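One way to picture this kind of fine-tuning is a frozen base layer plus a small trainable adapter that injects the condition. The sketch below borrows the zero-initialization idea from ControlNet, so the fine-tuned layer starts out behaving exactly like the base layer; whether MatterGen's adapters take this exact form is an assumption, and `ConditionedLayer` is a hypothetical name.

```python
def linear(weights, bias, x):
    """Compute y = W x + b using plain Python lists."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]


class ConditionedLayer:
    """A frozen base layer with a zero-initialized condition adapter."""

    def __init__(self, base_w, base_b, cond_dim):
        # Base weights come from the pretrained model and stay frozen.
        self.base_w, self.base_b = base_w, base_b
        out_dim = len(base_w)
        # Only these adapter parameters would be updated during fine-tuning.
        self.adapter_w = [[0.0] * cond_dim for _ in range(out_dim)]
        self.adapter_b = [0.0] * out_dim

    def forward(self, hidden, condition=None):
        out = linear(self.base_w, self.base_b, hidden)
        if condition is not None:
            delta = linear(self.adapter_w, self.adapter_b, condition)
            out = [o + d for o, d in zip(out, delta)]
        return out


layer = ConditionedLayer(
    base_w=[[1.0, 0.0], [0.0, 1.0]], base_b=[0.0, 0.0], cond_dim=1
)
hidden = [2.0, 3.0]
target_property = [1.0]  # e.g. a normalized property value (placeholder)

# Before fine-tuning, the adapter contributes nothing.
before = layer.forward(hidden, target_property)

# Stand-in for a fine-tuning update: set one adapter weight by hand.
layer.adapter_w[0][0] = 0.5
after = layer.forward(hidden, target_property)
print(before, after)
```

Because only the adapter parameters are trained, the number of labeled examples needed can stay small compared with training the whole network from scratch.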

Here’s an example of how MatterGen generates novel materials in the strontium-vanadium-oxygen chemical system. It generates candidates with lower energy than two other competing methods: random structure search and substitution. The resulting structure looks very reasonable and is shown to be stable using computational methods. MatterGen also generates materials given desired magnetic, electronic, and mechanical properties. The most impressive result here is that we can shift the distribution of generated materials towards extreme values compared with the property distribution of the training data. This is very significant because most materials design problems involve finding materials with extreme properties, like superhard materials or magnets with high magnetism, which is difficult to do with traditional search-based methods and is the key advantage of generative models.

Our major next step is to bring these generative AI–designed materials into real life, making real-world impact in a variety of domains like battery design, solar cell design, and carbon capture. One limitation is that we have only validated these AI-generated materials using computation. We’re working with experimental partners to synthesize them in the wet lab. It is a nontrivial process, but we keep improving our model, getting feedback from experimentalists, and we are looking forward to a future where generative AI–designed materials can make real-world impact in a broad range of domains. Here’s a link to our paper in case you want to learn more about the details. We look forward to any comments and feedback that you might have. Thank you very much.