This is a linkpost for https://bosoncutter.substack.com/p/there-is-no-alphafold-for-materials

I recently attended a small materials + AI conference that pulled together academics, materials industry executives, former cabinet members, startup founders, and military officers. This post summarizes how my thinking on AI + materials science has updated based on those conversations and my own research. Full disclosure: I am not an expert in this space. I cold-emailed my way into a conversation with one of the organizers, and he thought some research I did on self-driving labs was interesting enough to warrant an invite.
“AlphaFold, but for materials?”

One central question was “what is the AlphaFold of materials science, and how can we get it in 6-24 months?” This is an ill-posed question because proteins have a bunch of nice properties that materials don’t have. Namely, amino acid sequences are basically sufficient to determine structure in the physiological conditions of interest, but composition or even lattice unit cells are not sufficient to determine the structure of Real Physical Materials. Despite what everyone wants to believe based on standard bulk modeling techniques, most materials are disordered and have complex interfaces, and if you can’t capture the implications of these, models will output garbage.
“God made the bulk; the surface was invented by the devil” - Wolfgang Pauli
Proteins also have only 4 discrete scales of description: primary structure (the amino acid sequence), secondary structure (alpha helices, beta sheets, etc.), tertiary structure (larger-scale folding), and quaternary structure (multi-subunit interactions). Some proteins are quite large, but this pales in comparison to materials science problems, where you have 8 orders of magnitude to cover, from angstrom-scale interatomic potentials to centimeter-scale fracture mechanics. In other words, there are far more degrees of freedom needed to fully describe a physical-scale material, so even finding the right abstraction of a material to “tokenize” is unsolved.
The challenge of building something like the Protein Data Bank (PDB) is compounded by the fact that there are just so many properties of materials you can measure and care about, all of which differ across length scales and depend heavily on external conditions like temperature and stress. Of course, AlphaFold does not attempt to answer every possible question about a protein given its amino acid sequence, but the point is that the general materials problem is significantly more complex along most dimensions.
Another main difference is that while proteins have one fabrication method (you feed mRNA to a ribosome and out comes your polypeptide), material properties depend heavily on both composition and fabrication technique. Basically every fabrication step is noncommutative with respect to the final figures of merit, and these processes must be tediously optimized through a combinatorially exploding landscape of orderings and parameters. A general-purpose materials model needs to understand both the material’s properties at all scales and the path-dependent processing steps that produce them.
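To get a feel for how fast that landscape blows up, here is a toy back-of-the-envelope calculation. The step count and parameter grid are invented for illustration, not taken from any real process:

```python
from math import factorial

# Toy recipe: a handful of ordered fabrication steps, each with one coarsely
# discretized knob. Both the ordering and the knob settings affect the result.
n_steps = 6            # hypothetical: deposit, anneal, etch, dope, polish, passivate
settings_per_step = 5  # e.g. five temperature/time setpoints per step

orderings = factorial(n_steps)                   # step order matters (noncommutative)
parameter_combos = settings_per_step ** n_steps  # knob grid for a fixed ordering

total_recipes = orderings * parameter_combos
print(f"{orderings} orderings x {parameter_combos} settings = {total_recipes:,} recipes")
# 720 x 15,625 = 11,250,000 candidate recipes, with only one coarse knob per step.
```

Even this cartoon version is hopelessly beyond exhaustive experimentation, and real recipes have more steps and continuous parameters.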
The challenge of collecting good data is further compounded by minute differences in how recipes are implemented. Sharing data across labs is almost meaningless for some problems because vacuum chambers get dirty and calibrations drift. These problems blow out repeatability metrics even before you get to high-throughput growth and characterization, where cycling other materials through the same tools introduces further contamination.
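To make that concrete, here is a small synthetic simulation (all numbers invented) of how per-lab systematic offsets, the kind you get from dirty chambers and drifted calibrations, can swamp within-lab measurement noise when data is naively pooled:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup: 8 labs each measure the "same" material property 20 times.
# Each lab carries its own systematic offset (contamination, calibration drift).
n_labs, n_repeats = 8, 20
true_value = 100.0
lab_offsets = rng.normal(0.0, 5.0, size=n_labs)          # between-lab systematic error
noise = rng.normal(0.0, 1.0, size=(n_labs, n_repeats))   # within-lab repeatability

measurements = true_value + lab_offsets[:, None] + noise

repeatability = measurements.std(axis=1).mean()   # spread within a single lab
reproducibility = measurements.std()              # spread when all labs are pooled
print(f"within-lab std ~ {repeatability:.1f}, pooled std ~ {reproducibility:.1f}")
# The pooled spread is several times the within-lab spread, so a model trained on
# naively merged data mostly learns lab-to-lab artifacts rather than physics.
```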
Data is King
Given all of these complications, the path forward requires massive amounts of high-quality experimental data. But there is no PDB analog for materials, and building one is a much harder problem than most AI-first companies seem to appreciate. The only database that comes close is the Materials Project, which hosts density functional theory (DFT) simulations of thousands of materials, but DFT is often wrong in important ways.
This is germanium’s band structure as computed by DFT on the Materials Project website. Germanium is a semiconductor with an experimental band gap of about 0.67 eV, but there is no gap present here at all - it shows up as a conductor, which is obviously wrong.
During coffee chats before the conference, I talked to various employees at labs working on “AI + Science”, and their backgrounds are for the most part SWE and ML. Their research tastes and optimism have been shaped by a world in which strong scaling laws have held for more than a decade, and they more or less believe a software-first approach with a little physical verification on the side will be able to Solve Science.
The ethos of “feed the data-hungry machine and let it rip” that has led to such impressive emergent behavior in text-based AI stands in stark contrast to the prevailing data-starved models in physical materials. Bayesian models are the norm for interfacing with experimental systems because of the time and money cost of data acquisition, but they must be given nontrivial physical priors to converge on relatively little data. Techniques from robotics like high-quality simulation + RL are not applicable, because robotics is a control problem for well-characterized physics while the question here is a control problem for poorly understood multiscale interactions.
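For a flavor of what these data-starved loops look like, here is a minimal Bayesian optimization sketch over a single hypothetical process knob. The objective function and kernel choice are stand-ins, not any real process; real setups bake much stronger physical priors into the model and treat every evaluation as a slow, expensive experiment:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, Matern

# Hypothetical objective: a figure of merit vs. one process knob (say, anneal
# temperature). In reality each call is days of lab time, not a function call.
def run_experiment(x):
    return float(-(x - 650.0) ** 2 / 5000.0 + np.random.normal(0, 0.05))

# The smoothness assumption in the Matern kernel stands in for a "physical prior";
# real work encodes far more structure (phase boundaries, monotonicity, simulators).
kernel = ConstantKernel(1.0) * Matern(length_scale=50.0, nu=2.5)
gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-3, normalize_y=True)

X = np.array([[400.0], [900.0]])                # two seed experiments
y = np.array([run_experiment(x[0]) for x in X])
grid = np.linspace(300.0, 1000.0, 200).reshape(-1, 1)

for _ in range(10):                             # tiny budget: ~a dozen runs total
    gp.fit(X, y)
    mu, sigma = gp.predict(grid, return_std=True)
    best = y.max()
    z = (mu - best) / np.maximum(sigma, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)   # expected improvement
    x_next = grid[np.argmax(ei)]
    X = np.vstack([X, [x_next]])
    y = np.append(y, run_experiment(x_next[0]))

print(f"best knob setting ~ {X[np.argmax(y)][0]:.0f}, best value ~ {y.max():.3f}")
```

The entire budget here is about a dozen evaluations, which is the regime experimentalists actually live in.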
There is reason to believe the most well-funded startups and companies in the space are not datapilled. Periodic Labs, Lila Sciences, and FutureHouse are hiring people to build automated experimental systems, but the ratio of ML-engineer to experimentalist job postings does not indicate they really believe data is the bottleneck. Google DeepMind has only just now started posting jobs for their automated labs in London, and this has been a very slow update for them: they wrote the GNoME paper in 2023 that claimed to discover hundreds of thousands of new materials, but then, to my knowledge, not a single new material emerged.
GNoME’s computational pipeline
My understanding of these companies’ strategy is to do broad but cheap simulation, perform more expensive simulation on a subset of candidates, and then physically synthesize and validate a still smaller subset. It seems their bet is still to focus on the first two steps of the chain, because those steps only involve shuffling bits around, and that’s what they’re experts in.
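Schematically, the funnel looks something like the sketch below. Stage names, thresholds, and success rates are all made up; the only point is that each stage is pricier and keeps fewer candidates:

```python
import random

random.seed(0)

# Stand-in stage functions: random scores in place of real surrogate models,
# DFT checks, and lab synthesis. Only the shape of the funnel is the point.
def cheap_surrogate_score(c):     # broad, cheap simulation / ML screening
    return random.random()

def expensive_simulation_ok(c):   # costlier first-principles check on survivors
    return random.random() > 0.7

def synthesis_succeeds(c):        # the physical bottleneck: make it and measure it
    return random.random() > 0.9

candidates = range(100_000)
shortlist = [c for c in candidates if cheap_surrogate_score(c) > 0.99]
simulated = [c for c in shortlist if expensive_simulation_ok(c)]
made = [c for c in simulated if synthesis_succeeds(c)]
print(len(candidates), len(shortlist), len(simulated), len(made))
```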
Yet there is no real equivalent of “scraping the internet” that justifies this approach to me. The most comprehensive materials datasets in the world are almost surely locked away at TSMC, 3M, DuPont, and the like. The solution that datapilled startups and nonprofits are converging on is to build “data factories” that synthesize and characterize as many materials as possible in a standardized environment. These include Materials Data Factory, Mattiq, Radical, and Dunia. Unfortunately, they have much less funding than the ML-pilled companies. This sort of data production, standardization, and hyperscaling project seems to be what the national labs are well-suited for, but they will not have the budgets of some of the private companies; autonomous labs are but one small piece of the recently-announced Genesis Mission.
Of the approaches on the table, I am most confident this data factory approach will lead to the kind of general-purpose physical materials model that can actually design new materials and manufacturing methods from just a prompt of specifications. Even then, I think it’s quite far off because of the sheer amount of high-quality data that must be collected, and it would not surprise me if it takes decades to have an “AlphaFold for materials” moment.
No one is feeling the physical AGI
The overwhelming takeaway from talking to fellow attendees was that no one really has a plausible path to a general materials model, and that full general materials intelligence will come long after a software-only AGI. These are not AI pessimists in the slightest; most are active users of both classical ML for experimental data and text-based AI. Some speakers even floated the idea that materials research is going to be “solved” not by some physics god-model or fully automated labs, but by intelligent humanoid robots massively scaling up the current research paradigm.
Even if a materials model comes out tomorrow that can give a fully optimized recipe in response to “make me a new semiconductor manufacturing process that beats TSMC, make no mistakes”, others emphasized how slow adoption will be in some industries. New parts for the military and other safety-critical industries must go through extensive qualification processes, which take years and tens of millions of dollars of performance and longevity testing to pass.
And none of this is to mention the complications from supply chain shortages and manufacturing scale-up once you’ve identified a new material and killer application. Interesting new devices are often made from rare earths and other materials that are hard to source, and it is not possible for many precursor supply chains to absorb 10x demand overnight (just look at InP wafers). Once all of these problems have been solved, production must then scale by 4-10 orders of magnitude to make a dent in existing markets. Hyperscaling R&D into industrial manufacturing is a skill that has historically taken decades of expertise accumulation, and it’s not clear how AI will speed up many of the bottlenecks in the physical world like cultivating robust supply chains, building new factories, and optimizing yield in N-of-1 facilities.
Takeaways
I do not think the materials science community is going to get its “AlphaFold moment” anytime soon. The problem is harder along almost every axis: more degrees of freedom, path-dependent fabrication, scarce and noisy data, and no universal representation scheme. The companies that seem best-positioned to make progress are the ones investing in standardized, high-throughput data generation, or those already sitting on large proprietary datasets, not the (well-funded) ones assuming they can simulate their way to breakthroughs with the data that exists today.
And even after such a model exists, the pipeline from prompt to product is much less clear than merely ramping GPU capacity and building enormous but mostly identical datacenters. In general, I think this should be a significant update away from any “scientific revolution every year with rapid-onset hyperabundance” kind of scenario. The space needs many more concrete proposals for scaling in the real world, though funders are clearly willing to back orgs that look promising.
Thanks to Charles Yang for helpful discussions. All opinions are my own.