Maria Chan on ML for Material Structures

February 12, 2025 00:23:30
Carry the Two

Show Notes

Welcome to Carry the Two, the podcast about how math and statistics impact the world around us from the Institute for Mathematical and Statistical Innovation. While we’re in between our more in-depth seasons, we like to bring you something a little different in mini-season format. And for this mini season, we are going to highlight some of the amazing researchers who have presented at IMSI over the past year. Our second guest is Maria Chan, a scientist at Argonne National Laboratory working at the Center for Nanoscale Materials, who focuses on computational research involving materials in chemistry using a combination of physics, artificial intelligence, and machine learning. Maria joined us at IMSI for a workshop on Machine Learning in Electronic Structure Theory, where she presented a talk titled Theory-informed AI/ML for Microscopy & Spectroscopy. Host Sam Hansen spoke with Maria about the research in her talk and her time at IMSI.

Find our transcript here: Google Doc or .txt file

Curious to learn more? Check out these additional links:

Maria Chan

IMSI Talk: Theory-informed AI/ML for Microscopy & Spectroscopy

Follow more of IMSI’s work: www.IMSI.institute, (twitter) @IMSI_institute, (mastodon) https://sciencemastodon.com/@IMSI, (instagram) IMSI.institute

Music by Blue Dot Sessions

The Institute for Mathematical and Statistical Innovation (IMSI) is funded by NSF grant DMS-1929348

Episode Transcript

SH: Hello, it is your host Sam Hansen, and I am excited to welcome you back to Carry the Two, the podcast about how math and statistics impact the world around us from the Institute for Mathematical and Statistical Innovation. While we’re in between our more in-depth seasons, we like to bring you something a little different in mini-season format. And for this mini season, we are going to highlight some of the amazing researchers who have presented at IMSI over the past year. Our second guest is

MC: My name is Maria Chan. I am a scientist at Argonne National Laboratory working at the Center for Nanoscale Materials. I do computational research involving materials in chemistry, and I use a combination of physics, artificial intelligence and machine learning, and computational simulations to try to understand materials at the level of atoms.

SH: Maria joined us at IMSI for a workshop on Machine Learning in Electronic Structure Theory, where she presented a talk titled Theory-informed Artificial Intelligence and Machine Learning for Microscopy & Spectroscopy. And now, without any further ado, let’s get into my conversation with Maria Chan.

SH: Would it be possible for you to let me know why it is really important, especially on the nanoscale, to be able to understand the structure of things?

MC: The structure of materials and chemicals determines their properties. Whether it is a solar cell or a battery or a catalyst, how they perform their functions completely depends on the chemistry and structure. A very common example that we teach schoolchildren is that when you have carbon atoms arranged in hexagonal sheets and the sheets slide against each other, you get graphite, which is also pencil lead. Whereas when you put the same carbon atoms in a way where they're tetrahedrally, or four-coordinated, with each other, you get diamond.
So the same atoms arranged differently actually make a huge difference in their properties and whether they can be useful for our applications.

SH: I mean, yeah, graphite and a diamond are definitely very different substances, even if they are made from the same atoms. So what are some of the ways that these structures have been determined throughout time, especially in the past?

MC: This is a really good question, because until about a hundred years ago, we had no idea. Until a hundred years ago, we didn't even know if atoms were really, really there. And this changed with x-ray spectroscopy, where scientists, after being able to generate x-rays, promptly shone them through crystals and then realized that we can have some idea of the crystal structures using x-rays. So x-ray is the one with the oldest history in terms of being able to determine atomic structures. Of course, recently, x-rays allowed us to determine the structure of the spike protein on the COVID coronavirus, which really helped us come up with a way to combat that disease. So x-ray crystallography has been in use to a great extent.

In addition, more recently, we have this tool called electron microscopy, where electron beams instead of x-ray beams are shone through materials. And using that, we can actually see the arrangement of atoms. That's a much more direct way to observe the atoms, and it's really fascinating if you realize that the images you see on screen are actually rows and columns of atoms. So those are two very major ways we can look at atoms. There are in addition many other types of techniques, using either little tiny tips of atoms, called atom probe type methods, or other somewhat more indirect methods. Because when atoms are bonded to each other in different ways, they give off different signals.
So there are things that are called vibrational spectroscopy, because when you bind atoms differently, they shake in different ways and give off different frequencies. So all of these are ways that you can measure how atoms are arranged, or at least make some determination of how atoms are arranged.

But in addition, it wouldn't be fair to my discipline if I didn't discuss the way that we actually predict how atoms are arranged in materials. Using computational methods solely from principles of quantum mechanics, scientists are now able to predict many different properties of atoms, including what arrangements of atoms are actually more likely to be observed in nature. So we do a lot of computational work trying different arrangements of atoms and predicting their energies, which tell us whether they are likely to exist or not. And we use many artificial intelligence and machine learning approaches to determine statistically which are likely arrangements of atoms.

One of the interesting aspects is that both the measurement side, so doing experiments with x-rays or electrons, and the computational side, purely predicting using quantum mechanics, are imperfect. Oftentimes the experiments don't give you all the information that you need, and the simulations have the advantage of coming from first principles, so you can sort of trust their reliability most of the time, but you don't actually know if they are really realized. So this is why it's one of my main research goals to really combine those two approaches, to try to use information both from first principles from quantum mechanics and also from experimental observations, to help these two methods that are each imperfect on their own complement each other so that we can have better information.

SH: Would it be possible for you to share some of the ways these techniques are imperfect?
MC: So for example, I talked about electron microscopy, where you shine a beam of electrons through the material. Because you are projecting the electron beam through the many layers of the material, you are only seeing an aggregate signal. So if there is some mixing of different types of atoms along the line, you may not be able to tell directly. Also, any kind of experiment will have noise if your material is not a perfect crystal, which really doesn't exist in nature. There will be noise and uncertainties with the structural model that corresponds to the signal. In some other cases the uncertainty really has to do with the fact that the signal is many to one. Many different structures can give you very similar or even almost identical vibrational spectroscopy signals or other signals. So in a way, mathematically, we will think of those measurements as being under-determined.

SH: You mentioned that your approaches tend to combine both the observational, experimental side and the computational. So could you share a little bit more about how you have combined these things and how it has helped better determine these structures?

MC: Yeah, absolutely. So fundamentally the problem that we are facing is that the space, and now since we're talking mathematically, the space of possible atom positions is too large. If you put one or two atoms in a box, they can be anywhere in the box, right? So essentially there are infinitely many possibilities for that, and we don't usually deal with one or two atoms, we deal with things anywhere from 30 to 2000 atoms or so. So the possibilities are just huge. So what we need is a way to sample the possibilities and discriminate which possibilities are more likely than others. So we use mathematical approaches, statistical approaches, to perform that sampling. So we try different things, but not randomly, using a biased approach. Once you determine what are some possible arrangements of atoms, we ask two questions.
First, is this arrangement of atoms consistent with what we observe in experiments? So we have to be able to determine, you know, that this arrangement of atoms will give you this x-ray signal or this electron microscopy signal or this vibrational spectroscopy signal. And then two, we have to ask the question of whether this arrangement of atoms is physically likely given the energy that it will have. That we evaluate using mostly a quantum mechanical approach or a proxy to the quantum mechanical approach, and that tells us, okay, this arrangement of atoms is really likely or not likely. Sometimes an arrangement of atoms is not likely because some atoms are too close together and they don't usually like to be that close. Sometimes they're not bonded correctly. You learn all this high school chemistry about bonding, right? There are certain bonding arrangements that satisfy the closed-shell requirements, for example, so that the energy is lower and, you know, in sort of common language you say it's happier.

So we look at whether such arrangements of atoms give rise to the observed signal, and then, or the other way around really, we look at whether arrangements of atoms have low energy and are therefore physically possible and likely. We look at the observations, and combining the two, we either accept or reject these suggestions of atom arrangements, and we go through a lot of them. So this is really the biggest challenge: we still have to go through a lot of them and go through these two steps. But the upside is that we get much more confidence in what we are actually observing, and the fortunate aspect of this is that, you know, all of this can be done in parallel on large-scale supercomputers, which we have access to.

SH: So in what ways have machine learning techniques allowed you to move forward with this work?

MC: Machine learning and artificial intelligence have been integral to the progress that we can make in this area. There are a lot of aspects to it.
For example, being able to predict the properties of interest, the measurements that we will measure from any atomic arrangement, that can be sped up significantly using machine learning. For example, we want to calculate the vibrational spectroscopy, basically how atoms shake if they're bonded to each other. That actually is a very involved, very expensive computation if we want to do it from scratch. But we train machine learning models to replace that calculation with something much faster.

Another aspect is the evaluation of the energy of the system. Even though, starting with quantum mechanics, we can calculate any arrangement of atoms and have a fairly good idea of the energy, that is also very expensive. So what we do is use machine learning models that act as a surrogate. They replace these quantum mechanical calculations and they make much faster predictions.

Another aspect is in the structure sampling. Given the infinite possibilities of placing this, you know, n number of atoms, can we actually learn what part of the arrangements are actually more likely than others? So we train models to make predictions on which configurations or atom arrangements are more likely than others so that we can speed up this search. You know, we're looking for essentially needles in a haystack, and the faster and the smarter we can screen through this haystack, the better we'll be able to get our results.

SH: And how well does machine learning help identify these needles in a haystack?

MC: I would say that all the automation and automated decision making, the parallel computing together with machine learning, has really accelerated how we can do this task of figuring out where the atoms are. For example, we had a case back in 19, 19 (laughs), 2019, not that far back. In 2019, it took us a couple of months of a postdoctoral researcher's effort to figure out this particular arrangement. It was this really interesting new structure that was found.
And now with our workflow, all the codes that we've developed and the methods that we've tested, we send this calculation to a supercomputer and the results come back in about two or three days, most of the time being spent on waiting for the calculation to start. The key time saving is really in the fact that humans are no longer relied upon to make the decisions in this whole process, so that not only can we have the results faster, we can have more of them. And also sometimes our biases don't have to play a role in it. I think a lot of times I look at the results that come out of this and I realize that there are certain things that I wouldn't choose, but that the data actually points us to: how different atoms are bonded, how many of this type of atom bind with this other type of atom. I may have certain biases that could be right most of the time, but it may not be an absolute rule. And if we allow the data to inform the results, that can sometimes give us, you know, less biased and more correct, so to speak, results.

(ad music)

SH: If you're enjoying the discussions we're having on this show, there's another University of Chicago Podcast Network show you should check out. It's called Big Brains. Big Brains brings you the stories behind the pivotal scientific breakthroughs and research that are reshaping our world. Change how you see the world and keep up with the latest academic thinking with Big Brains. Part of the University of Chicago Podcast Network.

(ad music ends)

SH: Since I asked some of the ways in which the other techniques were imperfect, I do feel like I should probably ask, in what ways can these machine learning techniques also introduce some issues?

MC: Oh, absolutely. There are a lot of ways we have to be very careful when using machine learning and artificial intelligence. So in terms of machine learning models, the training data really governs how your model performs.
So there are times that we were accidentally looking at results that don't really make sense, and then it has to do with the way that we have arranged our training data. Sometimes if the model is trained on a specific chemical space, some chemistry but not others, the application towards the other chemistry may not be very reliable. In general, I think this whole paradigm sort of requires us to be content with the fact that sometimes there isn't one answer. Sometimes there are a few possibilities or a multitude of possibilities that are more or less equally good. Some answers are better in one way and some answers are better in another way. We use the concept of Pareto optimization, which is a very important mathematical concept, and it's something that we actually do intuitively.

SH: Sorry to jump in here, but I just wanted to give a bit more information about Pareto optimization. Also known as multi-objective optimization, Pareto optimization is an area of mathematics that is focused on the ways in which you can optimize a system when there’s more than one thing that’s trying to be optimized. This can lead to all sorts of interesting problems where improving the result for one area leads to a much worse result in multiple other areas, and problems where there is no single solution. In other words, you can think of Pareto optimization as the mathematics of optimization compromise.

MC: And we will come up with solutions, and we know that some of the solutions conform better to the observation and some of the solutions conform better to the physical principles. And there will be a number of solutions that are actually more Pareto optimal and make a good compromise. And this is going to be an important aspect, because whenever you deal with statistical methods like machine learning, you will have statistical uncertainty.
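
As a rough illustration of the Pareto idea discussed above, here is a minimal Python sketch that picks out the non-dominated candidates when each candidate structure is scored on two objectives at once: how badly it disagrees with the measured signal, and how high its predicted energy is. The `Candidate` type, the `mismatch` and `energy` fields, and all of the numbers are hypothetical and invented for this example; it is not taken from Maria Chan's workflow, only a sketch of the general technique.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical candidate structure scored on two objectives."""
    name: str
    mismatch: float  # disagreement with the measured signal (lower is better)
    energy: float    # predicted energy of the arrangement (lower is better)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.mismatch <= b.mismatch and a.energy <= b.energy
    strictly_better = a.mismatch < b.mismatch or a.energy < b.energy
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up scores, chosen only to show the trade-off between the two objectives.
candidates = [
    Candidate("A", mismatch=0.10, energy=5.0),  # best fit to the data, high energy
    Candidate("B", mismatch=0.30, energy=2.0),  # lowest energy, poorer fit
    Candidate("C", mismatch=0.20, energy=3.0),  # a compromise between A and B
    Candidate("D", mismatch=0.35, energy=4.0),  # worse than B on both objectives
    Candidate("E", mismatch=0.10, energy=6.0),  # same fit as A but higher energy
]

front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy data set, A, B, and C survive because none of them is beaten on both objectives by another candidate, while D and E are dropped. No single survivor is "the" answer, which is the point made in the conversation: several solutions can be equally defensible compromises.
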

SH: One of the things we really like to talk about in Carry the Two is how this amazing research, be it fully theoretical, be it applied theoretical things, ends up in the real world. So I was wondering, is there any way that these techniques could help in, say, something like material design?

MC: Oh, absolutely. We are actively working on using these methods more for material understanding, which is part of material design. For example, recently we have realized that the reason that some battery materials are more likely to be unstable and catch on fire has to do with the stability of some atoms at the interfaces. But how do we know how the atoms behave at the interfaces? First, we have to know how the atoms are arranged at these interfaces. So we use our approach to determine the structure of atoms, how they arrange at an interface. And then we do computational experiments by looking at these atoms at the interface: are they more stable or less stable, more likely to come out? And it turns out the oxygen atoms are much more likely to come out at this interface. And when you have oxygen coming out in a battery material, typically they have an organic electrolyte. So oxygen and oil really just give you fire. So in this work, which was published in Nature Energy, we were able to come up with a mechanism, confirmed through experiment, of how these battery materials fail, and understanding how they fail is a step towards designing better ones. So in general, we are using this approach in many different such areas, both in terms of solar cells and in terms of batteries, which are our primary projects, but also in other areas such as microelectronics, quantum information science, and catalysis, to try to both understand and design materials.

SH: What do you think are some future directions this work might take?

MC: That's a great question. Some work that is very exciting is the integration of all of this with autonomous synthesis.
The idea is that there are a lot of possible materials and, you know, some of them do better things than others, but being able to go through the process of trying different materials and then measuring, first of all, how the atoms are arranged, or what material did you actually make, and then measuring their properties, like, you know, how well would they act as an ion conductor for batteries, for example. And then doing all of that together with computation, basically replacing an enterprise of scientific researchers with a combination of robots and computers. It sounds insane. It sounds too futuristic, but it really is, you know, to some extent, something that we will end up doing. You know, I think it's not going to be overnight. We're not all going to go home and have robot overlords take over our labs, but I think bits and pieces will start automating things, having computerized decisions, having robotic physical maneuvers, and all of this relies on accurate feedback and algorithms. So I see the work that we're doing being a part of this whole revolution towards autonomous laboratories, in terms of being able to provide feedback and really ascertain what it is that you are making, what it is that's happening when you're using a material, so that that informs your next step or new things to make.

SH: Thank you for sharing that. And I have one last question for you. And that relates to how I know about your work. I know about your work because you came to IMSI, to the Institute for Mathematical and Statistical Innovation, and gave a talk here at one of our workshops. And I was wondering, has that talk, and being a part of workshops at IMSI, had any, you know, positive impact for you and your work?

MC: Absolutely. I learned so much from the other researchers who presented their work.
Actually, in one of my other projects that I didn't talk about today, I took inspiration from one of the speakers, and we read up on his work, and my postdoc has been, you know, working on an area that's highly related. In addition, another participant pointed out that they're working on something that's related in terms of determining how atoms are arranged from these scanning microscopy data, and they shared their work, which was really useful, because I work a lot with practical physicists, material scientists, and chemists. But oftentimes I have these mathematical curiosities, such as, you know, is it always possible to find a solution? And how do I know if I found a unique solution, you know, in some of these cases where it's possible for the solution to be unique? Like, how do I prove that? And they seem to have done some work in this area that is very relevant. So I felt like it was very satisfying that I came across them through the IMSI workshop.

SH: Maria, thank you so much for joining me. It was a pleasure learning more about your work.

MC: Thank you so much, Sam.

(Outro music)

SH: As always, don’t forget to check out our show notes in the podcast description for more about Maria’s research and to watch her talk. And if you like the podcast, please make sure to follow, subscribe, or like the show in your podcast app of choice. It is the best way to make sure you see all of the episodes of Carry the Two. And for more on the math research being shared at IMSI, be sure to check us out online at our homepage: IMSI dot institute. We’re also on twitter at IMSI underscore institute, as well as instagram at IMSI dot institute! Do you have any mathematical or statistical questions? Maybe you have an idea for a story on how mathematics and statistics connect with the world around us. Send us an email with your idea! You can send these ideas, your feedback, and more to sam AT IMSI dot institute. That’s S A M at I M S I dot institute.
We also want to thank Blue Dot Sessions for our music. Lastly, Carry the Two is made possible by the Institute for Mathematical and Statistical Innovation, located on the gorgeous campus of the University of Chicago. We are supported by the National Science Foundation and the University of Chicago.
