ACED: Accelerated Computational Electrochemical systems Discovery
Rachel C. Kurchin, Eric Muckley, Lance Kavalsky, Vinay Hegde, Dhairya Gandhi, Xiaoyu Sun, Matthew Johnson, Alan Edelman, James Saal, Christopher Vincent Rackauckas, Bryce Meredig, Viral Shah, Venkatasubramanian Viswanathan
Rachel C. Kurchin, Lance Kavalsky, Xiaoyu Sun, Venkatasubramanian Viswanathan
Carnegie Mellon University, Pittsburgh, PA 15213, USA
{rkurchin, lkavalsk, seansun, venkatv}@andrew.cmu.edu

Eric Muckley, Vinay Hegde, James Saal & Bryce Meredig
Citrine Informatics, Redwood City, CA 94063, USA
{emuckley, vhegde, jsaal, bryce}@citrine.io

Dhairya Gandhi & Viral Shah
Julia Computing, Cambridge, MA 02139, USA
{dhairya, viral}@juliacomputing.com

Matthew Johnson, Alan Edelman, & Christopher Vincent Rackauckas
Massachusetts Institute of Technology, Cambridge, MA 02139, USA
{mattsj, edelman, crackauc}@mit.edu

ABSTRACT
Large-scale electrification is vital to addressing the climate crisis, but many engineering challenges remain in fully electrifying both the chemical industry and transportation. In both of these areas, new electrochemical materials and systems will be critical, but developing these systems currently relies heavily on computationally expensive first-principles simulations as well as human-time-intensive experimental trial and error. We propose to develop an automated workflow that accelerates these computational steps by introducing both automated error handling in generating the first-principles training data and physics-informed machine learning surrogates to further reduce computational cost. It will also have the capacity to include automated experiments “in the loop” in order to dramatically accelerate the overall materials discovery pipeline.
1 ELECTROCHEMISTRY AND CLIMATE CHANGE
Electrification of virtually every energy-consuming sector is critical in the fight against climate change, as it will enable society to rely on carbon-free energy sources such as solar and wind. Increased performance and reduced cost of electrochemical technologies will be key to this electrification process. Many are already familiar with phenomena such as “range anxiety”, as well as sticker shock for electric vehicles. Perhaps less familiar are the large swaths of the chemical industry that rely on extreme conditions (heat and pressure) produced by burning fossil fuels for large-scale synthesis of chemicals that are essential for fertilizers, steel, cement and other aspects of modern life that many of us take for granted.

To meet the technoeconomic targets posed by these challenges, novel materials and systems will need to be designed. In this project, we propose to develop a generalizable and automated workflow for discovering and developing these materials and systems. It will have the capability to build the necessary first-principles models and use their results to train machine learning surrogates that can be evaluated orders of magnitude more quickly. The results of these models can be used in a series of coarser models to bridge from the atomic to the device scales and evaluate performance potential in a real system, including the possibility for an “in-the-loop” automated experimental evaluation. Using sequential learning, we can then select the next candidate from a specified design space and proceed through the workflow as many times as necessary to meet target performance specifications.

2 PROPOSED WORKFLOW
2.1 OVERVIEW
Figure 1: ML-aided materials discovery workflow. See Section 2.1 for a detailed description.

The workflow we envision is summarized at a high level in Figure 1. The first step (upper left) is to generate relevant atomic structures for the candidate molecules/materials. Initially, these will be fed into a first-principles simulation engine such as density functional theory (DFT) or a quantum chemistry code. These calculations result in predicted materials properties that serve as parameters in a larger-scale model (such as one describing chemical kinetics of catalysis or the operation of a battery), from which device performance can be predicted. If the workflow is purely computational, this result will determine if the search has succeeded or whether a new candidate needs to be identified. If an automated experiment is “in the loop,” then this serves as a decision point for whether an experiment should be done. Assuming performance criteria have not yet been met, a new candidate material/molecule is selected using a sequential learning algorithm, and structures are generated to proceed through the loop again.

Once enough first-principles simulations have been run, the resulting data can be used to train an ML surrogate model to speed up candidate evaluation. A separate surrogate can be trained to accelerate the device modeling step.

In addition to building these ML surrogates, we are also placing emphasis on automating any step in this workflow that currently requires human intervention, with the ultimate goal of the entire materials discovery loop being autonomous. For example, running forward models (orange boxes in Figure 1) often requires, in addition to substantial computational resources, multiple steps of human decision-making, either to determine which specific calculations to do, or to resolve convergence errors arising from parameter choices.
A significant portion of this project is developing an automated DFT workflow that can proceed entirely without this type of human input and hence eliminate associated delays.

The machine learning models for this work (with the exception of the design space search described below, for which a large body of code in Python already existed) take advantage of the Julia programming language’s (Bezanson et al., 2017) unique combination of ease of use and best-in-class performance, as well as the existence of a robust language-wide automatic differentiation system (Innes, 2019). In particular, we use the Flux machine learning library (Innes, 2018) and the DifferentialEquations.jl package (Rackauckas & Nie, 2017) extensively in this work.

While the strict data dependencies of the workflow would traditionally require all steps of the simulation process to be done sequentially, we are breaking the flow by training ML surrogates of the microkinetic models (MKM) during the first-principles simulations. This allows us to amortize the training time during the previous step of the process and thus receive the benefits of the ML augmentation while masking its cost. We plan to develop a surrogate of the whole MKM analysis process, i.e. a surrogate from the materials properties directly to the device performance prediction, so that the moment the DFT calculations are completed the neural network will bypass the stiff, ill-conditioned kinetic simulation and directly predict the outcomes.

Search of the design space and candidate selection will be carried out using Citrine’s Citrination cloud-based machine learning engine. Communication with Citrination will primarily be executed using the Citrination Python API (Citrine). Modeling of the design space will be performed using lolo (Citrine Informatics), Citrine’s custom random forest regression algorithm which incorporates uncertainty estimates.
Material candidates will be chosen based on a strategy that balances greedy and exploratory selection, drawing on maximum likelihood of improvement (MLI), maximum expected improvement (MEI), and maximum uncertainty (MU) (Ling et al., 2017). This process will be repeated for each material candidate of interest to enable iterative sequential learning of material properties across the design space.
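
As an illustration of how MLI, MEI, and MU differ, the following Python sketch ranks a handful of candidates from a predicted mean and uncertainty. It is not the Citrination or lolo implementation. It assumes a Gaussian predictive distribution, a maximization objective, and the following readings of the three strategies: MEI picks the highest predicted value, MLI picks the candidate most likely to beat the best value observed so far, and MU picks the most uncertain candidate. The `Prediction` type, the function names, and all numbers are hypothetical.

```python
import math
from typing import Callable, Dict, List, NamedTuple


class Prediction(NamedTuple):
    """Surrogate output for one candidate (hypothetical container)."""
    name: str
    mean: float  # predicted property value
    std: float   # predictive uncertainty (standard deviation)


def mei_score(p: Prediction, best: float) -> float:
    # Assumed reading of MEI: rank purely by predicted value (greedy).
    return p.mean


def mli_score(p: Prediction, best: float) -> float:
    # Assumed reading of MLI: probability that the candidate exceeds
    # the best value seen so far, under a Gaussian predictive model.
    if p.std <= 0.0:
        return 1.0 if p.mean > best else 0.0
    z = (p.mean - best) / p.std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mu_score(p: Prediction, best: float) -> float:
    # Assumed reading of MU: rank purely by uncertainty (exploratory).
    return p.std


STRATEGIES: Dict[str, Callable[[Prediction, float], float]] = {
    "MEI": mei_score,
    "MLI": mli_score,
    "MU": mu_score,
}


def select_candidate(predictions: List[Prediction], best: float,
                     strategy: str) -> Prediction:
    """Return the candidate with the highest score under `strategy`."""
    score = STRATEGIES[strategy]
    return max(predictions, key=lambda p: score(p, best))


# Hypothetical surrogate predictions; values are illustrative only.
candidates = [
    Prediction("A", mean=0.90, std=0.05),
    Prediction("B", mean=0.80, std=0.40),
    Prediction("C", mean=0.99, std=0.01),
    Prediction("D", mean=0.50, std=0.60),
]
best_so_far = 1.00

for name in STRATEGIES:
    chosen = select_candidate(candidates, best_so_far, name)
    print(name, "selects", chosen.name)
```

With these illustrative numbers the three strategies disagree: the greedy choice favors the candidate predicted closest to the incumbent, the uncertainty-driven choice favors the least-explored candidate, and MLI falls between the two by weighing the predicted value against its uncertainty.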
3 CASE STUDIES
3.1 PHASE I: ELECTROCHEMICAL NITROGEN REDUCTION
The nitrogen reduction reaction (NRR) is central to global food supply as it produces ammonia, a critical component in fertilizers. Roughly 80% of the nitrogen in an average human body today has been produced through the Haber-Bosch process (Howarth, 2008), the state-of-the-art industrial method for NRR. However, this process takes place at extreme pressure and temperature, conditions that require burning fossil fuels to achieve.

A promising alternative approach is electrochemical nitrogen reduction, where the activation energy currently provided by heat and pressure is instead supplied by electric voltage. This approach currently faces obstacles due to low activity and selectivity of catalysts. In this work, we will investigate two promising catalyst design spaces with potential to surmount these obstacles: single-atom catalysts and multi-principal-element alloys.

To substitute for DFT calculations of the binding energies of NRR intermediates on our candidate catalysts, we make use of atomic graph convolutional neural nets. This concept was originally popularized by Xie & Grossman (2018) for bulk crystals, and we are developing the AtomicGraphNets.jl package (currently available on GitHub, eventually in the Julia Package Registry) as a flexible implementation for crystals and molecules. In support of this, we are also developing ChemistryFeaturization.jl to provide a unified interface for building graphs from a variety of input structure files, and assigning feature matrices using data from several online databases.

3.2 PHASE II: NOVEL LI BATTERY ELECTROLYTES
Achieving higher specific energy (i.e. energy per unit mass) and power in batteries, as well as higher energy and power density (per volume), is critical to further expanding electric transportation (Sripad & Viswanathan, 2017) as well as to eventually electrifying flight (Fredericks et al., 2018). A promising way to achieve these targets is by shifting to metallic lithium as the anode. Removing the typical graphite anode reduces both weight and volume and increases voltage and power capabilities, but introduces challenges due to the tendency of lithium to form dendrites during charging. These dendrites can reach across the separator and short the device, reducing cycle life and, in the worst cases, causing dangerous fires.

Critical to ameliorating these issues is the development of novel battery electrolytes that can block dendrites from growing. Typically, screening of candidate electrolyte molecules makes use of computationally intensive quantum chemistry simulations. However, there are several promising ML approaches to building fast and accurate surrogates for such calculations. We are in the process of building a Julia-language port of the popular DeepChem machine learning package (Ramsundar et al.). These property prediction methods will be used in conjunction with pseudo 2-dimensional porous electrode models for predicting the performance of lithium metal batteries.
4 CONCLUSION
Rapid materials discovery is critical in a variety of climate change challenges, including and especially electrifying the chemical and transportation industries. Given the urgency of the climate challenge, we no longer have the luxury of time to go about materials discovery in the “traditional” paradigm. Machine learning approaches along with automation of both simulation and experimentation have great potential to dramatically accelerate the cycle of learning and help us to discover and develop the new materials and systems that will be essential in engineering a green future.

ACKNOWLEDGMENTS
The information, data, or work presented herein was funded in part by the Advanced Research Projects Agency-Energy (ARPA-E), U.S. Department of Energy, under Award Number DE-AR0001211. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

REFERENCES
Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B Shah. Julia: A fresh approach to numerical computing. SIAM Review, 59(1):65–98, 2017. URL https://doi.org/10.1137/141000671.

Citrine. Citrination Python API. https://github.com/CitrineInformatics/python-citrination-client.

Citrine Informatics. lolo machine learning library. https://github.com/CitrineInformatics/lolo.

William L Fredericks, Shashank Sripad, Geoffrey C Bower, and Venkatasubramanian Viswanathan. Performance metrics required of next-generation batteries to electrify vertical takeoff and landing (VTOL) aircraft. ACS Energy Letters, 3(12):2989–2994, 2018.

Robert W Howarth. Coastal nitrogen pollution: a review of sources and trends globally and regionally. Harmful Algae, 8(1):14–20, 2008.

Michael Innes. Don’t unroll adjoint: differentiating SSA-form programs. arXiv:1810.07951 [cs], March 2019. URL http://arxiv.org/abs/1810.07951.

Mike Innes. Flux: Elegant machine learning with Julia. Journal of Open Source Software, 3(25):602, 2018. ISSN 2475-9066. doi: 10.21105/joss.00602.

Julia Ling, Maxwell Hutchinson, Erin Antono, Sean Paradiso, and Bryce Meredig. High-dimensional materials and process optimization using data-driven experimental design with well-calibrated uncertainty estimates. Integrating Materials and Manufacturing Innovation, 6(3):207–217, 2017.

Christopher Rackauckas and Qing Nie. DifferentialEquations.jl – a performant and feature-rich ecosystem for solving differential equations in Julia. Journal of Open Research Software, 5(1), 2017.

B Ramsundar, P Eastman, E Feinberg, J Gomes, K Leswing, A Pappu, M Wu, and V Pande. Deepchem: Democratizing deep-learning for drug discovery, quantum chemistry, materials science and biology.

Shashank Sripad and Venkatasubramanian Viswanathan. Performance metrics required of next-generation batteries to make a practical electric semi truck. ACS Energy Letters, 2(7):1669–1673, 2017.

Tian Xie and Jeffrey C. Grossman. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties.