Over the past decades, theoretical calculations of materials properties have become more accurate and accessible owing to the successful development of ab initio codes and advances in computational power. With the rapid growth of high-throughput computational materials repositories, opportunities have emerged for data-driven discovery of new materials guided by machine learning. However, large materials data sets must be interpreted from a perspective that integrates statistics with materials science intuition. In this thesis, we address this challenge by demonstrating how the integration of high-throughput software workflows, automated data generation, and machine learning can yield powerful new approaches to materials analysis and optimization. This thesis is broadly divided into two topics.
In the first topic (Chapters 2 and 3), we present comprehensive first-principles investigations of the effect of transition metal mixing on layered P2 oxides, using P2 NaxCo0.2Mn0.2Ti0.2Ni0.2Ru0.2O2 as a model system. Our results show that transition metal mixing significantly suppresses the formation of strongly ordered intermediates. Using ab initio molecular dynamics simulations and the climbing image nudged elastic band method, we reveal that transition metal substitution has a pronounced effect on the Na site occupancy energy and Na diffusion energy barriers. By employing a site percolation model, we derive theoretical upper and lower bounds on the concentration of transition metal species in the layered P2 oxides based on their effects on Na diffusion energy barriers. Another key innovation is the application of the MatErials Graph Network (MEGNet) model, a graph-based deep learning approach recently developed in our group, to layered P2 oxides for accurate energy prediction, which we use to study mixing energies in “high-entropy” P2 NaxCo0.2Mn0.2Ti0.2Ni0.2Ru0.2O2.
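To illustrate the general idea behind a site percolation analysis, the following minimal sketch estimates how often randomly "open" sites form a connected path across a two-dimensional lattice. It is not the model used in this thesis: the square lattice, the nearest-neighbour connectivity, the lattice size, and the function names `spans` and `spanning_probability` are all assumptions made for illustration, and an "open" site stands in hypothetically for a Na site whose local transition metal environment permits low-barrier diffusion.

```python
from collections import deque

import numpy as np


def spans(open_sites):
    """Return True if open sites connect the top row to the bottom row.

    Connectivity is nearest-neighbour on a square lattice (an assumption;
    the Na sublattice of a real P2 oxide has a different topology).
    """
    n_rows, n_cols = open_sites.shape
    visited = np.zeros((n_rows, n_cols), dtype=bool)
    # Seed the breadth-first search with every open site in the top row.
    queue = deque((0, c) for c in range(n_cols) if open_sites[0, c])
    for r, c in queue:
        visited[r, c] = True
    while queue:
        r, c = queue.popleft()
        if r == n_rows - 1:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < n_rows and 0 <= cc < n_cols
            if inside and open_sites[rr, cc] and not visited[rr, cc]:
                visited[rr, cc] = True
                queue.append((rr, cc))
    return False


def spanning_probability(p_open, size=40, trials=200, seed=0):
    """Monte Carlo estimate of the probability that a spanning path exists
    when each site is open independently with probability p_open."""
    rng = np.random.default_rng(seed)
    hits = sum(spans(rng.random((size, size)) < p_open) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for p in (0.4, 0.5, 0.6, 0.7, 0.8):
        print(f"p_open = {p:.1f}: spanning probability ~ {spanning_probability(p):.2f}")
```

Scanning `p_open` in this way locates the fraction of open sites above which a lattice-spanning path becomes likely, which is the kind of threshold from which concentration bounds can in principle be derived.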
In the second topic (Chapters 4, 5, and 6), we present the development of a first-of-its-kind computational reference X-ray absorption spectroscopy (XAS) database (XASdb). More importantly, we have developed a novel Ensemble-Learned Spectra IdEntification (ELSIE) algorithm that leverages ensemble learning techniques to match an unknown target K-edge X-ray absorption near-edge structure (XANES) spectrum with computed spectra in XASdb. We will also discuss the development of general machine learning approaches to rapidly and efficiently identify the coordination environment of absorbing atoms from K-edge XANES spectra.
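The notion of matching a target spectrum against a library of reference spectra can be sketched as follows. This is not the ELSIE algorithm: it is a simplified illustration that averages two common similarity measures (cosine similarity and Pearson correlation) and ranks references by the combined score. The synthetic spectra, the reference labels `ref_a` to `ref_c`, and the helper names are hypothetical, and all spectra are assumed to be sampled on a common energy grid.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two spectra treated as vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def pearson_similarity(a, b):
    """Pearson correlation coefficient between two spectra."""
    return float(np.corrcoef(a, b)[0, 1])


def rank_references(target, references):
    """Rank reference labels by the mean of the two similarity scores,
    most similar first."""
    scores = {
        name: 0.5 * (cosine_similarity(target, spectrum)
                     + pearson_similarity(target, spectrum))
        for name, spectrum in references.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Synthetic, XANES-like toy spectra: a smooth absorption edge plus one peak.
energy = np.linspace(0.0, 50.0, 200)


def toy_spectrum(edge, peak):
    step = 1.0 / (1.0 + np.exp(-(energy - edge)))
    white_line = 0.8 * np.exp(-0.5 * ((energy - peak) / 1.5) ** 2)
    return step + white_line


references = {
    "ref_a": toy_spectrum(10.0, 14.0),
    "ref_b": toy_spectrum(20.0, 24.0),
    "ref_c": toy_spectrum(30.0, 34.0),
}

# The "unknown" target is ref_b with a small amount of Gaussian noise added.
rng = np.random.default_rng(0)
target = toy_spectrum(20.0, 24.0) + rng.normal(0.0, 0.02, energy.size)

ranking = rank_references(target, references)
print(ranking)
```

Combining several similarity measures in this manner is one simple way to reduce sensitivity to the weaknesses of any single metric; a practical workflow would additionally need to handle energy alignment and normalization before comparison.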