Skip to main content
Open Access Publications from the University of California

Materials Synthesis Insights from Scientific Literature via Text Extraction and Machine Learning

  • Author(s): Kim, Edward
  • Huang, Kevin
  • Saunders, Adam
  • McCallum, Andrew
  • Ceder, Gerbrand
  • Olivetti, Elsa
  • et al.

© 2017 American Chemical Society. In the past several years, Materials Genome Initiative (MGI) efforts have produced myriad examples of computationally designed materials in the fields of energy storage, catalysis, thermoelectrics, and hydrogen storage as well as large data resources that are used to screen for potentially transformative compounds. The bottleneck in high-Throughput materials design has thus shifted to materials synthesis, which motivates our development of a methodology to automatically compile materials synthesis parameters across tens of thousands of scholarly publications using natural language processing techniques. To demonstrate our framework's capabilities, we examine the synthesis conditions for various metal oxides across more than 12 thousand manuscripts. We then apply machine learning methods to predict the critical parameters needed to synthesize titania nanotubes via hydrothermal methods and verify this result against known mechanisms. Finally, we demonstrate the capacity for transfer learning by using machine learning models to predict synthesis outcomes on materials systems not included in the training set and thereby outperform heuristic strategies.

Many UC-authored scholarly publications are freely available on this site because of the UC's open access policies. Let us know how this access is important for you.

Main Content
Current View