Combining text mining, in situ characterization, and ab initio calculations to rationalize BiFeO3 crystallization pathways

The combination of three highly complementary scientific domains is demonstrated to rationalize bismuth ferrite (BiFeO3) [BFO] crystallization pathways: text mining to extract processing recipes from existing literature, in situ X-ray scattering


Introduction
The advancement of materials research requires a renewed focus on accelerating materials development and discovery, moving away from a reliance on empirical studies. 1 To achieve this goal, a paradigm shift is needed in which computational analysis will play an important role. There is now the opportunity to contribute to the early development and implementation of new integrated experimental, computational, and data informatics tools. 1,2 One complex problem of high interest requiring the integration of complementary scientific domains involves text mining to design thin film fabrication experiments, followed by in situ characterization of that fabrication, and the rational explanation of observed crystallization pathways.
The material system chosen here to develop and validate this approach is bismuth ferrite (BiFeO3, BFO). BFO was chosen based on a cursory search over our literature database for oxide papers discussing impurity phases appearing in fully processed thin films: it was the most data-rich system, returning the highest number of papers (followed by SrTiO3 and LiFePO4), and it can be processed from solutions. BFO has gained attention as a room-temperature multiferroic, 3,4 is environmentally friendly (as compared to lead-based Pb(Zr,Ti)O3 material), and is considered one of the most promising materials in next-generation non-toxic ferroelectric memory and spintronic devices. 5 BFO exhibits a rhombohedrally distorted perovskite crystal structure with the space group R3c at room temperature. 6,7 Despite its promise, the formation of impurity phases in synthesizing BFO significantly jeopardizes its ferroelectric properties due to the leakage currents in the film. This detriment makes it essential to synthesize phase-pure polycrystalline BFO films. Bismuth-rich (Bi25FeO40) or iron-rich (Bi2Fe4O9) phases are the most commonly observed impurities as their formation is thermodynamically competitive with BFO formation when starting from typical precursors. In order to control phase purity via targeted processing and precursor choice, an understanding of the crystallization pathway is necessary, but studies on the crystallization pathway of BFO thin films are rare. 8 It is noted that impurity phases are not the only metric to define film quality but were chosen here as a starting point for methodology development. 9
Thin films are the most common material form for device applications in the semiconductor industry. 10,11 Among various techniques to fabricate thin films (e.g., RF-sputtering, molecular beam epitaxy, pulsed laser deposition, magnetron sputtering, chemical solution deposition, chemical vapor deposition, and sol-gel processing), 11-17 sol-gel processing is often prioritized due to lower processing temperatures and more precise compositional control, allowing the fabrication of cost-effective, large-area, and high-purity thin films. 10,11 In sol-gel processing, crystallization pathways, final film microstructure, and properties can be tuned by solution engineering (e.g., using pure or mixed solvents, adding chelating agents) 18-20 and experimental conditions (e.g., temperature, time, number of iterations), but the causal relationship has been observed only partially due to limitations of conventional approaches.
To establish the link between sol-gel processing parameters and BFO crystallization pathways, we developed and validated an integrated approach combining text mining, in situ X-ray scattering, and ab initio calculations (Figure 1). First, we used text mining to search existing research publications and create structured datasets to design efficient experiments based on available scientific knowledge. 9 Next, we conducted in situ X-ray diffraction (XRD) experiments to follow the crystallization pathway, observe the formation of impurity phases, and compare these against reported impurity phases extracted from literature. Last, we used density functional theory calculations to rationalize observed crystallization pathways. In summary, text mining indicated the prevalence of experimental decisions for various processing parameters as well as the correlation between processing temperatures and impurity phase formation. In situ XRD revealed the metastable formation of the intermediate bismutite phase (Bi2O2CO3). Ab initio calculations found that the presence of this Bi2O2CO3 phase increases the thermodynamic driving force to form BFO over Bi2Fe4O9 over a large temperature window. We use the information provided by the three techniques to design efficient experiments, refine our understanding of the BFO crystallization pathway, and provide rational guidelines for precise control of BFO crystallization.

2.1 Text mining of BFO fabrication
Text mining was employed in this study to provide a comprehensive picture of the existing variable space explored for sol-gel processed BFO thin films. Although key information from a limited number of publications can be extracted by humans, the dataset was manually parsed and annotated to pursue two goals: first, to create a 'gold' standard to train machine learning models that predict phase purity from processing conditions, and second, to guide the planning of experimentalists (K. Cruse et al., manuscript in preparation). For the latter, data analysis revealed the main thin film fabrication steps, common processing values, and underexplored processing variables, and informed the design of new experiments.
Thus, text-mining provided the opportunity to process a notably bigger scope of data than is commonly treated by a human researcher, increasing the confidence about derived conclusions and allowing us to study all aspects of BFO sol-gel fabrication in detail.
A typical sol-gel fabrication of BFO thin films is schematically depicted in Figure S1 and involves the liquid precursor preparation, followed by spin-coating, baking, and annealing. Figure 2 provides a visual summary of the different aspects of the BFO processing space as reflected in this dataset: the typically reported impurity phases and their frequency; the solvents, chelating agents, and mixtures thereof; and the temperatures used during the different processing stages. While various metal precursors have been used for BFO fabrication (Figure 2a), text mining shows that iron nitrate nonahydrate [Fe(NO3)3•9H2O] combined with bismuth nitrate pentahydrate [Bi(NO3)3•5H2O] comprises the overwhelming majority of recipes (318 in total, 95% of the dataset; this extreme value is excluded from the heat colors in Figure 2a to emphasize differences among the less frequent combinations). Although Bi and Fe are present in a 1:1 ratio in the BiFeO3 crystal structure, different Bi:Fe ratios in the precursor solution were reported in the literature (Figure S2a). A Bi:Fe ratio of 1.05:1 was the most frequently used, mainly as an attempt to compensate for Bi volatility during annealing. A wide range of solvents (Figure S2b) has been employed to synthesize BFO thin films using the sol-gel method, including 2-methoxyethanol (2ME), ethylene glycol (EG), 2-ethoxyethanol, ethanol, dimethylformamide, and propionic acid. Text mining reveals that 2ME and EG, with 248 and 72 cases, respectively, and their mixtures (19 cases) are most frequently used to fabricate BFO films. Chelating agents have been adopted to increase the viscosity and drying time of the gel in order to obtain high-quality and uniform films. Several chelating agents, including acetic acid, citric acid, nitric acid, acetylacetone, and ammonium hydroxide, were used, with acetic and citric acids combined with other solvents (114 and 60 cases, respectively) having the largest share (Venn diagrams in Figures 2b-c and S2c). 2ME is most frequently combined with acetic acid (84 cases) or citric acid (34 cases). Dehydrating agents and surfactants, such as acetic anhydride and ethanolamine, are also common reagents in this processing space, though they are not included in this analysis. There are only a few papers that study the effects of varying the combination of precursor solution agents across experiments. 21,22 No impurity phases were observed in the citric route (i.e., citrate-based solutions using Bi2O3, nitric acid, and iron(III) citrate precursors). 21 The study by Liang et al. reported no impurity phases for both nitric acid and mixed 2ME and glacial acetic acid solutions. 22 Overall, little is known about the effects of solvent engineering, alone or combined with chelating agents, on the BFO crystallization pathway. Text mining showed that roughly a quarter of the reported syntheses result in phase impurities, as depicted in Figure 2d together with the most common impurity phases and their formulae. Bi2Fe4O9, the binary oxides Bi2O3 and Fe2O3, and bismuth-rich impurities make up a large portion of the possible impurity phases formed. Figure S2d shows the prevalence of annealing environments, with air being the most common. An oxygen atmosphere is also used on occasion to mitigate the development of oxygen vacancies during processing.
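The solvent and chelating-agent counts reported above amount to tallying combinations across recipe records. The following is a minimal sketch of such a tally, assuming each recipe is stored as a record with hypothetical `solvents` and `chelating_agents` fields; the field names and the three example records are illustrative only and are not taken from the text-mined dataset.

```python
from collections import Counter

# Hypothetical recipe records; field names and values are illustrative only.
recipes = [
    {"solvents": ["2ME"], "chelating_agents": ["acetic acid"]},
    {"solvents": ["2ME"], "chelating_agents": ["citric acid"]},
    {"solvents": ["EG", "2ME"], "chelating_agents": []},
]


def count_combinations(records):
    """Count each (solvent set, chelating-agent set) combination."""
    counts = Counter()
    for rec in records:
        key = (frozenset(rec["solvents"]), frozenset(rec["chelating_agents"]))
        counts[key] += 1
    return counts


def count_with(records, solvent, agent):
    """Number of recipes that use `solvent` together with `agent`."""
    return sum(
        1
        for rec in records
        if solvent in rec["solvents"] and agent in rec["chelating_agents"]
    )


combo_counts = count_combinations(recipes)
n_2me_acetic = count_with(recipes, "2ME", "acetic acid")
```

Using frozensets as keys treats "EG + 2ME" and "2ME + EG" as the same mixture, which is one possible way to obtain the overlap counts shown in a Venn diagram.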
According to our text mining analysis, the annealing temperature is another important processing variable. The following describes the purpose of various heat treatment steps throughout a typical BFO fabrication (see schematic Figure S1). After spin-coating, individual wet films are dried at ~100-250 °C to evaporate the solvents and induce gelation in the as-spin-coated films. Then, the films are pre-baked at ~200-400 °C to induce the thermal decomposition of organic materials.
Up to this stage, the films are mostly amorphous. Lastly, the films are annealed at relatively higher temperatures of ~400-800 °C to induce nucleation, growth, and conversion of the amorphous to a polycrystalline film. These steps are typically administered after each of several layers is spin-coated ("layer-by-layer") until the desired film thickness is reached, after which a final annealing step is often employed. Using text mining, we studied the distribution of layer-by-layer and final annealing temperatures and the relationship between temperature and impurity phases, which is summarized in Figure 2e. The typical crystallization window for sol-gel-derived BFO thin films is found to be between ~500 °C and ~600 °C. The probability of impurity phase formation is higher at annealing temperatures above the average annealing temperature. For instance, processing procedures with annealing at the most common temperature (~550 °C) result in impurities in 6% and 14% of cases for layer-by-layer and final annealing, respectively. With the increase of annealing temperatures to 600 °C, the proportion of syntheses that lead to impurities changes to 4% and 36% for layer-by-layer and final annealing, respectively. At 700 °C, this proportion increases to 100% for layer-by-layer (our dataset has records of only phase-impure films when annealing at this temperature) and 80% for final annealing.
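The temperature-dependent impurity proportions quoted above can be thought of as a windowed fraction. The sketch below assumes a ±20 °C window around a chosen temperature (following the Figure 2e caption) and interprets the percentage as impure records divided by all records in the window; both the interpretation and the example records are assumptions for illustration and do not reproduce the actual dataset.

```python
def impurity_fraction(records, center_c, half_window_c=20.0):
    """Fraction of phase-impure records annealed within center_c +/- half_window_c.

    `records` is an iterable of (annealing_temperature_in_C, is_impure) pairs.
    Returns None when no record falls inside the window.
    """
    in_window = [
        impure for temp, impure in records if abs(temp - center_c) <= half_window_c
    ]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)


# Illustrative placeholder records, not values from the text-mined dataset.
example_records = [(550, False), (550, False), (560, True), (600, False), (700, True)]

frac_550 = impurity_fraction(example_records, 550)  # 1 impure out of 3 in window
frac_700 = impurity_fraction(example_records, 700)  # 1 impure out of 1 in window
```

Returning `None` for an empty window keeps "no data" distinct from "no impurities", which matters for sparsely sampled temperatures.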

2.2 In situ X-ray diffraction during BFO annealing
With the input from text mining to design efficient experiments for the BFO film fabrication, we investigated the role of solvents, chelating agents, and metal ratios on the crystallization pathway via in situ XRD during thermal annealing. The fabrication of BFO films is described in detail in the experimental section. In brief, Bi(NO3)3•5H2O and Fe(NO3)3•9H2O precursor salts were dissolved in 2ME, EG, or mixtures thereof. Some recipes include chelating agents (citric acid or acetic acid).
The final film is built from 5 cycles of spin coating and pre-baking at 300 °C. The growth of nuclei in BFO films was reported to occur via oriented attachment, meaning the coalescence of similarly oriented crystals. 8 Our in situ results show that the bismutite phase appearance correlates with the onset of BFO crystallization (Figure S3). Several studies discussed the occurrence and role of intermediate carbonate phases in perovskite-type oxides. 23,24 For example, it was previously suggested that in BFO processing with acetic acid, bismuth acetate, iron acetylacetonate, and water, Bi2O2CO3 crystallizes first in the amorphous film at the preexisting metal-organic clusters before BFO formation. 8 The combination of acetic acid with EG or 2ME, or citric acid with EG, in our study with bismuth and iron nitrates, however, did not result in the bismutite phase or any other intermediate phases (Figures S3, S4). Thus, we speculate that the carboxylic group in citric acid contributes to the formation of Bi2O2CO3 in the presence of 2ME.
We found that the onset of BFO crystallization is affected by the choice of solvents, chelating agents, and the metal ratio of the precursor solution. The overall trend for all conditions is that a lower onset crystallization temperature occurs with a higher Bi content in the precursor solution. For instance, a stoichiometry change from Bi-deficient to Bi-excess lowers the onset of BFO crystallization by 25-50 °C. For films prepared using pure 2ME, the onset of BFO crystallization is reduced from 550 °C for Bi-deficient samples (Bi:Fe = 0.9) to 525 °C for both stoichiometric (Bi:Fe = 1.0) and Bi-excess (Bi:Fe = 1.1) samples (Figure 4). Films prepared from pure EG show a lower BFO crystallization onset than films prepared from pure 2ME. For instance, for films prepared from pure EG, the onset of crystallization is reduced from 475 °C for Bi-deficient samples (Bi:Fe = 0.9) to 450 °C for both stoichiometric (Bi:Fe = 1.0) and Bi-excess (Bi:Fe = 1.1) samples (Figure 4 and Table S1). Higher Bi content seems to facilitate BFO growth, 8 possibly related to the nature of the pre-existing metal-organic clusters and their transformation to the rhombohedrally distorted perovskite structure. XRD demonstrates the importance of our investigative approach that integrates detailed, time-resolved phase identification for the fabrication of our system along with historical collection of processing data.

2.3 Thermodynamics of BFO vs impurity phases
As the last scientific domain deployed here, density functional theory (DFT) calculations with the meta-GGA SCAN density functional were used to assess the thermodynamic stability of BFO with respect to decomposition into binary and ternary competing phases. 25 A recently introduced model for finite temperature thermodynamics was used to map DFT-calculated formation energies at 0 K to temperature-dependent Gibbs formation energies, ΔG_f(T). 26 In Figure 5b, we show that only BFO and not Bi2Fe4O9 is thermodynamically stable (on the convex hull) 28 along the Bi2O3-Fe2O3 tie line at low temperature (27 °C, 300 K), but increasing the temperature to 627 °C (900 K; Figure 5c) preferentially stabilizes Bi2Fe4O9 compared to BFO.
While at 300 K, BFO is the only stable ternary phase along this tie line, at 900 K, BFO and Bi2Fe4O9 are both thermodynamically stable, and there is comparable thermodynamic driving force to form either phase from the binary oxides (Fe2O3, Bi2O3). This indicates the increasing possibility of Fe-rich impurity phase formation for syntheses where the sample is exposed to high temperature for extended periods. Indeed, Bi2Fe4O9 is a common impurity phase formed during BFO fabrication, as revealed by text mining results (Figure 2d) and shown in Figure 3c. 29 Moreover, increasing the temperature further preferentially stabilizes Bi2Fe4O9 relative to BFO. This finding explains both the text mining and in situ XRD results (Figure 4), both of which revealed that impurities are more likely to form at relatively higher temperatures.
Although text mining did not reveal Bi2O2CO3 as an impurity phase occurring in fully fabricated BFO thin films, it was observed as a short-lived intermediate phase during in situ XRD measurements. Highlighting the effect of this intermediate on competing reactions in this system, our DFT calculations (Figure 5d) showed that the presence of Bi2O2CO3 lowers the free energy of the reaction, ΔG_rxn, toward BiFeO3 as compared to the reaction toward Bi2Fe4O9 (on a per atom basis). 30 That is, the relative driving force toward the desired phase, BiFeO where Bi2Fe4O9 appears to be thermodynamically preferred over BFO as shown previously in Figure 5b-c S4d-f).
It should be noted that there are various sources of error that could influence the computed thermodynamics, including errors in the 0 K compound thermochemistry computed using DFT and in the vibrational entropy correction using the model presented in Ref. 26. However, most of these errors in absolute formation energies are likely to cancel in the calculation of relative energies due to systematic dependence of the errors on chemical composition. This is especially true when comparing differences in reaction energies, for instance as it pertains to the relative preference for BFO over Bi2Fe4O9 when the carbonate precursor is used.

Discussion and Conclusions
Based on the complementary input from text mining, in situ XRD, and thermodynamic modeling, we propose a comprehensive understanding of BFO crystallization pathways. Figure 6 shows an overview of the flow of information between the three domains.
as well as the use of Bi-rich precursors.
these years is limited in the text-mined datasets. 33 These were then inspected using the same criteria as above to yield the remaining 57 articles. Manual inspection and validation were performed by two human experts in text/data mining and materials science.

Processing procedure, outcome extraction and data cleaning:
Manual extraction and validation of processing procedures and outcomes was preferred over automated methods due to the difficulty in accurately linking outcomes to respective procedures. All thin film processing procedures were extracted manually (by the same two experts mentioned above) into .csv format, where each column represents an aspect of the film fabrication (such as stirring temperature, annealing time, etc., as well as associated metadata) and each row represents a procedure and associated outcome. For uniformity of chemical and precursor naming, we ensured that each chemical name was standardized across its various representations (e.g., formulae were changed to the corresponding chemical name and alternative spellings were adjusted). All other parameters were numerical and standardized when necessary (e.g., descriptive times such as "one day" were changed to "24 [hours]").
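The standardization described above was performed manually; purely as an illustration of the kind of rules involved, the sketch below maps alternative chemical representations to one name and converts descriptive durations to hours. The synonym table, the supported number words, and the function names are hypothetical and are not the rules used to build the dataset.

```python
# Hypothetical synonym table mapping alternative representations to one name.
CHEMICAL_SYNONYMS = {
    "bi(no3)3·5h2o": "bismuth nitrate pentahydrate",
    "bismuth(iii) nitrate pentahydrate": "bismuth nitrate pentahydrate",
    "2me": "2-methoxyethanol",
    "2-me": "2-methoxyethanol",
}

# Number words and time units supported by this sketch (illustrative subset).
NUMBER_WORDS = {"one": 1.0, "two": 2.0, "three": 3.0}
SECONDS_PER_UNIT = {
    "s": 1, "sec": 1, "second": 1,
    "min": 60, "minute": 60,
    "h": 3600, "hr": 3600, "hour": 3600,
    "day": 86400,
}


def standardize_chemical(name):
    """Return the standardized chemical name, or the cleaned input if unknown."""
    key = name.strip().lower()
    return CHEMICAL_SYNONYMS.get(key, key)


def time_to_hours(text):
    """Convert strings such as 'one day' or '30 min' to a duration in hours."""
    value_str, unit = text.lower().split()
    if value_str in NUMBER_WORDS:
        value = NUMBER_WORDS[value_str]
    else:
        value = float(value_str)
    unit = unit.rstrip("s") or "s"  # drop plural 's' but keep the bare unit 's'
    return value * SECONDS_PER_UNIT[unit] / 3600
```

For example, `time_to_hours("one day")` gives 24.0 and `standardize_chemical(" 2ME ")` gives "2-methoxyethanol"; unsupported units raise a `KeyError`, so unexpected entries are flagged rather than silently passed through.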

Data visualization:
Visualizations to summarize the data in the text-mined dataset were created using the matplotlib (https://matplotlib.org/) and seaborn (https://seaborn.pydata.org/) Python visualization modules.
For films prepared with chelating agents, either citric acid (99.9%, Sigma-Aldrich) or acetic acid (99.9%, Sigma-Aldrich) was added to the precursor solution at an acid:Fe molar ratio of 4:1. After complete dissolution of the precursors, the solution was spin-coated on glass substrates at 3000 rpm for 30 s, then dried on a hot plate at 80 °C for 10 min and baked on a hot plate at 300 °C for 10 min. The spin coating/baking procedure was repeated 5 times to obtain thick films. The as-cast baked films were annealed during in situ XRD measurements.
In situ X-ray diffraction (XRD): XRD measurements were performed with copper K-α X-ray radiation (wavelength 1.5406 Å) using a Rigaku Smartlab X-ray diffractometer equipped with a HyPix-3000 high-energy-resolution multidimensional semiconductor detector. The temperature was increased from RT to 700 °C in increments of 25 °C and was held at each temperature for 5 min before the XRD measurement (the heating/annealing profile is shown in the SI, Fig. S7). Taking in situ diffraction measurements on a lab-based tool requires a compromise: a small 2θ range covers fewer diffraction peaks, whereas scanning a longer 2θ range risks missing short-lived intermediate phases. Therefore, the in situ measurements were conducted from 2θ = 10° to 35° with a scan rate of 3°/min and a step size of 0.1° (i.e., 8.3 min per pattern) to minimize the time for capturing the XRD pattern.
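The quoted acquisition time follows directly from the scan range and scan rate. As a short worked check (function names are illustrative), the sketch below computes the time per pattern and the number of points per pattern from the stated settings, and adds the 5 min hold to estimate the time spent at each temperature step, excluding any ramp time, which is not specified here.

```python
def scan_minutes(two_theta_start, two_theta_end, rate_deg_per_min):
    """Time to scan the given 2-theta range at a constant scan rate."""
    return (two_theta_end - two_theta_start) / rate_deg_per_min


def points_per_pattern(two_theta_start, two_theta_end, step_deg):
    """Number of data points, counting both end points of the range."""
    return round((two_theta_end - two_theta_start) / step_deg) + 1


pattern_min = scan_minutes(10.0, 35.0, 3.0)      # 25 deg / (3 deg/min)
n_points = points_per_pattern(10.0, 35.0, 0.1)   # 250 steps plus the first point
hold_min = 5.0                                   # hold before each measurement
minutes_per_temperature_step = hold_min + pattern_min
```

With these settings a pattern takes about 8.3 min, so each 25 °C step occupies roughly 13.3 min of hold plus measurement time.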
Computed thermodynamics: Standard Gibbs formation energies, ΔG_f(T), for CO2 were obtained from NIST. 34 For solid-state compounds (i.e., BiFeO3 and competing phases), formation enthalpies (at 0 K) were obtained with density functional theory (DFT) using the SCAN meta-GGA density functional. 25 Each initial structure was obtained from Materials Project 35 or the Inorganic Crystal Structure Database 36 and optimized using the Vienna Ab Initio Simulation Package (VASP) 37 and the projector augmented wave (PAW) method. 38 A plane wave energy cutoff of 520 eV and 1000 k-points per reciprocal atom were used for all calculations.
ΔG_f(T) for each solid-state compound was then obtained by combining the DFT-calculated formation enthalpies with the machine-learned descriptor introduced in ref. 26. ΔG_f(T) for CO2 was obtained from NIST. 34 Gibbs formation energies were then used to compute all Gibbs reaction energies presented in this work, where the reaction energy is the sum of ΔG_f(T) for the products minus the sum of ΔG_f(T) for the reactants, weighted by the stoichiometric coefficients in the reaction.
These reaction coefficients were determined on the basis of forming 1 mole of product (either BiFeO3 or Bi2Fe4O9). To compare the reaction energies for forming either product on even footing, the molar reaction energies were then normalized per atom in the product.
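The per-atom normalization can be made concrete with a short sketch. The four reactions below are written per mole of ternary product and are inferred from the product-side atom counts quoted in the Figure 5 caption (5, 6.5, 15, and 18); they are an assumption, not a transcription of the figure legend. No Gibbs energies are included: `dgf` is a placeholder mapping from species to ΔG_f(T) in kJ/mol that the user would supply.

```python
from fractions import Fraction as F

# Atoms per formula unit, counted directly from each chemical formula.
ATOMS = {"Bi2O3": 5, "Fe2O3": 5, "Bi2O2CO3": 8, "BiFeO3": 5, "Bi2Fe4O9": 15, "CO2": 3}

# (reactants, products) per 1 mol of ternary product; assumed reaction forms.
REACTIONS = {
    "BFO from oxides": (
        {"Bi2O3": F(1, 2), "Fe2O3": F(1, 2)}, {"BiFeO3": F(1)}),
    "BFO from carbonate": (
        {"Bi2O2CO3": F(1, 2), "Fe2O3": F(1, 2)}, {"BiFeO3": F(1), "CO2": F(1, 2)}),
    "Bi2Fe4O9 from oxides": (
        {"Bi2O3": F(1), "Fe2O3": F(2)}, {"Bi2Fe4O9": F(1)}),
    "Bi2Fe4O9 from carbonate": (
        {"Bi2O2CO3": F(1), "Fe2O3": F(2)}, {"Bi2Fe4O9": F(1), "CO2": F(1)}),
}


def atoms_in(side):
    """Total moles of atoms on one side of a reaction."""
    return sum(coeff * ATOMS[species] for species, coeff in side.items())


def reaction_energy_per_atom(reactants, products, dgf):
    """Sum over products minus sum over reactants, divided by product-side atoms."""
    molar = sum(float(c) * dgf[s] for s, c in products.items()) - sum(
        float(c) * dgf[s] for s, c in reactants.items()
    )
    return molar / float(atoms_in(products))


product_atoms = {name: atoms_in(prod) for name, (_, prod) in REACTIONS.items()}
co2_per_atom = {
    name: prod.get("CO2", F(0)) / atoms_in(prod) for name, (_, prod) in REACTIONS.items()
}
```

Under these assumed reactions, the carbonate route releases 1/13 mol CO2 per mole of product atoms when forming BFO but only 1/18 when forming Bi2Fe4O9, which is the stoichiometric basis for the statement that more CO2 is evolved per atom of reaction on the BFO route.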

Figure 1 Schematic representation of the interplay between text mining, processing & in situ characterization, and thermodynamic modeling using ab initio calculations.
This helped to inform experiments and to validate mechanistic insights found through in situ XRD characterization and ab initio calculations. Through the manual text mining of existing literature on the sol-gel fabrication of BFO thin films, we compiled and analyzed a dataset of 340 thin film processing procedures and outcomes from 178 publications. Details for the collection and extraction of these processing recipes are provided in Section 4 (Article identification for text-mined dataset and Processing procedure and outcome extraction) as well as in an accompanying manuscript (K. Cruse et al., in preparation). This dataset was constructed via a search over an in-house literature database as well as Clarivate Analytics' Web of Science for papers relevant to BFO fabrication, followed by manual inspection to determine which papers contained fabrication of undoped BFO thin films through the sol-gel method.

Figure 2 Overview of text-mined experimental conditions employed in BiFeO3 thin film solution syntheses published in the literature. (a) Heatmap depicting the frequency of combinations of Bi and Fe precursors. Iron precursors from left to right (without metal prefix) are iron: nitrate nonahydrate, pentethoxide, acetylacetone, citrate, pentanedionate, and acetate. Bismuth precursors from top to bottom are bismuth: nitrate pentahydrate and acetate. (b-c) Count of recipes with the most common solvents 2-methoxyethanol (2ME) and ethylene glycol (EG) and their combination with the chelating agents citric acid and acetic acid in (b) and (c), respectively. (d) Prevalence of the formation of impurity phases across all syntheses in this dataset, along with the formulae for the most common impurity phases that form out of these syntheses. (e) Distribution of temperatures used during film annealing including information on presence/absence of impurity phases. Percentages represent the proportion of phase impure syntheses to phase pure within a 20 °C temperature window below and above the dotted line.

Figure 4 and
Figures S3 and S4 provide corresponding 2D XRD color maps. Due to a compromise between data acquisition time and the lifetime of possible intermediate phases, a relatively small 2θ range of 10-35° was chosen. Prior to in situ diffraction measurements, ex situ diffraction measurements were performed over a wider 2θ range to more confidently evaluate impurity phase formation (Figure S5). The majority of observed results fall into three different categories: (i) no impurity phase, (ii) persistent impurity phase, and (iii) intermediate phase (Figure 3). Line profiles of these measurements with peak labeling are illustrated in Figure S6. The temperature profile and holding times during in situ measurements are depicted in Figure S7. The 2D color maps (Figure 3) show selected experiments with a Bi:Fe ratio of 1.1, as is widely recommended in the literature to compensate for Bi loss during annealing, and varied precursor solutions: pure 2ME, mixed solvents EG:2ME (9:1), and 2ME with citric acid. Figures S3 and S4 illustrate a broader range of experiments with Bi:Fe ratios of 0.9-1.1 and precursor solutions of pure EG and acetic acid additive. Category (i), no impurity phase, corresponds to solvent systems utilizing pure 2ME or EG. The precursor does not exhibit crystalline impurity phases throughout the annealing process and transforms directly to phase-pure BFO at an onset temperature that decreases with increasing Bi content (Figure S3a-f). In category (ii), a Bi2O3 impurity phase was observed at the pre-bake stage for mixed EG:2ME solvent systems. The Bi2O3 phase persists and dissociates during thermal annealing at a temperature that increases with Bi content, as revealed by the disappearance of Bi2O3 diffraction peaks

Figure 3 2D color maps of in situ X-ray diffraction during thermal annealing of BiFeO3 thin films with chemistries representative of three categories: (a) no impurity phase, (b) persistent impurity phase Bi2O3, and (c) intermediate phase Bi2O2CO3. The films were spin coated using a precursor solution with stoichiometry Bi:Fe = 1.1 with pure 2ME (a), mixed solvents EG:2ME (b), and 2ME with citric acid (c). The dotted lines mark the BFO crystallization onset. The 'shadows' before the BFO peaks appear (a,b) as well as the broad peak at ~475 °C (c) are artifacts from the plotting since there are no visible peaks at these points in time at these angles (see Fig. S6).

Figure 4 Summary of evolving phases during thermal annealing of BFO films prepared from different solvents

In Figure 5a, we show the calculated Bi-Fe-O phase diagram at 627 °C (900 K), where BiFeO3 is calculated to be thermodynamically stable with respect to decomposition into Bi2Fe4O9 and Bi25FeO39. By overlaying the thermodynamic driving force for compound formation from the elements, ΔG_f, on the ternary phase diagram, we see there may be a preference for forming Fe-rich phases as this region of the phase diagram has the largest driving force, indicated by the lighter coloring. 27 Both BFO and Bi2Fe4O9 contain Bi3+ and Fe3+, meaning that in the Bi-Fe-O phase diagram they lie along the tie line formed by the corresponding binary phases having the same nominal oxidation states, Bi2O3 and Fe2O3.

Figure 5d illustrates the change in free energy of reaction for BFO versus Bi2Fe4O9 formation in the absence versus presence of the Bi2O2CO3 phase.

Figure 5e shows the prevalence of the formation of impurity phases (extracted by text mining) across all fabrications using 2ME, 2ME with citric acid, and 2ME without citric acid. Text mining confirms that the use of 2ME with citric acid (associated with Bi2O2CO3 formation as in our in situ XRD results) has a probability of 100% of obtaining pure BFO films as compared to 2ME without citric acid (67.5%). Through text mining of historic syntheses we validate our findings from experiment and theory (i.e., phase impurity formation is mitigated by the presence of Bi2O2CO3 from the inclusion of citric acid) because all published procedures using 2ME + CA in the precursor solution resulted in a phase-pure outcome.

Figure 5 Calculated thermodynamics in the Bi-Fe-O chemical space. (a) Ternary Bi-Fe-O phase diagram at 627 °C (900 K). Blue circles indicate stable phases. The colorbar indicates the driving force for phase formation with respect to the elemental phases (Bi, Fe, O). At stable compositions (e.g., BiFeO3), the colorbar indicates the formation energy of this phase. At points in composition space that do not contain a stable compound, the formation energy (colorbar) is computed relative to the combination of stable phases that lie on the convex hull and minimize the formation energy at that composition (the decomposition products). (b-c) Driving forces for phase formation with respect to the Bi2O3-Fe2O3 tie line at (b) 27 °C (300 K) and at (c) 627 °C (900 K). (d) Reaction energies for the formation of BFO and Bi2Fe4O9 from precursors without or with the Bi2O2CO3 phase. For comparing reactions with diverse compositions, each reaction energy is normalized per mole of atoms. For clarity, the reactions in the figure legend are shown on the basis of 1 mole of ternary Bi-Fe-O phase formed. To compute the reaction energy using the kJ/(mol atom) basis, the energies for the molar reactions shown in the legend are divided by the number of atoms in the product side of each reaction (5 atoms for the reaction forming BFO, 6.5 for the reaction forming BFO and CO2, 15 for the reaction forming Bi2Fe4O9, and 18 for the reaction forming

Figure 6 Schematic representing an overview of the flow of information between text mining, in situ XRD, and thermodynamic modeling.
This work was funded by the U.S. Department of Energy (DOE), Office of Science, Office of Basic Energy Sciences, Materials Sciences and Engineering Division under Contract No. DE-AC02-05CH11231 (D2S2 program KCD2S2). Work at the Molecular Foundry was supported by the Office of Science, Office of Basic Energy Sciences, of the U.S. DOE under Contract No. DE-AC02-05CH11231. K.H. acknowledges support from the National Research Foundation of Korea funded by the Ministry of Education of Korea (No. 2019R1A6C1010024 and No. 2021RIS-002) and the Ministry of Science and ICT of Korea (No. NRF-2021R1A4A1052051). This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory, operated under Contract No. DE-AC02-05CH11231.
6. Author contributions
Conceptualization: M.A., K.H., K.C., G.C. and C.M.S.-F. Experimentation and analysis: M.A., K.H., and C.M.S.-F. Text mining: K.C., V.B., A.J., and G.C. First-principles calculations: C.J.B., A.T. and G.C.
Supervision: G.C., A.J.,
Qi et al. found Bi2Fe4O9 in the
The lowest crystallization onset at 450 °C was found for mixed EG:2ME solvents (category ii) with Bi2O3 as a persistent impurity phase. The higher the Bi content, the higher the temperature at which Bi2O3 dissociates. Notably, the BFO crystallization onset is independent of the Bi:Fe ratio for EG:2ME = 7:3, exhibiting the lowest crystallization onset at 450 °C for all stoichiometric compositions. Adding chelating agents (acetic or citric acid) to pure solvents results in an increase in the onset of BFO crystallization compared to films prepared from pure solvents. We hypothesize that this increase can be attributed to the formation of amorphous complexes that require higher temperatures to dissociate before allowing BFO crystallization as compared to pure EG or 2ME samples, which may not be able to coordinate with the metals in experiments.
On the other hand, the formation of Bi2O2CO3 was observed as a short-lived intermediate phase during the in situ thermal annealing experiments, but this phase is not part of the final film and thus it was not extracted via text mining. Finding this intermediate phase through in situ
3, compared to the undesired impurity, Bi2Fe4O9, increases when Bi2O2CO3 is present as an intermediate, largely because more CO2 is evolved per atom of reaction when BFO rather than Bi2Fe4O9 is formed from Fe2O3 and Bi2O2CO3. This confirms the influence of this intermediate on the crystallization pathway.

With more confidence one can extract from these calculations that Bi2Fe4O9 is favored compared to BFO at higher temperatures. It is pointed out that the slopes of these lines are distinguishably different. In the presence of the Bi2O2CO3 phase, however, BiFeO3 is thermodynamically preferred to Bi2Fe4O9 across the entire temperature range. This finding reveals that the formation of the BFO phase becomes more favorable as compared to the Fe-rich impurity phase (Bi2Fe4O9) in the presence of Bi2O2CO3, and this finding is corroborated by the text-mined data. The experimental limit for the in situ measurements at 700 °C almost coincides with the crossover point of the calculated reaction energies, and the calculations may underestimate the crossover point. Lastly, BFO seems to be the kinetically preferred phase, and if it starts forming at low temperatures where it is also thermodynamically preferred, formation of the Bi2Fe4O9 phase is inhibited at higher temperatures due to a small driving force of Bi2O3 + Fe2O3 → Bi2Fe4O9. Consequently, the conversion rate of BFO to Bi2Fe4O9 is slow. Experiments, however, show that not only the temperature and presence/absence of Bi2O2CO3 influence impurity formation; the metal ratio also plays a role. With an intermediate Bi2O2CO3 phase and under Bi-rich processing, Bi2Fe4O9 forms at ~650 °C while this Fe-rich phase does not form under Bi-
Text mining provided input on the most frequently used chemicals and processing conditions, identified the most commonly reported impurity phases (Bi2Fe4O9 and Bi2O3), and
mining and ab initio calculations predict that the formation of impurities is more likely at higher temperatures, in situ XRD was used to detect the parameters that can allow for lowering the annealing temperature, such as the solvent choice (using pure EG or mixed EG:2ME)