eScholarship
Open Access Publications from the University of California

Bayesian Model Averaging for Ensemble-Based Estimates of Solvation-Free Energies.

Author(s):
  • Gosink, Luke J
  • Overall, Christopher C
  • Reehl, Sarah M
  • Whitney, Paul D
  • Mobley, David L
  • Baker, Nathan A
  • et al.
Abstract

This paper applies the Bayesian Model Averaging statistical ensemble technique to estimate small molecule solvation free energies. There is a wide range of methods available for predicting solvation free energies, ranging from empirical statistical models to ab initio quantum mechanical approaches. Each of these methods is based on a set of conceptual assumptions that can affect predictive accuracy and transferability. Using an iterative statistical process, we have selected and combined solvation energy estimates using an ensemble of 17 diverse methods from the fourth Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) blind prediction study to form a single, aggregated solvation energy estimate. Methods that possess minimal or redundant information are pruned from the ensemble and the evaluation process repeats until aggregate predictive performance can no longer be improved. We show that this process results in a final aggregate estimate that outperforms all individual methods by reducing estimate errors by as much as 91% to 1.2 kcal mol⁻¹ accuracy. This work provides a new approach for accurate solvation free energy prediction and lays the foundation for future work on aggregate models that can balance computational cost with prediction accuracy.
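
The weight-combine-prune loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes Gaussian errors, approximates posterior model weights from BIC, scores the aggregate in-sample by RMSE, and runs on synthetic data. The function names (`bic_weights`, `prune_ensemble`), the noise levels, and the data are all hypothetical.

```python
import numpy as np

def bic_weights(preds, y):
    """Approximate posterior model weights from BIC (assumption: Gaussian errors)."""
    n = len(y)
    rss = ((preds - y[:, None]) ** 2).sum(axis=0)  # residual sum of squares per method
    bic = n * np.log(rss / n)                      # equal parameter counts, so the penalty term cancels
    w = np.exp(-0.5 * (bic - bic.min()))           # shift by the minimum for numerical stability
    return w / w.sum()

def rmse(estimate, y):
    """Root-mean-square error of an estimate against reference values."""
    return float(np.sqrt(np.mean((estimate - y) ** 2)))

def prune_ensemble(preds, y):
    """Drop the lowest-weight method while the aggregate RMSE keeps improving."""
    keep = list(range(preds.shape[1]))
    weights = bic_weights(preds[:, keep], y)
    best = rmse(preds[:, keep] @ weights, y)
    while len(keep) > 1:
        worst = int(np.argmin(weights))            # position of the least-informative method
        trial = [k for i, k in enumerate(keep) if i != worst]
        trial_w = bic_weights(preds[:, trial], y)
        score = rmse(preds[:, trial] @ trial_w, y)
        if score >= best:                          # no improvement: stop pruning
            break
        keep, weights, best = trial, trial_w, score
    return keep, weights, best

# Synthetic illustration only (hypothetical values, not SAMPL4 data).
rng = np.random.default_rng(0)
y = rng.normal(-5.0, 3.0, size=40)                 # stand-in "experimental" energies, kcal/mol
noise_sd = np.array([0.8, 1.5, 3.0, 6.0])          # assumed per-method error levels
preds = y[:, None] + rng.normal(0.0, noise_sd, size=(40, 4))

keep, weights, best = prune_ensemble(preds, y)
print(keep, weights.round(3), round(best, 3))
```

Because the sketch scores the aggregate on the same data used to compute the weights, it says nothing about blind-prediction accuracy; a held-out set would be needed for that.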

