- Teixeira, João MC;
- Liu, Zi Hao;
- Namini, Ashley;
- Li, Jie;
- Vernon, Robert M;
- Krzeminski, Mickaël;
- Shamandy, Alaa A;
- Zhang, Oufan;
- Haghighatlari, Mojtaba;
- Yu, Lei;
- Head-Gordon, Teresa;
- Forman-Kay, Julie D
The power of structural information for informing biological mechanisms is clear for stable folded macromolecules, but similar structure-function insight is more difficult to obtain for highly dynamic systems such as intrinsically disordered proteins (IDPs), which must be described as structural ensembles. Here, we present IDPConformerGenerator, a flexible, modular open-source software platform that generates large and diverse ensembles of disordered protein states by building conformers that obey geometric, steric, and other physical restraints for the input sequence. IDPConformerGenerator samples backbone phi (φ), psi (ψ), and omega (ω) torsion angles of relevant sequence fragments from loops and secondary structure elements extracted from folded protein structures in the RCSB Protein Data Bank, and builds side chains using robust Monte Carlo algorithms with expanded rotamer libraries. IDPConformerGenerator has many user-defined options enabling variable fractional sampling of secondary structures, supports Bayesian models for assessing the agreement of IDP ensembles with experimental data, and introduces a machine learning approach to transform between internal and Cartesian coordinates with reduced error. IDPConformerGenerator will facilitate the characterization of disordered proteins and ultimately provide structural insights into these states, which have key biological functions.
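To make the idea of converting internal coordinates (bond lengths, bond angles, torsion angles) into Cartesian coordinates concrete, the following is a minimal sketch using the standard Natural Extension Reference Frame (NeRF) construction. It is not the IDPConformerGenerator implementation and not the machine learning approach mentioned above; the function names `place_atom`, `dihedral`, and `build_chain`, as well as the seed coordinates and geometry values in the usage lines, are illustrative assumptions.

```python
import numpy as np


def place_atom(a, b, c, bond, angle, torsion):
    """Place atom d after atoms a, b, c (NeRF construction).

    d satisfies |c-d| = bond, angle(b, c, d) = angle and
    dihedral(a, b, c, d) = torsion. Angles are in radians.
    """
    bc = (c - b) / np.linalg.norm(c - b)   # unit vector along the b->c bond
    n = np.cross(b - a, bc)                # normal to the a-b-c plane
    n /= np.linalg.norm(n)
    m = np.cross(n, bc)                    # completes the local frame
    # Coordinates of d in the local (bc, m, n) frame centred on c.
    x = -bond * np.cos(angle)
    y = bond * np.sin(angle) * np.cos(torsion)
    z = bond * np.sin(angle) * np.sin(torsion)
    return c + x * bc + y * m + z * n


def dihedral(a, b, c, d):
    """Signed dihedral angle a-b-c-d in radians (used here as a check)."""
    b0 = a - b
    b1 = (c - b) / np.linalg.norm(c - b)
    b2 = d - c
    v = b0 - np.dot(b0, b1) * b1           # part of b0 perpendicular to b1
    w = b2 - np.dot(b2, b1) * b1           # part of b2 perpendicular to b1
    return np.arctan2(np.dot(np.cross(b1, v), w), np.dot(v, w))


def build_chain(seed, internals):
    """Extend three seed atoms with (bond, angle, torsion) triples."""
    atoms = [np.asarray(p, dtype=float) for p in seed]
    for bond, angle, torsion in internals:
        atoms.append(place_atom(atoms[-3], atoms[-2], atoms[-1],
                                bond, angle, torsion))
    return np.array(atoms)


# Illustrative values only (hypothetical seed atoms and geometry).
seed = [[0.0, 0.0, 0.0], [1.46, 0.0, 0.0], [2.0, 1.4, 0.0]]
internals = [(1.33, np.radians(116.0), np.radians(-60.0)),
             (1.46, np.radians(122.0), np.radians(175.0)),
             (1.52, np.radians(111.0), np.radians(-45.0))]
chain = build_chain(seed, internals)
```

Because each new atom depends on the three atoms placed before it, small numerical errors can accumulate along a long chain, which is one reason reduced-error coordinate transformations are of interest for large ensembles.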