- Souza, Lucas;
- Miller, Beck;
- Cammarota, Ryan;
- Lo, Anna;
- Lopez, Ixchel;
- Shiue, Yuan-Shin;
- Bergstrom, Benjamin;
- Dishman, Sarah;
- Fettinger, James;
- Sigman, Matthew;
- Shaw, Jared
Interactions between catalysts and substrates can be highly complex and dynamic, which often complicates the development of models that predict or explain such processes. A dirhodium(II)-catalyzed C-H insertion of donor/donor carbenes into 2-alkoxybenzophenone substrates to form benzodihydrofurans was selected as a model system for exploring nonlinear methods to gain mechanistic understanding. We found that traditional multivariate linear regression (MLR) correlating DFT-derived descriptors of catalysts and substrates leads to poorly performing models. This motivated introducing nonlinear descriptor relationships into the modeling by applying the sure independence screening and sparsifying operator (SISSO) algorithm. Based on SISSO-generated descriptors, a high-performing MLR model was identified that predicts external validation points well. Mechanistic interpretation was aided by deconstructing the feature relationships using chemical space maps, decision trees, and linear descriptors. Steric effects were found to strongly determine the substrates' innate cyclization selectivity preferences. Catalyst reactive site features can then be matched to product features to tune or override the resulting diastereoselectivity within the substrate-dictated ranges. This case study presents a method for understanding the complex interactions often encountered in catalysis by combining nonlinear modeling with linear deconvolution by pattern recognition.
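
The general idea of pairing nonlinear descriptor construction with a sparse linear model can be illustrated with a minimal sketch. It is not the SISSO implementation and not the workflow used in this study. It only mimics the two stages named by the algorithm: candidate features are screened by correlation with the target, and a small exhaustive search then picks the best sparse subset. The descriptor names (`cat_steric`, `sub_steric`, `cat_charge`), the synthetic data, and the helper functions are all hypothetical and chosen purely for illustration.

```python
import itertools
import numpy as np

def build_candidates(X, names):
    """Expand primary descriptors with squares and pairwise products."""
    feats, labels = [], []
    for j, name in enumerate(names):
        feats.append(X[:, j]); labels.append(name)
        feats.append(X[:, j] ** 2); labels.append(f"{name}^2")
    for i, j in itertools.combinations(range(X.shape[1]), 2):
        feats.append(X[:, i] * X[:, j]); labels.append(f"{names[i]}*{names[j]}")
    return np.column_stack(feats), labels

def screen(F, y, k):
    """Screening step: keep the k candidates with the largest |Pearson r| to y."""
    Fc = F - F.mean(axis=0)
    yc = y - y.mean()
    corr = np.abs(Fc.T @ yc) / (np.linalg.norm(Fc, axis=0) * np.linalg.norm(yc))
    return np.argsort(corr)[::-1][:k]

def fit_linear(F, y):
    """Ordinary least squares with an intercept; returns (coefficients, RMSE)."""
    A = np.column_stack([np.ones(len(y)), F])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rmse = float(np.sqrt(np.mean((A @ coef - y) ** 2)))
    return coef, rmse

def best_sparse_model(F, y, candidates, size):
    """Sparsifying step: exhaustive search over feature subsets of a given size."""
    best = None
    for subset in itertools.combinations(candidates, size):
        coef, rmse = fit_linear(F[:, list(subset)], y)
        if best is None or rmse < best[2]:
            best = (subset, coef, rmse)
    return best

# Synthetic data (an assumption for illustration): the target depends on a
# catalyst-substrate product term, which a purely linear model cannot capture.
rng = np.random.default_rng(0)
names = ["cat_steric", "sub_steric", "cat_charge"]  # hypothetical descriptors
X = rng.normal(size=(60, 3))
y = 1.5 * X[:, 0] * X[:, 1] - 1.0 * X[:, 2] + 0.05 * rng.normal(size=60)

F, labels = build_candidates(X, names)
screened = screen(F, y, k=5)
subset, coef, rmse = best_sparse_model(F, y, screened, size=2)
selected = {labels[i] for i in subset}

# Baseline: linear regression on the primary descriptors only.
_, rmse_linear = fit_linear(X, y)
print("selected features:", sorted(selected))
print("RMSE (nonlinear features):", rmse, "| RMSE (linear only):", rmse_linear)
```

In this toy setting the product term carries most of the signal, so the linear-only baseline fits poorly while the two-feature model built from the expanded set recovers the generating relationship. Real SISSO applies a much larger operator set and iterates the screening on residuals, which this sketch omits.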