- Yeh, Andy Hsien-Wei;
- Norn, Christoffer;
- Kipnis, Yakov;
- Tischer, Doug;
- Pellock, Samuel J;
- Evans, Declan;
- Ma, Pengchen;
- Lee, Gyu Rie;
- Zhang, Jason Z;
- Anishchenko, Ivan;
- Coventry, Brian;
- Cao, Longxing;
- Dauparas, Justas;
- Halabiya, Samer;
- DeWitt, Michelle;
- Carter, Lauren;
- Houk, KN;
- Baker, David
De novo enzyme design has sought to introduce active sites and substrate-binding pockets that are predicted to catalyse a reaction of interest into geometrically compatible native scaffolds [1,2], but has been limited by a lack of suitable protein structures and the complexity of native protein sequence-structure relationships. Here we describe a deep-learning-based 'family-wide hallucination' approach that generates large numbers of idealized protein structures containing diverse pocket shapes and designed sequences that encode them. We use these scaffolds to design artificial luciferases that selectively catalyse the oxidative chemiluminescence of the synthetic luciferin substrates diphenylterazine [3] and 2-deoxycoelenterazine. The designed active sites position an arginine guanidinium group adjacent to an anion that develops during the reaction in a binding pocket with high shape complementarity. For both luciferin substrates, we obtain designed luciferases with high selectivity; the most active of these is a small (13.9 kDa) and thermostable (with a melting temperature higher than 95 °C) enzyme that has a catalytic efficiency on diphenylterazine (kcat/Km = 10^6 M^-1 s^-1) comparable to that of native luciferases, but a much higher substrate specificity. The creation of highly active and specific biocatalysts from scratch with broad applications in biomedicine is a key milestone for computational enzyme design, and our approach should enable generation of a wide range of luciferases and other enzymes.