In biological systems, proteins serve as the fundamental molecular machines driving key processes required for the survival and health of the host organism. Proteins carry out a wide range of functions, from immune defense to cellular signaling. These functions evolved over millions of years, refined by the gradual selection and accumulation of beneficial mutations. Artificial evolution techniques are often used to accelerate the development of novel functions for applications across biotechnology and medicine. The emergence of machine learning-guided engineering approaches promises accelerated and improved development of proteins.
In this dissertation, I explore machine learning-guided approaches to protein engineering and, ultimately, develop a more efficient and effective one. In Chapter 2, proof-of-concept work demonstrates the utility of a novel approach called low-N protein engineering for Cas proteins and the feasibility of an LSR-based screening platform for building comprehensive Cas protein datasets for machine learning model benchmarking. In Chapter 3, machine learning-guided engineering is applied to a large serine recombinase to improve on the results of prior engineering efforts. Finally, Chapter 4 presents a novel machine learning-guided, end-to-end framework for engineering hyperactive combinatorial protein variants. Taken together, these chapters provide key insights into applying machine learning to protein engineering, exemplify the potential for machine learning to significantly enhance protein function, and trace a journey of learning how to be an effective protein engineer in the age of artificial intelligence.