- Zhu, Danqing;
- Brookes, David;
- Busia, Akosua;
- Carneiro, Ana;
- Fannjiang, Clara;
- Popova, Galina;
- Shin, David;
- Donohue, Kevin;
- Lin, Li;
- Miller, Zachary;
- Williams, Evan;
- Chang, Edward;
- Nowakowski, Tomasz;
- Schaffer, David;
- Listgarten, Jennifer
Adeno-associated viruses (AAVs) hold tremendous promise as delivery vectors for gene therapies. AAVs have been successfully engineered (for instance, for more efficient and/or cell-specific delivery to numerous tissues) by creating large, diverse starting libraries and selecting for desired properties. However, these starting libraries often contain a high proportion of variants unable to assemble or package their genomes, a prerequisite for any gene delivery goal. Here, we present and showcase a machine learning (ML) method for designing AAV peptide insertion libraries that achieve fivefold higher packaging fitness than the standard NNK library with negligible reduction in diversity. To demonstrate our ML-designed library's utility for downstream engineering goals, we show that it yields approximately 10-fold more successful variants than the NNK library after selection for infection of human brain tissue, leading to a promising glial-specific variant. Moreover, our design approach can be applied to other types of libraries for AAV and beyond.
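
For readers unfamiliar with the NNK baseline mentioned above, the following minimal sketch illustrates how an NNK library is sampled: each codon has any base (N) at its first two positions and G or T (K) at the third. It is not the ML design method of this work, only an illustration of the standard degenerate-codon scheme. The insertion length of 7 codons and the helper names `sample_nnk_peptide` and `stop_fraction` are assumptions made for the example. A premature stop codon is just one simple way a variant can be non-functional; packaging fitness as measured in the work depends on more than this.

```python
import random

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# NNK: N = any base at positions 1 and 2, K = G or T at position 3.
NNK_CODONS = [b1 + b2 + b3 for b1 in BASES for b2 in BASES for b3 in "GT"]


def sample_nnk_peptide(length, rng):
    """Draw `length` NNK codons uniformly and translate them ('*' marks a stop)."""
    codons = [rng.choice(NNK_CODONS) for _ in range(length)]
    return "".join(CODON_TABLE[codon] for codon in codons)


def stop_fraction(length):
    """Expected fraction of uniform NNK inserts containing at least one stop codon."""
    n_stop = sum(CODON_TABLE[codon] == "*" for codon in NNK_CODONS)
    return 1.0 - (1.0 - n_stop / len(NNK_CODONS)) ** length


if __name__ == "__main__":
    INSERT_LENGTH = 7  # assumed insertion length, for illustration only
    rng = random.Random(0)
    for _ in range(3):
        print(sample_nnk_peptide(INSERT_LENGTH, rng))
    print(f"fraction with a stop codon: {stop_fraction(INSERT_LENGTH):.3f}")
```

The NNK scheme uses 32 codons that still cover all 20 amino acids while retaining a single stop codon (TAG). Because sampling is uniform over codons and ignores any fitness signal, it serves as a natural baseline against which a designed library can be compared.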