UC San Diego
HIV-1 Sequence Analysis: Approaches and Applications from Transmission Network Inference to Epitope Discovery
- Author(s): Hepler, Nicolaus Lance
- Advisor(s): Kosakovsky Pond, Sergei L; Richman, Douglas D; et al.
Recent advances in sequencing technology, especially next-generation sequencing (NGS), have become central to the study of rapidly evolving pathogens such as HIV-1. However, one must contend with the massive amount of data produced by NGS as well as artifacts arising from sample preparation and sequencing. To deal with these complexities, numerous bioinformatics methods have appeared to enable researchers to extract the maximum value from their sequencing data. Unfortunately, the number of methods and the difficulty of benchmarking have limited their reach; there is an opportunity for benchmarked, easy-to-use pipelines to increase the exposure and use of more powerful techniques in lieu of naive approaches. Here I present one such easy-to-use pipeline for the analysis of NGS data, tailored to the analysis of HIV-1. I also present a method for leveraging NGS data in the inference of molecular transmission networks, demonstrating the increased power and resolution of these new technologies in revealing the central role of dual infection in the HIV-1 transmission network. Finally, I present IDEPI, a framework for the prediction of HIV-1 antibody epitopes from IC50-labeled env sequence data. IDEPI uses predictive modeling to solve the general problem of inferring genotypic bases for phenotypic characteristics, and is consequently also able to construct specialized genotype-to-phenotype predictors, enabling computational surveillance applications. IDEPI demonstrates state-of-the-art performance not just in HIV-1 antibody epitope prediction, but also in predicting HIV-1 co-receptor usage (tropism), computational surveillance of drug resistance, and identifying signatures of HIV-1-associated dementia.