eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Computational Methods to Study Tandem Repeats in Human Genome and Complex Diseases

Abstract

A central goal in genomics is to identify genetic variations and their impact on the underlying molecular changes that lead to disease. With advances in whole-genome sequencing, many studies have identified thousands of genetic loci associated with human traits. These studies mainly focus on single-nucleotide variants (SNVs) and novel insertions and deletions in the genome, while ignoring more complex variants. Here, I consider the problem of genotyping Variable Number Tandem Repeats (VNTRs), which are composed of inexact tandem duplications of short (6-100 bp) repeating units and span 3% of the human genome. While some VNTRs are known to play a role in complex disorders (e.g., Alzheimer’s disease, myoclonus epilepsy, and diabetes), the majority have not been well studied because genotyping VNTRs on a large scale is computationally difficult. I present our progress on developing efficient computational algorithms to profile VNTRs from high-throughput sequencing data and to identify possible variations within them. I applied our method to generate the largest catalog of VNTR genotypes to date, which provides insights into the landscape of VNTR variation in different populations. I show the contribution of tandem repeats in mediating expression levels of key genes with known associations to neurological disorders and familial cancers, and argue that this relationship is causal. Finally, I describe our efforts to directly understand the impact of these variations on human phenotypes, which improves our understanding of the genetic architecture of complex diseases.
