eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Methods for studying the genome-wide landscape of tandem repeats

Abstract

Tandem repeats (TRs) are a class of genetic variants formed by motifs of 1-20 nucleotides repeated in tandem. Previous studies have shown that expansions at specific TR loci are the leading cause of dozens of Mendelian disorders, such as Huntington's disease and Fragile X syndrome. Furthermore, copy numbers at TR loci are correlated with complex traits such as gene expression. Tandem repeats are highly mutable and are therefore well suited to studying genetic diversity. However, current bioinformatics pipelines often cannot process these loci accurately. Challenges in sequencing, alignment, and interpretation have led to TR loci being overlooked in many studies. We have created a method for genome-wide genotyping of TRs and a toolkit for processing, filtering, and quality control of TR callsets. These methods have allowed us and the community to study repeat expansions on a genome-wide scale. In addition, we have applied our work to study de novo variants contributing to Autism Spectrum Disorder risk and have found multiple candidate TRs. Another application of our methods is a novel tool for creating an ensemble callset of TRs across a large population. Together, developing these methods and applying them in these settings has given us a better understanding of TRs and their genetic diversity on a population scale.
