Open Access Publications from the University of California

UC Santa Cruz

UC Santa Cruz Electronic Theses and Dissertations

Comparative Annotation Toolkit (CAT) - Simultaneous Clade and Personal Genome Annotation

Creative Commons 'BY-NC' version 4.0 license

The recent introduction of low-cost long-read and read-cloud sequencing technologies, coupled with intense efforts to develop efficient algorithms, has made affordable, high-quality de novo sequence assembly a realistic proposition. The result is an explosion of new, ultra-contiguous genome assemblies. To compare these genomes, we need robust methods for genome annotation. I describe the fully open-source Comparative Annotation Toolkit (CAT), which provides a flexible way to simultaneously annotate entire clades and identify orthology relationships. I show that CAT can be used to improve annotations of the rat genome, annotate the great apes, annotate a diverse set of mammals, and annotate personal, diploid human genomes. I demonstrate the resulting discovery of novel genes, isoforms, and structural variants, even in genomes as well studied as those of the rat and great apes, and show how these annotations improve cross-species RNA expression experiments.
