eScholarship
Open Access Publications from the University of California

QD Extension for Microbial Finishing

Abstract

Production Quality Control (QC) produces a series of assemblies, screening shotgun reads for contamination by vector sequence, for low quality, and, in the case of microbial projects, for contamination by eukaryotic sequence. An assembly judged "good enough," in coordination with the Microbial Finishing group lead at the Production Genomics Facility (PGF), is named the Quality-controlled Draft (QD) assembly. Once the QD assembly is produced, the project passes from Production QC to one of the finishing groups at the Joint Genome Institute (JGI). During finishing, misassemblies are resolved, gaps between contigs are closed, and confirming reads are gathered for thinly covered areas. Judging whether a project is ready for finishing is a complex task that requires answering numerous questions, such as: Is sufficient read and clone coverage present? Will the existing contig linking information (scaffolding information) allow gaps between contigs to be closed? Has high or low GC content produced areas that are difficult to sequence or difficult to clone? Do the libraries have unusual features (e.g., being absent from certain areas of the genome, or having widely spread insert size distributions)? Are contig GC content distributions consistent with the project GC content distribution? How many repeats are present in the genome? Have repeats contributed to misassemblies? To help answer these questions, we undertook the QD Extension project. QD Extension enhances the existing microbial finishing pipeline with results from the Phrap, PGA, and Arachne assemblers, with scaffold information, and with coherent, informative web reports. By providing this additional information in a browsable, web-based format, we help the Microbial Finishing group lead at the PGF determine more accurately and quickly whether a project is ready for finishing. This project facilitated the identification of a set of metrics that help to shed light on the finishing task that lies ahead of the biologist assigned to finish a
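
Two of the questions listed in the abstract (read coverage, and whether contig GC content is consistent with project GC content) lend themselves to a small illustration. The following is a minimal sketch only, not the QD Extension implementation: the function names, the 0.20 tolerance, and the toy contigs are all assumptions made for this example, and the abstract does not state how the pipeline computes or thresholds these metrics.

```python
# Illustrative sketch; names, threshold, and data are hypothetical,
# not taken from the QD Extension pipeline.

def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def project_gc(contigs: dict[str, str]) -> float:
    """Length-weighted GC fraction across all contigs of a project."""
    return gc_fraction("".join(contigs.values()))


def flag_gc_outliers(contigs: dict[str, str], tolerance: float = 0.20) -> list[str]:
    """Names of contigs whose GC fraction differs from the project GC
    fraction by more than `tolerance` (an assumed cutoff)."""
    overall = project_gc(contigs)
    return sorted(
        name
        for name, seq in contigs.items()
        if abs(gc_fraction(seq) - overall) > tolerance
    )


def read_coverage(total_read_bases: int, assembly_length: int) -> float:
    """Average read depth: total sequenced bases divided by assembly length."""
    return total_read_bases / assembly_length


# Toy contigs: three at 40% GC and one at 100% GC.
contigs = {
    "c1": "ATGCATGCAT",
    "c2": "GGGGCCCCGC",
    "c3": "ATGCGCATAT",
    "c4": "AATTGGCCAT",
}

flagged = flag_gc_outliers(contigs)
```

In this toy data the project GC fraction is 0.55, so only `c2` falls outside the assumed tolerance. A contig flagged this way could point to contamination or to a region that is hard to clone or sequence, which is the kind of signal the abstract says the finishing lead needs to weigh.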
