eScholarship
Open Access Publications from the University of California

454 Sequencing is an Effective Method for Gap Closure in Microbial Whole Genome Shotgun Sequencing

  • Author(s): Chen, Feng
  • Jett, Jamie
  • Kirton, Edward
  • Goltsman, Eugene
  • Singan, Vasanth
  • Hack, Christopher
  • Smith, Douglas
  • Richardson, Paul
  • et al.
Abstract

The Department of Energy Joint Genome Institute (www.jgi.doe.gov) in Walnut Creek, CA is a high-throughput DNA sequencing facility with a current throughput of approximately 3 billion Sanger base pairs per month. A major effort at JGI is the sequencing of microbial genomes relevant to the DOE missions of carbon sequestration, bioremediation and energy production. The JGI Microbial Program and Community Sequencing Program together are responsible for generating sequencing data for over 400 microbial genomes. On the traditional Sanger sequencing side, JGI runs about 70 ABI sequencers on a 24/7 schedule and about 40 GE MegaBACE 4500 sequencers on a 24/5 schedule. JGI currently runs two Roche GS20 instruments to supplement our traditional Sanger sequencing. Our current whole genome shotgun sequencing strategy is to sequence 3 kb and/or 8 kb shotgun libraries to a combined 4-8x draft coverage and fosmid ends to 1x sequence coverage with Sanger sequencing, and to supplement that with 12-25x coverage from the 454 sequencing platform, depending on the size of the genome. For new microbial genomes that we initiated, 454 sequencing was carried out at the same time that shotgun cloning for Sanger sequencing started. The 454 sequencing data were used to profile the genomes for G/C content, genome size and other features. For existing microbial genome projects for which Sanger sequencing data had already been generated, we have been adding 454 sequencing coverage at the finishing stage. The 454 sequencing data were assembled with the default Newbler assembler software package from 454 Life Sciences. The Newbler contigs were then fragmented, and the quality and coverage information of the contigs was captured by software tools developed in-house. The fragmentation strategy we currently use is to cut the Newbler contigs into 750 bp fragments with 100 bp overlap.
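The fragmentation step can be illustrated with a minimal sketch. This is not the JGI in-house tool, which is not described in detail here; the function name `fragment_contig`, the handling of the final (shorter) fragment, and the omission of quality and coverage values are all assumptions made for illustration. It only shows how 750 bp windows with a 100 bp overlap tile a contig, which implies a step of 650 bp between fragment starts.

```python
def fragment_contig(seq, frag_len=750, overlap=100):
    """Cut a contig into overlapping pseudo-reads.

    Hypothetical helper: consecutive fragments share `overlap` bases, so
    each new fragment starts `frag_len - overlap` bases after the previous
    one. The last fragment is kept even if it is shorter than `frag_len`
    (an assumption; the abstract does not say how contig ends are handled).
    Returns a list of (start_offset, fragment_sequence) tuples.
    """
    if not 0 <= overlap < frag_len:
        raise ValueError("overlap must be non-negative and smaller than frag_len")
    if not seq:
        return []

    step = frag_len - overlap
    fragments = []
    start = 0
    while True:
        fragments.append((start, seq[start:start + frag_len]))
        # Stop once the current window reaches the end of the contig.
        if start + frag_len >= len(seq):
            break
        start += step
    return fragments


if __name__ == "__main__":
    # Toy 2,000 bp contig; a real Newbler contig would be read from a FASTA file.
    contig = "ACGT" * 500
    for start, frag in fragment_contig(contig):
        print(start, len(frag))
```

For the 2,000 bp toy contig this yields fragments starting at offsets 0, 650 and 1300, with lengths 750, 750 and 700. A fuller version would carry the per-base quality values along with each fragment so that the downstream assembler can weight them.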
The overlapping fragments from the Newbler contigs were finally assembled together with the Sanger sequencing data using the assembler(s) of our choice. Gaps and low-quality areas in the final assembly were manually sequenced to the JGI-defined quality standard. At this point, the genome is ready for analysis and annotation. 454 sequencing technology is also used more directly in the gap closure stage of microbial whole genome shotgun sequencing. Gap-spanning clones from multiple genomes were pooled together and the resulting DNA was subjected to 454 sequencing. The Newbler assembly results from the pooled clone sequencing were added to the final genome assemblies to fill the gaps.
