eScholarship
Open Access Publications from the University of California

Predicting Progress in Shotgun Sequencing with Paired Ends

Abstract

Paired-end shotgun sequencing has become widely used for large-scale sequencing projects in recent years, including whole-genome shotgun sequencing and map-based BAC clone sequencing. Under this scheme, sequences from both ends of random clones are determined and assembled into sequence contigs. The sequence data and their linking information are then used to construct clone maps in the form of scaffolds. To plan a cost-effective sequencing project with this approach, it is crucial to know how the expected project progress depends on parameters such as insert size, clone length, and redundancy. Theoretical analysis of the paired-end sequencing strategy has been lacking because the two ends of each clone are correlated, which makes the analysis difficult. Here we present a mathematical analysis of the progress of a sequencing project employing such a scheme. Formulae for various measures of expected progress, such as the expected number and size of scaffolds, are derived and assessed by Monte Carlo simulations for parameter sets used in the Human Genome Project.
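To make the scheme concrete, the following is a minimal Monte Carlo sketch of a paired-end project. It is not the authors' simulation or their formulae. The function name `simulate_project`, its parameters, and the simplifications (fixed insert length, fixed read length, clone starts uniform on a linear target, any overlap between reads sufficient to merge them, a single mate pair sufficient to link two contigs) are all assumptions made for illustration.

```python
import random


def simulate_project(genome_len, n_clones, insert_len, read_len, rng):
    """Simulate one paired-end project; return (n_contigs, n_scaffolds).

    Hypothetical simplified model: every clone has the same insert length,
    and both end reads have the same length.
    """
    reads = []  # (start, end, clone_id)
    for clone_id in range(n_clones):
        start = rng.uniform(0, genome_len - insert_len)
        reads.append((start, start + read_len, clone_id))
        reads.append((start + insert_len - read_len, start + insert_len, clone_id))
    reads.sort()

    # Sweep left to right, merging overlapping reads into contigs.
    contigs_of_clone = {}  # clone_id -> contig indices of its two end reads
    n_contigs = 0
    contig_end = None
    for start, end, clone_id in reads:
        if contig_end is None or start > contig_end:
            n_contigs += 1          # gap before this read: new contig
            contig_end = end
        else:
            contig_end = max(contig_end, end)
        contigs_of_clone.setdefault(clone_id, []).append(n_contigs - 1)

    # Union-find: contigs joined by a mate pair belong to one scaffold.
    parent = list(range(n_contigs))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    n_scaffolds = n_contigs
    for left, right in contigs_of_clone.values():
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_left] = root_right
            n_scaffolds -= 1
    return n_contigs, n_scaffolds


def mean_counts(trials, genome_len, n_clones, insert_len, read_len, seed=0):
    """Average contig and scaffold counts over repeated simulated projects."""
    rng = random.Random(seed)
    total_contigs = total_scaffolds = 0
    for _ in range(trials):
        c, s = simulate_project(genome_len, n_clones, insert_len, read_len, rng)
        total_contigs += c
        total_scaffolds += s
    return total_contigs / trials, total_scaffolds / trials
```

In this sketch the redundancy is 2 * n_clones * read_len / genome_len. Calling `mean_counts` over a range of `n_clones` values gives empirical curves of the kind that analytical formulae for the expected number of scaffolds could be compared against.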
