DNA sequencing by denaturation
Genome sequencing technologies are in high demand for applications such as gene expression analysis, the study of complex diseases, and personalized medicine. In this thesis, I present my work on a new DNA sequencing method called sequencing by denaturation (SBD). In SBD, a Sanger sequencing reaction is performed on surface-bound templates to generate a ladder of DNA fragments randomly terminated by fluorescently labeled dideoxyribonucleotides. The labeled DNA fragments are sequentially denatured, and the process is monitored by measuring the change in fluorescence intensity from the surface. By analyzing the denaturation profiles, the base sequences of the templates can be determined in a massively parallel manner. Using thermodynamic principles, we simulated the denaturation profiles of a series of oligonucleotides ranging from 12 to 32 bases and developed a base-calling algorithm to decode the sequences. These simulations demonstrate that up to 20 bases from a DNA molecule can be sequenced by SBD. The instrumentation for performing SBD was constructed by integrating fluorescence imaging, temperature control, and fluidics into a single device through a custom-made biochemical reaction chamber. The system is fully automated, its performance has been characterized, and it can be useful for many applications that rely on high-throughput fluorescence imaging. Experimental proof of concept for SBD was established by measuring the denaturation curves of 6 fluorescently labeled oligonucleotides hybridized to a common template on the surface. The melting temperature of each oligonucleotide was distinguished correctly. These results demonstrate that denaturation profiles can be measured experimentally on a surface with single-base resolution, which proves the feasibility of SBD. According to the calculated throughput, the system can potentially allow up to 200 million DNA templates to be sequenced within 7 to 20 hours, producing 4.2 billion base sequences in a single run.
The cost to sequence a mammalian genome is estimated to be about 1,000 US dollars. The potential limitations of SBD and methods for its further improvement are discussed. With its high throughput and simplicity, SBD could result in a significant increase in speed and reduction in cost for large-scale genome re-sequencing.
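The throughput figures quoted above can be cross-checked with simple arithmetic. The short Python sketch below is illustrative only: it uses the numbers stated in the abstract (200 million templates, 4.2 billion bases per run) and assumes the run time is a range of 7 to 20 hours. The function and variable names are hypothetical and are not taken from the thesis.

```python
# Figures quoted in the abstract.
TEMPLATES_PER_RUN = 200_000_000   # DNA templates sequenced in one run
BASES_PER_RUN = 4_200_000_000     # total bases produced in one run

# Assumption: the run time is read as a range of 7 to 20 hours.
MIN_RUN_HOURS = 7
MAX_RUN_HOURS = 20


def implied_read_length(total_bases, templates):
    """Average number of bases read per template implied by the totals."""
    return total_bases / templates


def bases_per_hour(total_bases, hours):
    """Sequencing rate for a run of the given duration."""
    return total_bases / hours


read_length = implied_read_length(BASES_PER_RUN, TEMPLATES_PER_RUN)
fastest_rate = bases_per_hour(BASES_PER_RUN, MIN_RUN_HOURS)
slowest_rate = bases_per_hour(BASES_PER_RUN, MAX_RUN_HOURS)

print(f"Implied read length: {read_length:.0f} bases per template")
print(f"Rate for a 7-hour run: {fastest_rate:,.0f} bases per hour")
print(f"Rate for a 20-hour run: {slowest_rate:,.0f} bases per hour")
```

Under these assumptions, the quoted totals imply roughly 21 bases per template, which is close to the read length of about 20 bases suggested by the simulations.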