eScholarship
Open Access Publications from the University of California

UC Santa Cruz

UC Santa Cruz Electronic Theses and Dissertations

Accurate genome analysis with nanopore sequencing using deep neural networks.

Creative Commons 'BY' version 4.0 license
Abstract

Nanopore sequencing, commercialized by Oxford Nanopore Technologies (ONT), is a high-throughput genome sequencing platform. Unlike traditional sequencing-by-synthesis methods, nanopore sequencing measures changes in electrical current to infer the nucleotide sequence passing through the pore. This signal-to-base conversion introduces distinctive error patterns, which makes it difficult to design methods that rely on hand-crafted features. Deep learning uses multiple layers to progressively learn complex patterns in the input data, making it well suited to genome analysis. In this dissertation, I present methods based on deep neural networks that I developed to improve genome inference with nanopore sequencing. First, I introduce PEPPER-Margin-DeepVariant, a haplotype-aware variant calling pipeline that produces state-of-the-art results for nanopore long reads. Next, I demonstrate a pipeline that performs de novo assembly of eleven human genomes in nine days. Then, I show how these methods were applied to validate and correct errors in the first complete human genome assembly. Finally, I demonstrate the utility of PEPPER-Margin-DeepVariant, paired with highly multiplexed nanopore sequencing, for rapidly identifying disease-causing variants.
