eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Computational Methods for the Analysis of Genomic and Proteomic Sequences

Abstract

The rapid generation of biological sequences, such as nucleotide and amino acid sequences, has revolutionized the field of molecular biology. To name two applications: DNA sequences generated by RNA sequencing (RNA-seq) technology support gene expression analysis, and protein sequences, which represent primary structure, are used to predict protein-protein interactions. Moreover, the vast amount of sequence data produced by high-throughput technologies scales data analysis up to the omics level. Consequently, developing novel computational methods and tailoring existing algorithms are essential for extracting relevant and critical knowledge from sequence data.

In this dissertation, we introduce several computational frameworks that use genomic sequences to quantify gene expression and proteomic sequences to characterize protein-protein interactions. The methodologies in these frameworks span several research areas, including feature extraction from string data, string matching for DNA sequences, statistical inference for expression quantification, and sequence-pair modeling through deep learning. As a result, these approaches not only tackle specific challenges in the applications mentioned above but also have the potential to address issues in other sequencing applications.
