eScholarship
Open Access Publications from the University of California

UC Riverside

UC Riverside Electronic Theses and Dissertations

Improving Multiple Sequence Alignments by Revising Sequence Families with Alignment Scoring Approaches

Abstract

Characterizing the functional, structural, and evolutionary relationships of biological sequences is an important task in modern genomics and computational biology. Most of these applications involve assembling sequence families by similarity searching, building multiple sequence alignments (MSAs) from them, and running downstream phylogenetic analyses. MSAs play a central role in this modeling workflow, so their quality is critical to its success. In this study I present a sequence family revision approach that improves MSA quality by automatically removing false positive candidates from sequence families and then recomputing the MSA. The approach can combine sequence-level scores from two MSA scoring methods, norMD and GUIDANCE, and it automatically selects an optimized score threshold for removing sequences from MSAs. To test the performance of this method, I developed several automated procedures that add controlled numbers of randomly selected nonmember sequences to curated MSAs from the CDD database. I then performed Receiver Operating Characteristic (ROC) analysis on the classification results, incorporating automatic threshold selection approaches. Surprisingly, the sequence-level scores provided by the two MSA scoring methods were less successful than a simple all-against-all BLAST-based pairwise alignment scoring approach. However, I was able to improve one of the MSA scoring methods by extending it with a dynamic threshold selection approach. The extended method outperformed the BLAST-based method in detecting false positives in sequence families.
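
The evaluation described in the abstract (score each sequence, pick a cutoff, judge the resulting classification by ROC analysis) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the function names, the convention that a low score marks a likely nonmember, and the use of Youden's J statistic (TPR minus FPR) as the threshold selection rule. The abstract does not say which rule the author used, and the scores in the usage example are made up rather than norMD, GUIDANCE, or BLAST output.

```python
from typing import List, Tuple

RocPoint = Tuple[float, float, float]  # (threshold, TPR, FPR)


def roc_curve(scores: List[float], is_nonmember: List[bool]) -> List[RocPoint]:
    """Compute one ROC point per distinct score used as a cutoff.

    Assumption: a sequence is flagged as a false positive candidate
    when its score is <= the cutoff (low score = poor fit to the family).
    The positive class is "spiked-in nonmember".
    """
    n_pos = sum(is_nonmember)
    n_neg = len(is_nonmember) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one member and one nonmember")
    points = []
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, is_nonmember) if y and s <= cutoff)
        fp = sum(1 for s, y in zip(scores, is_nonmember) if not y and s <= cutoff)
        points.append((cutoff, tp / n_pos, fp / n_neg))
    return points


def select_threshold(points: List[RocPoint]) -> float:
    """Pick the cutoff maximizing Youden's J = TPR - FPR (illustrative rule)."""
    return max(points, key=lambda p: p[1] - p[2])[0]


def auc(points: List[RocPoint]) -> float:
    """Trapezoidal area under the ROC curve, starting from (FPR=0, TPR=0)."""
    curve = [(0.0, 0.0)] + sorted((fpr, tpr) for _, tpr, fpr in points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Hypothetical per-sequence scores; True marks a spiked-in nonmember.
    scores = [0.9, 0.8, 0.85, 0.2, 0.3, 0.7]
    labels = [False, False, False, True, True, False]
    pts = roc_curve(scores, labels)
    print("selected cutoff:", select_threshold(pts))
    print("AUC:", auc(pts))
```

In a setup like the one the abstract describes, the labels would come from knowing which sequences were deliberately added to a curated alignment, and sequences scoring at or below the selected cutoff would be removed before the MSA is recomputed.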
