eScholarship
Open Access Publications from the University of California

UC San Francisco Electronic Theses and Dissertations

Metagenomic Protein Function Prediction using the SFLD and Thresholded Sequence Similarity Networks

Abstract

The Structure-Function Linkage Database (SFLD) is a database of hierarchical enzyme classifications that relate specific sequence-structure features to specific chemical properties. It provides a collection of tools and data for investigating sequence-structure-function relationships and for hypothesizing function. Currently, users can query one or more “unknown” protein sequences against the database using Hidden Markov Models or BLAST, and can compare, classify, and annotate them against existing curated enzyme superfamilies; a superfamily is the largest grouping of proteins for which common ancestry can be inferred. Here we present a working pipeline that allows users to putatively assign functions to sequences derived from metagenomic studies and to visualize relationships between these sequences and existing enzyme superfamilies using thresholded sequence similarity networks.
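
To make the idea of a thresholded sequence similarity network concrete, here is a minimal sketch in Python. It is not the pipeline described in the thesis. It assumes that pairwise similarity is given as BLAST-style E-values and that an edge is kept only when the E-value is at or below a chosen cutoff; the abstract does not say which score is thresholded. The sequence names, E-values, and function names are hypothetical and exist only for illustration.

```python
def threshold_network(pairs, max_evalue):
    """Build an undirected graph, keeping edges with E-value <= max_evalue.

    `pairs` is an iterable of (seq_a, seq_b, evalue) tuples. Using an
    E-value cutoff as the threshold is an assumption of this sketch.
    """
    graph = {}
    for a, b, evalue in pairs:
        # Register both sequences so that isolated nodes still appear.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if a != b and evalue <= max_evalue:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Return the clusters of the network as a list of sets of node names."""
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(component)
    return clusters


# Hypothetical pairwise hits: (sequence A, sequence B, E-value).
# "meta_*" stand for metagenomic sequences, "known_*" for curated ones.
hits = [
    ("meta_1", "known_A", 1e-40),
    ("known_A", "known_B", 1e-60),
    ("meta_2", "known_B", 1e-5),
    ("meta_2", "meta_3", 1e-30),
]

strict = connected_components(threshold_network(hits, max_evalue=1e-20))
loose = connected_components(threshold_network(hits, max_evalue=1e-3))
```

Under the stricter cutoff the weak `meta_2`/`known_B` hit is dropped, so `meta_2` and `meta_3` form their own cluster, separate from the curated sequences. Under the looser cutoff all five sequences fall into one cluster. Varying the threshold in this way changes how finely the network separates groups of sequences.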
