Computational Prediction of Transcriptional Influence
- Author(s): Cary, Michael Patrick
- Advisor(s): Kenyon, Cynthia J, et al.
Genome-wide expression measurements remain difficult to interpret. Two major challenges lie in drawing firm conclusions from hundreds or even thousands of significantly changing genes, and in deriving hypotheses from the data that merit further testing. Identifying the degree to which each gene regulator acts to increase or decrease the expression of each gene, a concept I refer to as transcriptional influence, would greatly increase our ability to make sense of these data.
This work describes a new method to calculate the transcriptional influence that each regulatory motif in a de novo predicted set has on each gene represented in a gene expression measurement platform, using only a compendium of data from the platform and genome sequence information. The method uses independent component analysis (ICA) first to generate genetic regulatory modules, and then to predict DNA sequence motifs (putative regulatory sites) that are enriched in these modules. In a final step, the relative membership of each gene in each gene module and the enrichment of each sequence motif in each module are used to predict the relative influence of each sequence motif on each gene.
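The final step can be illustrated with a minimal sketch. It assumes that the ICA stage has already produced a gene-by-module membership matrix and that the motif stage has produced a module-by-motif enrichment matrix, and it assumes that the two are combined as a simple weighted sum over modules. The abstract does not state the exact combination rule, so the function name `transcriptional_influence`, the weighted sum, and all toy values below are illustrative assumptions, not the method as implemented in this work.

```python
def transcriptional_influence(membership, enrichment):
    """Combine gene-by-module membership with module-by-motif enrichment.

    membership[g][k]: signed weight of gene g in module k (e.g. an ICA loading).
    enrichment[k][m]: enrichment score of motif m in module k.
    Returns influence[g][m] = sum over k of membership[g][k] * enrichment[k][m].
    The weighted sum is an assumed combination rule, chosen for illustration.
    """
    n_modules = len(enrichment)
    n_motifs = len(enrichment[0])
    influence = []
    for gene_row in membership:
        if len(gene_row) != n_modules:
            raise ValueError("each gene needs one weight per module")
        influence.append([
            sum(gene_row[k] * enrichment[k][m] for k in range(n_modules))
            for m in range(n_motifs)
        ])
    return influence


# Toy data: 3 genes, 2 modules, 2 motifs (all values invented for illustration).
membership = [
    [2.0, 0.0],   # gene A: strongly weighted in module 0
    [0.0, -1.0],  # gene B: negatively weighted in module 1
    [1.0, 1.0],   # gene C: weighted in both modules
]
enrichment = [
    [3.0, 0.0],   # module 0: enriched for motif 0
    [0.0, 4.0],   # module 1: enriched for motif 1
]
influence = transcriptional_influence(membership, enrichment)
for gene, row in zip("ABC", influence):
    print(gene, row)
```

Under this assumed rule, a gene with a negative weight in a module receives a negative influence from the motifs enriched in that module, which is one way to represent a motif that acts to decrease expression.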
The power of these predictions is demonstrated in the analysis of microarray data for several C. elegans variants, including isp-1 and hif-1 mutants. isp-1 mutations extend lifespan through the HIF-1 transcription factor, yet there is no meaningful overlap between the significantly changing genes in the hif-1 and isp-1 microarray datasets. In contrast, the method described here reveals extensive similarity between the two datasets at the level of transcriptional influence. Moreover, a regulatory motif predicted to have a strong influence in both datasets matches the canonical HIF-1 binding site.