eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Decomposition of Synechococcus elongatus transcriptomic data to reveal its regulatory modules through Independent Component Analysis

Abstract

Synechococcus elongatus is a tractable model cyanobacterium for circadian studies and a platform for bioproduction. The organism’s adaptive response to changing conditions in aquatic environments is orchestrated through its transcriptional regulatory network (TRN). Although constituent parts of the S. elongatus TRN have been characterized previously, a system-level characterization and an analysis of the interactions between major transcriptional regulators have yet to be established. Here, we demonstrate the utility of unsupervised machine learning to compartmentalize and describe the characteristics of the different regulatory modules of the model strain S. elongatus PCC 7942, enabling a complete reconstruction of its TRN in response to environmental stresses and changes in intracellular states. By applying Independent Component Analysis (ICA) to a collection of 317 transcriptomic samples, we obtained 51 independently modulated gene sets called “iModulons”, each of which explains a portion of the variance in the organism’s transcriptional response. iModulons serve as a knowledge tool to elucidate the transcriptional function and activation dynamics of previously undefined regulons while also describing the interactions between transcription factors in the TRN. Our data-driven analysis also provides, for the first time, a complete TRN reconstruction for S. elongatus, with valuable functional context to expand the annotation of many hypothetical genes captured in our iModulon structure. This transcriptome-wide analysis of the S. elongatus TRN informs future research on possible genetic perturbations for manipulating its transcriptional regulation and optimizing the engineering of this organism. A knowledge-driven database of all published high-quality RNA-seq data for S. elongatus to date is now available at iModulonDB.org.
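
For readers unfamiliar with ICA, the decomposition described in the abstract can be sketched as a matrix factorization. The notation below follows the convention commonly used for iModulon analyses and is an assumption here, since the abstract does not define symbols: X is the expression matrix, M holds the gene weights of each component, A holds the component activities per sample, and n stands for the number of genes, which the abstract does not state.

```latex
% Sketch of the ICA factorization, under the assumed notation above.
% Dimensions use the counts given in the abstract: 317 samples, 51 components.
\[
  X \approx M A, \qquad
  X \in \mathbb{R}^{n \times 317}, \quad
  M \in \mathbb{R}^{n \times 51}, \quad
  A \in \mathbb{R}^{51 \times 317}
\]
% Element-wise: the expression of gene i in sample j is approximated by
% the sum over components k of (weight of gene i in component k)
% times (activity of component k in sample j).
\[
  x_{ij} \approx \sum_{k=1}^{51} m_{ik}\, a_{kj}
\]
```

Under this reading, each column of M with a small set of strongly weighted genes would correspond to one iModulon, and the matching row of A would describe how strongly that gene set is activated in each of the 317 samples.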
