Evolution of the gene regulatory network controlling biofilm formation in Candida species
- Gunasekaran, Deepika
- Advisor(s): Nobile, Clarissa J;
- Ardell, David H
Abstract
Gene expression is one of the most fundamental processes in a cell, allowing genetic information to be processed into functional products. Regulation of gene expression is determined by the underlying gene regulatory networks (GRNs). We can think of a GRN as a directed network in which hub nodes denote master regulators, which are DNA-binding proteins, and target nodes denote downstream genes whose expression is modulated. The directed edges denote the strength of the interaction between the master regulators and their downstream target genes. High-throughput genomic and epigenomic data are used to construct GRNs and to estimate the robustness of these networks. In this study, we used the GRN underlying the formation of complex multicellular structures called biofilms in the yeast species Candida albicans to understand the mechanisms of GRN divergence between species and variation within species. Biofilm formation has evolved multiple times in the fungal tree of life, and the ability to form biofilms varies widely across species and strains. To study the effects of these variations on components of the GRN, we developed three standalone bioinformatic tools. The first tool was developed to infer the structure of GRNs across Candida species by identifying and annotating binding loci of master regulators. The second was used to identify mutations in the C. albicans population and the effect of these mutations on the network components. The third tool was used to estimate the evolutionary forces acting on the components of the biofilm GRN. Using these tools, we inferred the sources of genetic variation in the biofilm regulatory network components. We found that the motif preferences of the DNA-binding proteins are conserved across large evolutionary distances, but their interactions with target genes are highly divergent. This divergence is driven in part by mutations resulting in gains and losses of genomic regions where the DNA-binding proteins preferentially bind.
Furthermore, mutations accumulate in segments of DNA-binding proteins that are required to interact with other hub proteins in the network. This affects the overall structure of the network both within and between species.
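The description of a GRN as a directed, weighted network can be illustrated with a minimal sketch. It is not one of the three tools described in the abstract. The regulator names (`TF_A`, `TF_B`), gene names, edge weights, and the helper functions `edges`, `hubs`, and `compare` are all hypothetical and chosen only for illustration. The sketch stores each network as a mapping from regulator to weighted targets, then lists the regulator-target edges that are conserved, lost, or gained between two species.

```python
from typing import Dict, Set, Tuple

# A GRN as a directed, weighted network:
# regulator -> {target: interaction strength}.
# All names and weights below are invented for illustration.
GRN = Dict[str, Dict[str, float]]
Edge = Tuple[str, str]


def edges(grn: GRN) -> Set[Edge]:
    """Return the set of directed (regulator, target) edges."""
    return {(reg, tgt) for reg, targets in grn.items() for tgt in targets}


def hubs(grn: GRN, min_targets: int = 2) -> Set[str]:
    """Return regulators with at least `min_targets` outgoing edges."""
    return {reg for reg, targets in grn.items() if len(targets) >= min_targets}


def compare(grn_a: GRN, grn_b: GRN) -> Dict[str, Set[Edge]]:
    """Classify edges as conserved, lost in B, or gained in B relative to A."""
    ea, eb = edges(grn_a), edges(grn_b)
    return {"conserved": ea & eb, "lost_in_b": ea - eb, "gained_in_b": eb - ea}


# Two toy networks standing in for the same regulators in two species.
species_a: GRN = {
    "TF_A": {"gene_1": 0.9, "gene_2": 0.4, "TF_B": 0.7},
    "TF_B": {"gene_2": 0.6},
}
species_b: GRN = {
    "TF_A": {"gene_1": 0.8, "gene_3": 0.5},
    "TF_B": {"gene_2": 0.6, "gene_3": 0.3},
}

diff = compare(species_a, species_b)
for label in ("conserved", "lost_in_b", "gained_in_b"):
    print(label, sorted(diff[label]))
```

In this toy case the regulators are shared between the two networks while several of their target edges differ, which mirrors the kind of rewiring the abstract reports without reproducing any of its data.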