The ability to create pluripotent stem cells (PSCs) from any tissue by a process called reprogramming (yielding induced pluripotent stem cells, iPSCs) has ushered in an era of personalized medicine. However, reprogramming protocols are not trivial and are nearly always inefficient, often with efficiencies below 0.1%. Similarly low efficiencies occur in many forward differentiation protocols, where it also remains a major question which cell types are produced and how they compare to the cells present in vivo. In this work, we applied single-cell RNA sequencing to iPSC reprogramming from different somatic cells to define the transcriptional changes in the process and the roles of the reprogramming factors and somatic transcription factors in the reorganization of cell identity, revealing the critical role of intermediate ectopic gene expression. Changes to the reprogramming transcription factor complement result in similar intermediates while skewing the number of cells that reach particular cell stages. Intriguingly, distinct transcription factors induced unique, novel, transient ectopic gene networks, the character of which influenced the efficiency of reprogramming. This work thoroughly describes the processes of cell-fate decision-making and uncovers the nature of the ectopic gene expression state as a gatekeeper of reprogramming progression.
Building on this, we also establish a novel computational method to deconvolve the epigenetic control of heterogeneous processes, such as reprogramming and differentiation, thereby uncovering mechanisms underlying cell type specification and transitions. Using techniques from machine learning, we train models to learn the relationship between the transcriptome and the epigenome from an atlas of homogeneous cell populations, then apply these models to single-cell populations. Our results illustrate accurate deconvolution of a human fetal brain organoid, for which we predicted the epigenomic landscape of H3K27ac, a histone modification that is nearly impossible to profile at the single-cell level.
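The train-on-atlas, apply-to-single-cells idea described above can be illustrated with a minimal sketch. This is not the model used in this work: it assumes a simple linear (ridge) regression as a stand-in for the machine learning models, and all data are synthetic. The names `fit_ridge`, `predict`, `bulk_expr`, `bulk_h3k27ac`, and `cell_expr` are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge regression: W = (X^T X + lam * I)^-1 X^T Y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ Y)

def predict(X, W):
    """Predict epigenomic signal from expression with a fitted weight matrix."""
    return X @ W

rng = np.random.default_rng(0)
n_bulk, n_cells, n_genes, n_peaks = 200, 50, 20, 5

# Synthetic stand-in for an atlas of homogeneous populations:
# rows are samples, columns are genes (expression) or peaks (H3K27ac signal).
# The linear ground truth W_true is an assumption made for this toy example.
W_true = rng.normal(size=(n_genes, n_peaks))
bulk_expr = rng.normal(size=(n_bulk, n_genes))
bulk_h3k27ac = bulk_expr @ W_true + 0.1 * rng.normal(size=(n_bulk, n_peaks))

# Step 1: learn the transcriptome-to-epigenome mapping from the atlas.
W_hat = fit_ridge(bulk_expr, bulk_h3k27ac, lam=1.0)

# Step 2: apply the learned mapping to (synthetic) single-cell expression profiles.
cell_expr = rng.normal(size=(n_cells, n_genes))
cell_h3k27ac_pred = predict(cell_expr, W_hat)

print(cell_h3k27ac_pred.shape)  # one predicted H3K27ac profile per cell
```

In this toy setting the predictions can be checked against the known linear ground truth; with real data, validation would instead rely on held-out populations with measured H3K27ac profiles.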
Together, my graduate work has focused on developing novel computational methods and analysis techniques that leverage single-cell genomics to study gene regulation, yielding insight into the mechanisms underlying cell fate change as well as into how to effectively derive chromatin state data for individual cell types.