Comprehensive mapping of mammalian transcriptomes identifies conserved genes associated with different cell differentiation states
Published Web Location: https://doi.org/10.1101/022608
Cell identity (or cell state) is established via gene expression programs, represented by "associated genes" with dynamic expression across cell identities. Here we integrate RNA-seq data from 40 tissues and cell types from human, chimpanzee, bonobo, and mouse to investigate the conservation and differentiation of cell states. We employ a statistical tool, "Transcriptome Overlap Measure" (TROM), to first identify cell-state-associated genes, both protein-coding and non-coding. Next, we use TROM to comprehensively map the cell states within each species and also between species based on the cell-state-associated genes. The within-species mapping measures which cell states are similar to each other, allowing us to construct a human cell differentiation tree that recovers both known and novel lineage relationships between cell states. Moreover, the between-species mapping summarizes the conservation of cell states across the four species. Based on these results, we identify conserved associated genes for different cell states and annotate their biological functions. Interestingly, we find that neural and testis tissues exhibit distinct evolutionary signatures in which neural tissues are much less enriched in conserved associated genes than testis. In addition, our mappings demonstrate that, besides protein-coding genes, long non-coding RNAs serve well as associated genes to indicate cell states. We further infer the biological functions of those non-coding associated genes based on their co-expressed protein-coding associated genes. Overall, we provide a catalog of conserved and species-specific associated genes that identifies candidates for downstream experimental studies of their roles in controlling cell identity.
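The abstract does not spell out how the overlap between two sets of associated genes is scored. As an illustration only, the sketch below assumes a one-sided hypergeometric overlap test with an optional Bonferroni-style correction, which is a common way to ask whether two gene sets share more members than expected by chance. The function names, the `population_size` and `n_tests` parameters, and the gene identifiers are all hypothetical, and this is not the authors' implementation.

```python
from math import comb, log10


def overlap_pvalue(set_a, set_b, population_size):
    """Upper-tail hypergeometric p-value: P(overlap >= observed).

    Assumes both sets are drawn from a shared background of
    `population_size` genes (an assumption of this sketch).
    """
    a, b = len(set_a), len(set_b)
    observed = len(set_a & set_b)
    total = comb(population_size, b)
    # Sum the probability of every overlap size at least as large as observed.
    tail = sum(
        comb(a, i) * comb(population_size - a, b - i)
        for i in range(observed, min(a, b) + 1)
    )
    return tail / total


def overlap_score(set_a, set_b, population_size, n_tests=1):
    """-log10 of the p-value after multiplying by n_tests (capped at 1)."""
    p = min(1.0, overlap_pvalue(set_a, set_b, population_size) * n_tests)
    return float("inf") if p == 0 else -log10(p)


# Hypothetical associated-gene sets for two cell states.
state_x = {"g1", "g2", "g3"}
state_y = {"g2", "g3", "g4"}
score = overlap_score(state_x, state_y, population_size=10)
```

Under these assumptions, a larger score means the two cell states share more associated genes than chance alone would predict, and computing the score for every pair of cell states yields a matrix that could serve as a within-species or between-species mapping.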