Inter- and intracellular dynamics of DNA methylation
- Author(s): Shoemaker, Robert Field
- et al.
CpG methylation in the human genome plays an important role in the transcriptional regulatory network. I explored the biology of this epigenetic mark and developed tools that aid methylation-related analyses. Although most of the human genome is methylated, regions involved in regulatory activity tend to be unmethylated. The methylation states of CpGs influence DNA-protein interactions. Because DNA methylation influences transcriptional regulation, DNA methylation signatures discriminate among phenotypes, such as cancerous versus healthy cells or differentiated versus pluripotent cells. To better study DNA methylation, I produced a new probe set, Cpg220k, used in targeted bisulfite sequencing by Kun Zhang's lab. This involved surveying the literature for experiments that found regions of variable DNA methylation and using other experimental data, such as DNase I hypersensitivity data, to identify candidate regions of likely differential methylation. I produced padlock probes that maximize the base-pair coverage of these regions while minimizing the cost of the experiment. Using targeted bisulfite sequencing data, I developed analytical tools that allow inter- and intracellular analysis of CpG methylation. I found that methylation signatures accurately classify embryonic stem (ES), fibroblast, and induced pluripotent stem (iPS) cell lines. Gene expression and methylation are negatively correlated at the transcription start site (TSS) but only weakly correlated farther upstream and downstream of it. Further exploring fuzzily methylated CpGs (methylation frequencies between 0.25 and 0.75), I found regions that exhibited allele-specific methylation (ASM). Many of the ASM regions involved a SNP overlapping a CpG site, thereby creating sequence-dependent ASM. Other ASM regions were likely products of biological regulatory mechanisms. I then created a pipeline that uses the tools I developed for my methylation analyses.
I integrated the pipeline into a website (http://wanglab.ucsd.edu/star/) so that other scientists can conveniently run these analyses on their own methylation data. Because methylation experiments are growing in scale, I built the pipeline to handle massive data sets that can exceed several billion reads. I made the tools that were critical to my research publicly available in the hope of furthering the scientific community's understanding of DNA methylation.
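As a small illustration of the "fuzzily methylated" criterion mentioned above, the following Python sketch flags CpG sites whose methylation frequency falls between 0.25 and 0.75. It is not the pipeline's actual code: the function name `fuzzy_cpgs`, the input layout (per-site methylated and total read counts), the inclusive treatment of the bounds, and the `min_coverage` filter are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def fuzzy_cpgs(
    counts: Dict[str, Tuple[int, int]],
    low: float = 0.25,
    high: float = 0.75,
    min_coverage: int = 10,  # hypothetical read-depth filter, not from the abstract
) -> List[str]:
    """Return the CpG sites whose methylation frequency lies in [low, high].

    `counts` maps a site label to (methylated_reads, total_reads).
    Bounds are treated as inclusive, which is an assumption.
    """
    fuzzy = []
    for site, (methylated, total) in counts.items():
        if total < min_coverage:
            continue  # too few reads for a reliable frequency estimate
        frequency = methylated / total
        if low <= frequency <= high:
            fuzzy.append(site)
    return fuzzy


# Made-up example counts: (methylated reads, total reads) per CpG site
example = {
    "chr1:100": (5, 20),   # frequency 0.25, on the lower bound
    "chr1:200": (19, 20),  # frequency 0.95, mostly methylated
    "chr1:300": (10, 20),  # frequency 0.50, intermediate
    "chr1:400": (2, 4),    # frequency 0.50, but coverage below the filter
}
print(fuzzy_cpgs(example))
```

Sites flagged this way would be candidates for a follow-up check, such as testing whether reads carrying different alleles of a nearby SNP differ in methylation.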