- Hao, Xiaoke;
- Luo, Huiyan;
- Krawczyk, Michal;
- Wei, Wei;
- Wang, Wenqiu;
- Wang, Juan;
- Flagg, Ken;
- Hou, Jiayi;
- Zhang, Heng;
- Yi, Shaohua;
- Jafari, Maryam;
- Lin, Danni;
- Chung, Christopher;
- Caughey, Bennett A;
- Li, Gen;
- Dhar, Debanjan;
- Shi, William;
- Zheng, Lianghong;
- Hou, Rui;
- Zhu, Jie;
- Zhao, Liang;
- Fu, Xin;
- Zhang, Edward;
- Zhang, Charlotte;
- Zhu, Jian-Kang;
- Karin, Michael;
- Xu, Rui-Hua;
- Zhang, Kang
The ability to identify a specific cancer using minimally invasive biopsy holds great promise for improving diagnosis, treatment selection, and prediction of prognosis in cancer. Using whole-genome methylation data from The Cancer Genome Atlas (TCGA) and machine learning methods, we evaluated the utility of DNA methylation for differentiating tumor tissue from normal tissue in four common cancers (breast, colon, liver, and lung). We identified cancer markers in a training cohort of 1,619 tumor samples and 173 matched adjacent normal tissue samples. We replicated our findings in a separate TCGA cohort of 791 tumor samples and 93 matched adjacent normal tissue samples, as well as in an independent Chinese cohort of 394 tumor samples and 324 matched adjacent normal tissue samples. The DNA methylation analysis distinguished cancer from normal tissue with more than 95% accuracy in all three cohorts, an accuracy comparable to that of typical diagnostic methods. The analysis also correctly identified 29 of 30 colorectal cancer metastases to the liver and 32 of 34 colorectal cancer metastases to the lung. We further found that methylation patterns can predict prognosis and survival. We correlated differential methylation of the CpG sites predictive of cancer with expression of the associated genes, which are known to be important in cancer biology; as expected, expression decreased as methylation increased. We verified the gene expression profiles in a mouse model of hepatocellular carcinoma. Taken together, these findings demonstrate the utility of methylation biomarkers for the molecular characterization of cancer, with implications for diagnosis and prognosis.
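The metastasis counts quoted above can be turned into proportions for comparison with the 95% accuracy figure. The short sketch below does only that arithmetic; the variable and function names are illustrative, and the pooled rate is a derived figure that the abstract itself does not report.

```python
# Counts quoted in the abstract: (correctly identified, total) per metastasis site.
metastasis_counts = {
    "liver": (29, 30),
    "lung": (32, 34),
}


def identification_rate(correct, total):
    """Return the fraction of samples that were correctly identified."""
    return correct / total


# Per-site rates.
rates = {
    site: identification_rate(correct, total)
    for site, (correct, total) in metastasis_counts.items()
}

# Pooled rate across both sites (derived here, not stated in the abstract).
pooled = identification_rate(
    sum(correct for correct, _ in metastasis_counts.values()),
    sum(total for _, total in metastasis_counts.values()),
)

for site, rate in rates.items():
    print(f"{site}: {rate:.1%}")  # liver: 96.7%, lung: 94.1%
print(f"pooled: {pooled:.1%}")    # pooled: 95.3%
```

With only 30 and 34 samples per site, each percentage rests on a small denominator, so a single additional misclassification would shift the rate by roughly three percentage points.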