Center for Bioinformatics and Molecular Biostatistics
Validation in Genomics: CpG Island Methylation Revisited
- Author(s): Segal, Mark R
- et al.
In a recent article in PLoS Genetics, Bock et al., (2006) undertake an extensive computational epigenetics analysis of the ability of DNA sequence-derived features, capturing attributes such as tetramer frequencies, repeats and predicted structure, to predict the methylation status of CpG islands. Their suite of analyses appears highly rigorous with regard to accompanying validation procedures, employing stringent Bonferroni corrections, stratified cross-validation, and follow-up experimental verification. Here, however, we showcase concerns with the validation steps, in part ascribable to the genome scale of the investigation, that serve as a cautionary note and indicate the heightened need for careful selection of analytic and companion validation methods. A series of new analyses of the same CpG island methylation data helps illustrate these issues, not just for this particular study, but also analogous investigations involving high-dimensional predictors with complex between-feature dependencies.