Efficient strategies for leave-one-out cross validation for genomic best linear unbiased prediction
Published Web Location: https://doi.org/10.1186/s40104-017-0164-6
Background: A random multiple-regression model that simultaneously fits all allele substitution effects for additive markers or haplotypes as uncorrelated random effects was proposed for Best Linear Unbiased Prediction using whole-genome data. Leave-one-out cross validation can be used to quantify the predictive ability of a statistical model.
Methods: Naive application of leave-one-out cross validation is computationally intensive because the training and validation analyses need to be repeated n times, once for each observation. Efficient leave-one-out cross validation strategies are presented here, requiring little more effort than a single analysis.
Results: The efficient leave-one-out cross validation strategies are 786 times faster than the naive application for a simulated dataset with 1,000 observations and 10,000 markers, and 99 times faster with 1,000 observations and 100 markers. These efficiency gains relative to the naive approach using the same model grow as the number of observations increases.
Conclusions: Efficient leave-one-out cross validation strategies are presented here, requiring little more effort than a single analysis.
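
The contrast between the naive and efficient approaches can be illustrated with a minimal sketch. The abstract does not spell out the paper's exact strategies, so the code below is not the authors' implementation. It uses a standard identity for ridge-type predictors: with a fixed shrinkage parameter and hat matrix H = X(X'X + λI)⁻¹X', the leave-one-out residual for observation i equals the full-data residual divided by (1 − H_ii). The sketch assumes a marker-effects model with no fixed effects and a known λ; the function names, the simulated data, and the value of λ are all hypothetical.

```python
import numpy as np


def naive_loo_residuals(X, y, lam):
    """Refit the ridge model n times, leaving out one observation each time."""
    n, p = X.shape
    residuals = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i              # mask that drops observation i
        X_train, y_train = X[keep], y[keep]
        # Ridge solution on the n - 1 training observations
        beta = np.linalg.solve(X_train.T @ X_train + lam * np.eye(p),
                               X_train.T @ y_train)
        residuals[i] = y[i] - X[i] @ beta     # prediction error for held-out i
    return residuals


def efficient_loo_residuals(X, y, lam):
    """Get all n leave-one-out residuals from a single full-data analysis."""
    n, p = X.shape
    # A = (X'X + lam I)^{-1} X', so that H = X A is the hat matrix
    A = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    H = X @ A
    e = y - H @ y                             # ordinary full-data residuals
    return e / (1.0 - np.diag(H))             # rescale by 1 - H_ii


# Small simulated example (sizes and lam are arbitrary assumptions)
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))             # 50 observations, 20 markers
y = X @ rng.standard_normal(20) + rng.standard_normal(50)
lam = 5.0

diff = np.max(np.abs(naive_loo_residuals(X, y, lam)
                     - efficient_loo_residuals(X, y, lam)))
print("largest absolute difference between the two approaches:", diff)
```

The naive function solves n separate systems, while the efficient one solves a single system and reuses the diagonal of the hat matrix, which is why the relative advantage grows with the number of observations. When markers greatly outnumber observations, an equivalent formulation on the n × n scale would be cheaper than the p × p solve used here.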