scImpute: accurate and robust imputation for single cell RNA-seq data
Published Web Location: https://www.nature.com/articles/s41467-018-03405-7
Emerging single-cell RNA sequencing (scRNA-seq) technologies enable the investigation of transcriptomic landscapes at single-cell resolution. The analysis of scRNA-seq data is complicated by an excess of zero or near-zero counts, the so-called dropouts, which arise from the low amounts of mRNA sequenced within individual cells. Downstream analysis of scRNA-seq data would be severely biased if dropout events were not properly corrected. We introduce scImpute, a statistical method to accurately and robustly impute the dropout values in scRNA-seq data. scImpute automatically identifies gene expression values affected by dropout events and performs imputation only on these values, without introducing new bias to the rest of the data. scImpute also detects outlier or rare cells and excludes them from imputation. Evaluation on both simulated data and real scRNA-seq data from mouse embryos, mouse brain cells, human blood cells, and human embryonic stem cells suggests that scImpute is an effective tool for recovering transcriptome dynamics masked by dropout events. scImpute is shown to correct false zero counts, enhance the clustering of cell populations and subpopulations, improve the accuracy of differential expression analysis, and aid the study of gene expression dynamics.
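The idea of imputing only the values flagged as dropouts, while leaving all other values untouched, can be illustrated with a toy sketch. This is not scImpute's algorithm: the abstract does not describe how dropouts are identified or how replacement values are computed, so the sketch assumes the dropout mask is supplied by the caller and fills flagged entries with a simple per-gene mean over unflagged cells. The function name `impute_flagged` and its parameters are hypothetical.

```python
import numpy as np


def impute_flagged(expr, dropout_mask):
    """Toy selective imputation (illustrative assumption, not scImpute itself).

    expr         : genes x cells matrix of expression values
    dropout_mask : boolean matrix of the same shape; True marks a suspected dropout
    Returns a copy of expr in which only flagged entries are replaced.
    """
    expr = np.asarray(expr, dtype=float)
    mask = np.asarray(dropout_mask, dtype=bool)
    if expr.shape != mask.shape:
        raise ValueError("expr and dropout_mask must have the same shape")

    trusted = ~mask
    counts = trusted.sum(axis=1)                     # trusted cells per gene
    sums = np.where(trusted, expr, 0.0).sum(axis=1)  # sum over trusted cells

    # Genes with no trusted value cannot be imputed and are left as they are.
    has_trusted = counts > 0
    gene_means = np.zeros(expr.shape[0])
    gene_means[has_trusted] = sums[has_trusted] / counts[has_trusted]

    imputed = expr.copy()
    replace = mask & has_trusted[:, None]
    fill = np.broadcast_to(gene_means[:, None], expr.shape)
    imputed[replace] = fill[replace]                 # unflagged entries never change
    return imputed


if __name__ == "__main__":
    expr = [[0.0, 4.0, 6.0],
            [2.0, 0.0, 0.0]]
    # Hypothetical mask: the zero at [1, 1] is treated as a true zero and is kept.
    mask = [[True, False, False],
            [False, False, True]]
    print(impute_flagged(expr, mask))
```

Taking the mask as an input keeps the two steps separate: any method of flagging dropouts can be swapped in without changing the rule that unflagged values, including zeros judged to be genuine, pass through unmodified.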