Skip to main content
eScholarship
Open Access Publications from the University of California

UCLA

UCLA Previously Published Works bannerUCLA

scSampler: fast diversity-preserving subsampling of large-scale single-cell transcriptomic data.

Published Web Location

https://www.biorxiv.org/content/10.1101/2022.01.15.476407v1
No data is associated with this publication.
Creative Commons 'BY-NC-ND' version 4.0 license
Abstract

Summary

The number of cells measured in single-cell transcriptomic data has grown fast in recent years. For such large-scale data, subsampling is a powerful and often necessary tool for exploratory data analysis. However, the easiest random subsampling is not ideal from the perspective of preserving rare cell types. Therefore, diversity-preserving subsampling is required for fast exploration of cell types in a large-scale dataset. Here, we propose scSampler, an algorithm for fast diversity-preserving subsampling of single-cell transcriptomic data.

Availability and implementation

scSampler is implemented in Python and is published under the MIT source license. It can be installed by "pip install scsampler" and used with the Scanpy pipline. The code is available on GitHub: https://github.com/SONGDONGYUAN1994/scsampler. An R interface is available at: https://github.com/SONGDONGYUAN1994/rscsampler.

Supplementary information

Supplementary data are available at Bioinformatics online.

Many UC-authored scholarly publications are freely available on this site because of the UC's open access policies. Let us know how this access is important for you.

Item not freely available? Link broken?
Report a problem accessing this item