eScholarship
Open Access Publications from the University of California

UC San Diego

Aro: A machine learning approach to identifying single molecules and estimating classification error in fluorescence microscopy images

Published Web Location

https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-015-0534-z
No data is associated with this publication.
Abstract

© 2015 Wu and Rifkin; licensee BioMed Central.

Background: Recent techniques for tagging and visualizing single molecules in fixed or living organisms and cell lines have been revolutionizing our understanding of the spatial and temporal dynamics of fundamental biological processes. However, fluorescence microscopy images are often noisy, and it can be difficult to distinguish a fluorescently labeled single molecule from background speckle.

Results: We present a computational pipeline to distinguish the true signal of fluorescently labeled molecules from background fluorescence and noise. We test our technique using the challenging case of wide-field, epifluorescence microscope image stacks from single molecule fluorescence in situ experiments on nematode embryos where there can be substantial out-of-focus light and structured noise. The software recognizes and classifies individual mRNA spots by measuring several features of local intensity maxima and classifying them with a supervised random forest classifier. A key innovation of this software is that, by estimating the probability that each local maximum is a true spot in a statistically principled way, it makes it possible to estimate the error introduced by image classification. This can be used to assess the quality of the data and to estimate a confidence interval for the molecule count estimate, all of which are important for quantitative interpretations of the results of single-molecule experiments.

Conclusions: The software classifies spots in these images well, with >95% AUROC on realistic artificial data and outperforms other commonly used techniques on challenging real data. Its interval estimates provide a unique measure of the quality of an image and confidence in the classification.
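
The idea of turning per-spot probabilities into a count estimate with an interval can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes each candidate local maximum already has a probability of being a true spot (for example, from a classifier) and that candidates are independent, so the total count follows a Poisson-binomial distribution. The function names `count_distribution` and `count_interval` are hypothetical and chosen only for this illustration.

```python
from typing import List, Tuple


def count_distribution(probs: List[float]) -> List[float]:
    """Exact distribution of the number of true spots.

    Assumption (not from the paper): candidates are independent, so the
    count is Poisson-binomial. dist[k] is P(exactly k true spots).
    """
    dist = [1.0]  # with no candidates, the count is 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # this candidate is not a spot
            nxt[k + 1] += mass * p       # this candidate is a spot
        dist = nxt
    return dist


def count_interval(probs: List[float], level: float = 0.95) -> Tuple[float, int, int]:
    """Return (expected count, lower bound, upper bound).

    The bounds are the equal-tailed quantiles of the count distribution
    at the requested level.
    """
    dist = count_distribution(probs)
    expected = sum(probs)
    tail = (1.0 - level) / 2.0

    lower, upper = 0, len(dist) - 1
    cumulative = 0.0
    found_lower = False
    for k, mass in enumerate(dist):
        cumulative += mass
        if not found_lower and cumulative >= tail:
            lower = k
            found_lower = True
        if cumulative >= 1.0 - tail:
            upper = k
            break
    return expected, lower, upper


if __name__ == "__main__":
    # Hypothetical probabilities for five candidate local maxima.
    example_probs = [0.98, 0.95, 0.60, 0.10, 0.02]
    print(count_interval(example_probs))
```

Under these assumptions, an image whose candidates all have probabilities near 0 or 1 yields a narrow interval, while many ambiguous candidates widen it, which mirrors how the abstract describes interval estimates as a measure of image quality.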
