Stem cell biology has the potential to solve many problems in disease modeling and personalized regenerative medicine. Experimental research involving stem cells is often quantified via non-invasive microscopy imaging to observe morphological behavior and determine the mechanisms underlying observed patterns. The image data collected from these experiments are very large and require extensive manual analysis by a skilled biologist to evaluate experimental outcomes. To this end, computer vision programs aimed at feature extraction and classification have become indispensable tools for accelerating and standardizing the analytical pipeline. Deep learning has further automated these programs, improving the accuracy and reliability of the resulting analyses.
There remain limitations on deep learning that are a consequence of the unique circumstances under which the data are collected: class imbalance, data scarcity, costly ground-truth annotation, contiguous class boundaries, and multi-label images, among others. The goal of this thesis is to overcome these limitations through the novel use of biological features that inform and guide model design and implementation. The major works comprising this dissertation involve deep neural network-based methods for stem cell microscopy image classification. Specifically, these methods address complex biological dataset problems: contrastive learning to improve discrimination of overlapping features, unsupervised generative adversarial networks to supplement limited datasets, and semi-supervised pseudo-labeling to overcome costly manual annotation. Paramount to the success of these projects is the use of domain knowledge in the form of biological relationships between image classes, temporal and dynamic features, and microscopy scale-space, which are exploited to improve classification performance and are shown to be indispensable in designing efficient and effective deep learning models.
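The semi-supervised pseudo-labeling idea referenced above can be sketched in a few lines. This is a minimal illustration, not the dissertation's actual implementation: a trained classifier's softmax outputs on unlabeled images are thresholded by confidence, and only the confident predictions are kept as pseudo-labels for further training. The function name, threshold value, and example probabilities are all hypothetical.

```python
import numpy as np

def pseudo_label(probs, threshold=0.9):
    """Keep unlabeled samples whose maximum predicted class probability
    exceeds `threshold`, assigning the argmax class as a pseudo-label;
    discard the ambiguous rest."""
    confidence = probs.max(axis=1)
    keep = confidence >= threshold
    labels = probs.argmax(axis=1)
    return np.flatnonzero(keep), labels[keep]

# Hypothetical softmax outputs for five unlabeled images over four
# morphology classes.
probs = np.array([
    [0.95, 0.02, 0.02, 0.01],  # confident -> pseudo-label class 0
    [0.40, 0.30, 0.20, 0.10],  # ambiguous -> discarded
    [0.01, 0.02, 0.05, 0.92],  # confident -> pseudo-label class 3
    [0.25, 0.25, 0.25, 0.25],  # uniform   -> discarded
    [0.03, 0.91, 0.03, 0.03],  # confident -> pseudo-label class 1
])
kept_indices, pseudo_labels = pseudo_label(probs, threshold=0.9)
```

In practice the threshold trades off pseudo-label quantity against quality: a higher cutoff admits fewer but cleaner labels into the next training round.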
These novel video-bioinformatics methods are applied to infer experimental endpoints from a dataset of microscopy images concerned with determining the effects of nicotine on a Huntington's disease iPSC model. Colony regions of interest are detected in raw time-lapse microscopy images and sorted into four morphological classes corresponding to major cellular phenotypes. The proportions of these classes within the images over time provide insight into the growth and developmental changes that occur during these experiments. Deep learning is a powerful tool that automates the analytical pipeline and reveals key features of input images that can be used to model stem cell differentiation.
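The per-frame class-proportion readout described above can be sketched as follows. This is a simplified illustration under assumed data: each time-lapse frame is reduced to a list of per-colony class labels (0 through 3 for the four morphologies), and the trajectory of class proportions over time is computed from label counts. The variable names and example labels are invented for the sketch.

```python
import numpy as np

# Hypothetical per-frame colony classifications: each inner list holds
# the morphology class id (0..3) assigned to every colony region of
# interest detected in that frame.
frames = [
    [0, 0, 1, 2],         # early time point: class 0 dominates
    [0, 1, 1, 2, 3],      # intermediate mixture
    [1, 1, 2, 3, 3, 3],   # later time point: class 3 dominates
]

def class_proportions(frame_labels, n_classes=4):
    """Fraction of colonies in each morphological class for one frame."""
    counts = np.bincount(np.asarray(frame_labels), minlength=n_classes)
    return counts / counts.sum()

# Rows are time points, columns are the four morphology classes;
# each row sums to 1 and can be plotted to track phenotype shifts.
trajectory = np.array([class_proportions(f) for f in frames])
```

Plotting each column of `trajectory` against time yields the growth and developmental curves from which experimental endpoints are inferred.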