Machine Learning in Label-free Phenotypic Screening
- Author(s): Chen, Lifan Claire
- Advisor(s): Jalali, Bahram, et al.
High-throughput multivariate sensing is essential for high-content cellular phenotypic screening, where large volumes of data are often required for the detection and accurate classification of rare events. Here, we introduce time-stretch quantitative phase imaging (TS-QPI), a high-throughput, label-free imaging flow cytometer developed for big-data acquisition and analysis in phenotypic screening. TS-QPI captures quantitative optical phase and intensity images simultaneously, enabling high-content cell analysis, cancer diagnostics, personalized genomics, and drug development.
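One way a quantitative phase image translates into a biophysical readout is through cell dry mass: in quantitative phase imaging, the phase shift integrated over the cell area is proportional to the cell's non-aqueous content through the specific refractive increment. The sketch below illustrates this standard relation; the wavelength, pixel size, refractive increment, and phase values are illustrative placeholders, not parameters of the TS-QPI system.

```python
# Hedged sketch: converting a measured phase map into cell dry mass via
# the standard QPI relation m = (lambda / (2*pi*alpha)) * sum(phase) * pixel_area.
# All numeric values are illustrative; alpha ~ 0.2 mL/g is a typical
# specific refractive increment for protein (1 mL/g == 1 um^3/pg).
import math

def dry_mass_pg(phase_map, pixel_area_um2, wavelength_um=0.8,
                alpha_um3_per_pg=0.2):
    """Estimate dry mass (picograms) from a 2-D phase map (radians)."""
    total_phase = sum(sum(row) for row in phase_map)  # integrate phase
    return (wavelength_um / (2 * math.pi * alpha_um3_per_pg)) \
        * total_phase * pixel_area_um2
```

In practice, the phase map would come from the interferometrically recovered optical phase, and the summation would run over the segmented cell region only.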
We further developed a complete machine learning pipeline that performs optical phase measurement, image processing, feature extraction, and classification. Multiple biophysical features, such as morphological parameters, optical loss characteristics, and protein concentration, are measured on individual cells. These biophysical measurements form a hyperdimensional feature space in which supervised learning is performed for cell classification. The technology is in clinical testing for blood screening and circulating tumor cell detection, and is also being applied to the study of lipid-accumulating algal strains for biofuel production. By integrating machine learning with high-throughput quantitative imaging, this system achieves record-high accuracy in label-free cellular phenotypic screening and opens a new path to data-driven diagnosis.
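The classification step above can be sketched as follows. A nearest-centroid rule is used here as a deliberately simple stand-in for the supervised learner; the feature vectors (cell size, optical loss, protein concentration) and the class labels are hypothetical examples, not measured TS-QPI data.

```python
# Hedged sketch of supervised classification in a biophysical feature
# space: each cell is a vector of features, and a new cell is assigned
# to the class whose training centroid is nearest in Euclidean distance.
import math

def centroid(samples):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n
                 for i in range(len(samples[0])))

def fit(training_data):
    """Map each class label to the centroid of its training vectors."""
    return {label: centroid(vecs) for label, vecs in training_data.items()}

def classify(x, centroids):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))
```

A real pipeline would use a more expressive supervised model trained on many cells per class, but the geometric picture, labeling a cell by where it falls in the hyperdimensional feature space, is the same.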
Furthermore, we demonstrated, for the first time, real-time image compression performed in the optical domain to address the big-data challenge created by ultrafast measurement systems. Many ultrafast, high-throughput data acquisition instruments, including TS-QPI, produce a large volume of data in a short time, e.g., tens of terabytes per hour. Such data volume and velocity place a burden on data acquisition, storage, and processing, and call for technologies that compress images in the optical domain and in real time. As a solution, we experimentally demonstrated warped time stretch, which offers a variable spectral-domain sampling rate as well as the ability to engineer the time-bandwidth product of the signal's envelope to match that of the data acquisition system. We also show how to design the kernel of the transform, and specifically the nonlinear group delay profile, which is governed by the signal's sparsity. Such a kernel leads to smart detection with nonuniform spectral resolution, with direct utility in improving data acquisition rates, compressing data in real time, and enhancing the accuracy of ultrafast data capture.
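The role of the nonlinear group delay profile can be illustrated numerically. In conventional time stretch the group delay is linear in frequency, so every spectral band is sampled at the same rate; a warped (here, tanh-shaped) profile has a steeper slope near the band center, allocating finer spectral resolution there and coarser resolution at the sparse edges. The specific profile and constants below are illustrative assumptions, not the dissertation's design.

```python
# Hedged sketch of a warped-stretch group-delay kernel. The local
# spectral sampling density is proportional to d(tau)/d(omega), so a
# nonlinear tau(omega) yields nonuniform spectral resolution.
import math

def linear_group_delay(w, D=1.0):
    """Conventional time stretch: tau = D * omega (uniform sampling)."""
    return D * w

def warped_group_delay(w, A=1.0, B=4.0):
    """Illustrative warped kernel: tau = A * tanh(B * omega); steepest
    (finest resolution) at band center omega = 0."""
    return A * math.tanh(B * w)

def effective_sampling_density(group_delay, w, dw=1e-4):
    """Numerical d(tau)/d(omega): relative sampling density at omega=w."""
    return (group_delay(w + dw) - group_delay(w - dw)) / (2 * dw)
```

For a sparse signal whose information is concentrated near the band center, this reallocation of resolution is what lets the warped transform compress the envelope's time-bandwidth product without discarding the information-rich band.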