eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Achieving Multi-scale Cell Morphology Clustering using Machine Learning

No data is associated with this publication.
Abstract

Understanding the heterogeneity of cell types and gene expression in complex tissues is crucial for advancing single-cell genomics. Spatial transcriptomics enhances this understanding by adding spatial context, enabling a more comprehensive view of cellular function and organization. In addition, cell morphology is known to influence gene expression, and gene expression in turn affects cell morphology, highlighting the intricate relationship between a cell’s physical structure and its molecular activity.

Despite efforts to apply morphogenomics to single-cell spatial data, existing methods struggle to scale efficiently to entire datasets. To address this limitation, I developed an autoencoder using PyTorch, a Python machine learning package, that reconstructs a 64x64 image of a cell mask. Because the encoded latent layer is significantly smaller in dimension than the input image, clustering on it allows for multi-scale clustering of cell morphologies. I demonstrate the approach on Xenium datasets, TissueNet datasets, and a pool of other publicly accessible datasets.
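The idea described above can be illustrated with a minimal PyTorch sketch. It is not the architecture from the thesis, which the abstract does not specify: the layer sizes, the latent dimension of 32, the loss function, the names `MaskAutoencoder`, `synthetic_masks`, and `train`, and the synthetic elliptical masks are all assumptions made for illustration. Only the 64x64 mask input and the use of a smaller latent layer come from the abstract.

```python
import torch
from torch import nn

LATENT_DIM = 32  # assumed value; the abstract does not state the latent size


class MaskAutoencoder(nn.Module):
    """Toy convolutional autoencoder for 1x64x64 cell masks (illustrative only)."""

    def __init__(self, latent_dim: int = LATENT_DIM):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1),   # 64 -> 32
            nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1),  # 32 -> 16
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(16 * 16 * 16, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 16 * 16 * 16),
            nn.ReLU(),
            nn.Unflatten(1, (16, 16, 16)),
            nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1),  # 16 -> 32
            nn.ReLU(),
            nn.ConvTranspose2d(8, 1, kernel_size=4, stride=2, padding=1),   # 32 -> 64
        )

    def forward(self, x):
        z = self.encoder(x)          # compact latent vector per mask
        return self.decoder(z), z    # reconstruction logits and latent code


def synthetic_masks(n: int, size: int = 64, seed: int = 0) -> torch.Tensor:
    """Stand-in data: centered binary ellipses with random semi-axes."""
    gen = torch.Generator().manual_seed(seed)
    yy, xx = torch.meshgrid(torch.arange(size), torch.arange(size), indexing="ij")
    radii = torch.rand(n, 2, generator=gen) * 20 + 5  # semi-axes in pixels
    c = (size - 1) / 2
    inside = ((xx - c) / radii[:, 0, None, None]) ** 2 \
        + ((yy - c) / radii[:, 1, None, None]) ** 2 <= 1
    return inside.float().unsqueeze(1)  # shape (n, 1, size, size)


def train(model: nn.Module, masks: torch.Tensor, epochs: int = 3, lr: float = 1e-3) -> float:
    """Full-batch training on a per-pixel reconstruction loss (assumed choice)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        opt.zero_grad()
        logits, _ = model(masks)
        loss = loss_fn(logits, masks)
        loss.backward()
        opt.step()
    return loss.item()


masks = synthetic_masks(16)
model = MaskAutoencoder()
train(model, masks)
with torch.no_grad():
    recon_logits, latent = model(masks)
```

In this sketch each 4,096-pixel mask is summarized by a 32-value latent vector, so the rows of `latent` could be passed to any standard clustering algorithm in place of the raw images.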

To promote reproducibility and usability, the autoencoder is designed to integrate seamlessly with Bento, a computational toolkit developed in our lab. Being part of a diverse portfolio of software analysis tools maximizes its functionality and accessibility for further research and analysis.

Main Content

This item is under embargo until September 17, 2026.