Deep Learning in Cancer Biology
eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

No data is associated with this publication.
Abstract

Deep learning methods have significantly advanced the state of computer vision and natural language processing. Their ability to discover intricate patterns in ever-expanding datasets is critical in solving cancer biology problems. However, cancer biology poses unique challenges. Typical input data, such as tumor images and DNA sequences, have significantly different semantic contexts than the traditional datasets used to train deep learning methods. Thus, it is infeasible to leverage large pre-trained networks, and models must be trained from scratch. Moreover, these data types are not human readable, making it difficult to annotate the data and interpret what the model has learned. This thesis aims to resolve these challenges and solve three urgent cancer biology problems using deep learning methods.

Cancer is mediated through various mechanisms. One such mechanism is circular extrachromosomal DNA (ecDNA), one of the primary drivers of oncogene amplification. EcDNA is prevalent across a wide variety of cancer types and leads to worse patient survival. Thus, there is a critical need for tools to study these genomic lesions. However, it is difficult to understand various facets of ecDNA through sequence-based methods alone; doing so requires image-based reconstructions. I first present ecSeg, a deep learning tool to reconstruct ecDNA in images of tumor cells in metaphase. EcSeg uses a fully convolutional network and traditional computer vision techniques to semantically segment ecDNA. EcSeg correlates these segmentations with amplification profiles to reveal ecDNA mechanics and their resistance to drug therapy.

To translate ecSeg to clinical practice, I present ecSeg-i to resolve the ecDNA status of interphase cells in cancer patient tissue. Tissue images primarily contain interphase cells in which the DNA is loosely wound, making it extremely challenging to distinguish ecDNA. EcSeg-i uses a DenseNet to determine the ecDNA status and amplification profiles of cancer patient tissue.

Lastly, I present DeepViFi to identify oncoviral infections in cancer genomes. Rapidly mutating oncoviruses, such as HPV, can infect the host and disrupt various biological pathways, sometimes causing hybrid human-viral ecDNA to appear. DeepViFi is a transformer-based tool that uses an open-set framework to embed DNA reads and detect oncoviral infections in next-generation sequencing data.

Main Content

This item is under embargo until September 8, 2024.