eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Drawing Biological Understanding From Machine Learning

Abstract

Large biological datasets, such as medical imaging and single-cell genomic data, are rich sources of biological information. Machine learning is a tool for extracting that information into a usable form, whether as predictions for a given task or as insights drawn from the trained model. We explore three applications of machine learning to biology. First, we use deep learning to perform metastatic cancer prognosis from CT images by predicting lesion-level risks, and we use those lesion-level risks to show that the model captures clinically known indicators of risk. Next, we use the DeepLIFT and TF-MoDISco neural network interpretation techniques to understand how DNA shape affects transcription factor binding. Overall, we find that sequence features are more important for distinguishing bound sites, but that shape features can modulate binding affinity. Finally, we test the CellOracle and SCENIC+ gene regulatory network inference frameworks in the context of reprogramming fibroblasts to pluripotent cells, to prioritize key factors in reprogramming and recover their effects on differentiation.
