eScholarship
Open Access Publications from the University of California

UC Davis

UC Davis Electronic Theses and Dissertations

Integrative Approaches in Machine Learning and Biology

Abstract

This dissertation explores the intersection of machine learning (ML) and biology, a field marked by significant progress yet full of opportunities for further breakthroughs. The work is divided into three parts. The first two employ ML to investigate reward-based learning in animals and to study protein-protein interactions in viruses. The third draws on biological insights to develop robust computer vision models.

In the first part, we apply the reinforcement learning (RL) framework to gain insights into reward-based learning in animals. We explore how key neural circuits and neurotransmitter systems, specifically the prefrontal cortex, the ventral striatum, and the dopaminergic pathways, contribute to the implementation of RL algorithms that associate actions with delayed outcomes, a challenge known as the credit assignment problem. We first identify a distinct pattern of neural activity in the prefrontal cortical inputs to the ventral striatum: activity that is sequential over time and selective to a given choice. We then use computational modeling to show how these inputs provide an effective state representation for the ventral striatum, enabling it to calculate accurate value signals for each choice at any given time point. We demonstrate this by implementing two neural circuit models of reinforcement learning, in which reward prediction error drives learning either by inducing rapid synaptic plasticity or by altering neural dynamics. Finally, we test and confirm the predictions of our circuit models experimentally by directly manipulating the input neurons to the ventral striatum.
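The role of a sequential, choice-selective state representation in credit assignment can be illustrated with a toy example. The following is a minimal sketch, not the dissertation's circuit models: it assumes plain tabular TD(0) learning, a one-hot state for each (choice, time step) pair, and reward delivered only at the final time step. The function name `td_learn` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def td_learn(reward_for_choice, n_steps=10, n_trials=1000, alpha=0.1, gamma=0.95):
    """Tabular TD(0) over a one-hot (choice, time step) state representation.

    Assumption for illustration: each choice activates its own sequence of
    states, one per time step, and reward arrives only at the last step.
    """
    n_choices = len(reward_for_choice)
    values = np.zeros((n_choices, n_steps))  # values[choice, t]
    for trial in range(n_trials):
        choice = trial % n_choices  # cycle through the choices in order
        for t in range(n_steps):
            last = t == n_steps - 1
            reward = reward_for_choice[choice] if last else 0.0
            next_value = 0.0 if last else values[choice, t + 1]
            # Reward prediction error: the learning signal in TD methods.
            rpe = reward + gamma * next_value - values[choice, t]
            values[choice, t] += alpha * rpe
    return values

# Choice 0 is rewarded at the end of its sequence; choice 1 is never rewarded.
values = td_learn(reward_for_choice=[1.0, 0.0])
print(np.round(values[0], 3))  # value rises toward the time of reward
print(np.round(values[1], 3))  # stays at zero
```

Because each time step within a choice has its own state, the prediction error at the rewarded step propagates backward one state at a time across trials, so early states of the rewarded choice acquire discounted value while the unrewarded choice's states do not.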

In the second part, we conduct two computational studies of SARS-CoV-2 to understand its spike protein interactions and their implications for viral transmission and immune evasion. To this end, we employ two key computational tools: molecular dynamics simulations and AlphaFold2, a deep learning model designed for predicting protein structures. The first study examines the biophysical properties of the SARS-CoV-2 Omicron variants compared to the wild type and Delta variants, analyzing the spike protein binding to (i) the ACE2 receptor protein, (ii) antibodies from all known binding regions, and (iii) the furin binding domain. Our findings indicate that the Omicron variant shows reduced binding to the ACE2 receptor but increased immune evasion, consistent with preliminary observations. The second study examines in greater depth the interactions between the furin enzyme and the Furin Cleavage Domain (FCD) of SARS-CoV-2 variants and other coronaviruses. Here, we demonstrate that the Delta variant exhibits the strongest possible binding with the furin enzyme, and we identify key sequences, both observed and unobserved, that could exhibit similar binding strengths.

In the final part, we explore how integrating biological insights, particularly from the primary visual cortex area V1, can improve the robustness of Convolutional Neural Networks (CNNs) against various image corruptions. For this purpose, we utilize VOneNet, a hybrid CNN containing a model of V1 as the front-end, followed by a standard trainable CNN architecture. We first observe that different variants of the V1-inspired model exhibit performance trade-offs for different corruptions. Building on this, we develop a new model using an ensembling technique, which combines multiple individual models with different V1-inspired variants. This model effectively harnesses the strengths of each individual model, leading to significant improvements in robustness across all corruption categories. Further, we demonstrate that knowledge distillation can help compress the knowledge in the ensemble model into a single, more efficient V1-inspired model. Overall, we demonstrate that by merging the unique strengths of various neuronal circuits in V1 we can significantly enhance the robustness of CNNs against a wide array of perturbations.
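The two techniques named above, ensembling and knowledge distillation, can be sketched on raw logits. This is a minimal illustration under stated assumptions, not the dissertation's method: the abstract does not say how ensemble members are combined or which distillation loss is used, so the sketch assumes averaged softmax probabilities for the ensemble and a temperature-softened KL divergence for distillation. The function names and the example logits are hypothetical.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_probs(member_logits, temperature=1.0):
    """Average member probabilities (assumed combination rule).

    member_logits has shape (n_models, n_samples, n_classes).
    """
    return softmax(member_logits, temperature).mean(axis=0)

def distillation_loss(student_logits, teacher_probs, temperature=1.0, eps=1e-12):
    """Mean KL(teacher || student) over samples (assumed distillation loss)."""
    student_probs = softmax(student_logits, temperature)
    log_ratio = np.log(teacher_probs + eps) - np.log(student_probs + eps)
    return float(np.sum(teacher_probs * log_ratio, axis=-1).mean())

# Three hypothetical ensemble members scoring one image over three classes.
member_logits = np.array([
    [[2.0, 0.0, 0.0]],
    [[0.0, 2.0, 0.0]],
    [[2.0, 0.0, 0.0]],
])
teacher = ensemble_probs(member_logits)

matched_loss = distillation_loss(np.log(teacher), teacher)   # student matches teacher
uniform_loss = distillation_loss(np.zeros((1, 3)), teacher)  # uninformed student
print(teacher, matched_loss, uniform_loss)
```

In a training loop, the student's parameters would be updated to reduce this loss, typically alongside a standard cross-entropy term on the true labels, so that a single model approximates the ensemble's averaged predictions.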
