# Your search: "author:Shahbaba, Babak"


## Scholarly Works (72 results)

The availability of massive computational resources has led to the widespread application and development of Bayesian methods. However, with the explosive growth of data volume in recent years, developing advanced Bayesian methods for large-scale problems remains a very active area of research. This dissertation is an effort to develop more scalable computational tools for Bayesian inference in big data problems.

At its core, Bayesian inference involves evaluating high-dimensional integrals with respect to the posterior distribution of model parameters and/or latent variables. However, these integrals generally have no closed form, and approximation methods are usually the only feasible option. Approximation methods fall into two main categories: deterministic approximation based on variational optimization, and stochastic approximation based on sampling methods.
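As a reading aid, the quantities described above can be written out as follows. The notation is assumed here rather than taken from the dissertation: $\theta$ stands for parameters and latent variables, $y$ for data, $f$ for the function of interest, and $\mathcal{Q}$ for a variational family. The Kullback-Leibler objective shown is the common textbook choice for variational methods, not the GAP objective.

```latex
% Posterior expectation: the integral that generally has no closed form
\mathbb{E}[f(\theta) \mid y] = \int f(\theta)\, p(\theta \mid y)\, d\theta,
\qquad
p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta')\, p(\theta')\, d\theta'}

% Stochastic approximation: average over S posterior draws
\mathbb{E}[f(\theta) \mid y] \approx \frac{1}{S} \sum_{s=1}^{S} f\big(\theta^{(s)}\big),
\qquad \theta^{(s)} \sim p(\theta \mid y)

% Deterministic (variational) approximation: best member of a tractable family
q^{*} = \arg\min_{q \in \mathcal{Q}} \mathrm{KL}\big( q(\theta) \,\|\, p(\theta \mid y) \big)
```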

We start by developing a new variational framework, geometric approximation of posterior (GAP), based on ambient Fisher geometry. As a variational method, GAP has the potential to scale well to large problems compared to computationally expensive sampling methods. It not only has a well-established mathematical basis in information geometry, but also works as a better alternative to other variational methods, such as variational free energy and expectation propagation, under certain scenarios.

Next, we focus on another class of approximation schemes based on MCMC sampling. Our method combines auto-encoders with Hamiltonian Monte Carlo (HMC). While HMC is efficient in exploring parameter spaces with high dimension or complicated geometry, it is computationally demanding since it has to evaluate additional geometric information about the parameter space. Our proposed method, Auto-encoding HMC, is designed to simulate Hamiltonian dynamics in a latent space with a much lower dimension, while still maintaining efficient exploration of the original space. Our method achieves a good balance between efficiency and accuracy for high-dimensional problems.
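For readers unfamiliar with HMC, the following is a minimal sketch of a plain HMC step with a leapfrog integrator. It is not the Auto-encoding HMC method described above: there is no auto-encoder or latent space, the target is assumed to be a standard normal purely for illustration, and all function names are hypothetical.

```python
import math
import random


def neg_log_density(q):
    # Potential energy U(q); a standard normal target is assumed for illustration.
    return 0.5 * sum(x * x for x in q)


def grad_neg_log_density(q):
    # Gradient of U(q) for the standard normal target.
    return list(q)


def leapfrog(q, p, step_size, n_steps):
    # Simulate Hamiltonian dynamics: half kick, alternating drifts and kicks, half kick.
    q, p = list(q), list(p)
    g = grad_neg_log_density(q)
    p = [pi - 0.5 * step_size * gi for pi, gi in zip(p, g)]
    for i in range(n_steps):
        q = [qi + step_size * pi for qi, pi in zip(q, p)]
        g = grad_neg_log_density(q)
        if i < n_steps - 1:
            p = [pi - step_size * gi for pi, gi in zip(p, g)]
    p = [pi - 0.5 * step_size * gi for pi, gi in zip(p, g)]
    return q, p


def hmc_step(q, step_size, n_steps, rng):
    # Draw a fresh momentum, simulate a trajectory, then accept or reject (Metropolis).
    p0 = [rng.gauss(0.0, 1.0) for _ in q]
    h0 = neg_log_density(q) + 0.5 * sum(x * x for x in p0)
    q1, p1 = leapfrog(q, p0, step_size, n_steps)
    h1 = neg_log_density(q1) + 0.5 * sum(x * x for x in p1)
    accept_prob = math.exp(min(0.0, h0 - h1))
    if rng.random() < accept_prob:
        return q1, True
    return list(q), False
```

Each step needs `n_steps` gradient evaluations in the full parameter dimension, which is the cost that motivates running the dynamics in a lower-dimensional space.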

Besides our work on scalable approximation methods for Bayesian inference, we have also developed a variational auto-encoder (VAE) model based on a determinantal point process (DPP) for big data classification problems with imbalanced classes. VAE is a generative model based on variational Bayes and is typically applied to high-dimensional data such as images and text. In the presence of imbalanced data, our method balances the latent space by using a DPP prior to up-weight the minority classes. We successfully applied our method, henceforth called DPP-VAE, to neural data classification and handwritten digit generation, both of which are high-dimensional in nature. Our method provides better results than standard VAE when datasets have imbalanced classes.

For an individual to successfully complete the task of decision-making, a set of temporally organized events must occur: stimuli must be detected, potential outcomes must be evaluated, behaviors must be executed or inhibited, and outcomes (such as reward or punishment) must be experienced. Due to the complexity of this process, it is very likely that decision-making is encoded by the temporally precise interactions among a population of neurons. Most existing statistical models, however, are inadequate for analyzing such a sophisticated phenomenon, as they either analyze a small number of neurons (e.g., pairwise analysis) or only provide an aggregated measure of interactions by assuming a constant dependence structure among neurons over time.

We start by proposing a scalable hierarchical semi-parametric Bayesian model to capture dependencies among multiple neurons by detecting their co-firing (possibly with some lag time). To this end, we model the spike train (a sequence of 1's for spikes and 0's for silence) for each neuron using the logistic function of a continuous latent variable with a Gaussian process prior. Then we model the joint probability distribution of multiple neurons as a function of their corresponding marginal distributions using a parametric copula model. Our approach provides a flexible framework for modeling the underlying firing rates of each neuron. It also allows us to make inferences regarding both contemporaneous and lagged synchrony. We evaluate our approach using several simulation studies and apply it to analyze real data collected from an experiment designed to investigate the role of the prefrontal cortex of rats in reward-seeking behaviors.
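The single-neuron part of the model above can be sketched as a small simulation: a latent Gaussian process is passed through the logistic function to give a per-bin spike probability, and spikes are drawn as Bernoulli variables. This is a sketch under stated assumptions, not the authors' implementation: the exponential (Ornstein-Uhlenbeck) covariance is chosen here only because it can be sampled exactly with a simple recursion, the `baseline`, `rho`, and `sigma` parameters are hypothetical, and the copula that links neurons is not shown.

```python
import math
import random


def logistic(x):
    # Map a real-valued latent value to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))


def sample_ou_gp(n_bins, rho, sigma, rng):
    # Exact draw from a zero-mean GP with covariance sigma^2 * rho^|s - t|
    # (an assumed kernel; the abstract does not specify one).
    f = [rng.gauss(0.0, sigma)]
    innovation_sd = sigma * math.sqrt(1.0 - rho * rho)
    for _ in range(n_bins - 1):
        f.append(rho * f[-1] + innovation_sd * rng.gauss(0.0, 1.0))
    return f


def simulate_spike_train(n_bins, baseline, rng, rho=0.95, sigma=1.0):
    # Latent GP -> logistic link -> Bernoulli spikes (1 = spike, 0 = silence).
    latent = sample_ou_gp(n_bins, rho, sigma, rng)
    probs = [logistic(baseline + f) for f in latent]
    spikes = [1 if rng.random() < p else 0 for p in probs]
    return latent, probs, spikes


if __name__ == "__main__":
    rng = random.Random(0)
    _, probs, spikes = simulate_spike_train(200, baseline=-2.0, rng=rng)
    print("spike count:", sum(spikes), "mean probability:", sum(probs) / len(probs))
```

In an inference setting the latent values would be unknown and estimated from observed spikes; the simulation only illustrates the direction from latent function to data.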

Next, we propose a non-stationary Bayesian model to capture the dynamic nature of neuronal activity (such as the time-varying strength of the interactions among neurons). Our proposed method yields results that provide new insights into the dynamic nature of population coding in the prefrontal cortex during decision making. In our analysis, we note that while some neurons in the prefrontal cortex do not synchronize their firing activity until the presence of a reward, a different set of neurons synchronize their activity shortly after the onset of stimulus. These differentially synchronizing sub-populations of neurons suggest a continuum of population representation of the reward-seeking task. Our analyses also suggest that the degree of synchronization differs between the rewarded and non-rewarded conditions.

Finally, we propose a novel statistical model for detecting neuronal communities involved in the decision-making process. Our method characterizes the non-stationary activity of multiple neurons during a basic cognitive task by modeling their joint probability distribution dynamically. Our proposed model can capture the time-varying dependence structure among neurons while allowing the neuronal activity to change over time. This way, we are able to identify time-varying neuronal communities. By identifying communities of neurons that vary under different decisions, we expect our method to provide insights into the decision-making process in particular as well as into a broad range of cognitive functions.

This dissertation is an investigation into the intersections between differential geometry and Bayesian analysis. The former is the mathematical discipline that underlies our understanding of the spatial structure of the universe; the latter is the unified framework for statistical inference built upon the language of probability and the elegant Bayes' theorem. Here, the two disciplines are combined with the hope that a synergy might emerge and facilitate the useful application of Bayesian inference to real-world science. In particular, dynamic and high-dimensional neural data provide a challenging litmus test for the methods developed herein.

A major component of this work is the development and application of probabilistic models defined over smooth manifolds: dependencies between time series are modeled using the manifold of Hermitian positive definite matrices; probability density functions are modeled using the infinite sphere; and high-dimensional data are modeled using the Stiefel manifold of orthonormal matrices. Whereas formulating a manifold-based model is not difficult (in a certain sense, the geometry occurs a priori in each of the cases considered), the non-trivial geometry presents computational challenges for model-based inference. Hence, this thesis contributes two new algorithms for Bayesian inference on Riemannian manifolds. The first is an algorithm for inference over general Riemannian manifolds and is applied to inference on Hermitian positive definite matrices. The second is an algorithm for inference over manifolds that are embedded in Euclidean space and is applied to inference on the sphere and Stiefel manifolds.
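For reference, the three manifolds named above have the following standard definitions. The symbols ($\mathcal{P}_d$, $\mathcal{S}^{\infty}$, $\mathcal{V}_k(\mathbb{R}^n)$) are notation assumed here, and the square-root link between a density $p$ and a point $q$ on the sphere is a common construction that is assumed, not confirmed, to be the one used in the dissertation.

```latex
% Hermitian positive definite matrices of size d
\mathcal{P}_d = \{ \Sigma \in \mathbb{C}^{d \times d} : \Sigma = \Sigma^{H},\ x^{H} \Sigma x > 0 \ \text{for all } x \neq 0 \}

% Unit sphere in L^2; a density p corresponds to a point q via p = q^2 (assumed construction)
\mathcal{S}^{\infty} = \Big\{ q \in L^{2} : \int q(x)^{2}\, dx = 1 \Big\},
\qquad p(x) = q(x)^{2}

% Stiefel manifold of n-by-k matrices with orthonormal columns
\mathcal{V}_{k}(\mathbb{R}^{n}) = \{ X \in \mathbb{R}^{n \times k} : X^{\top} X = I_{k} \}
```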

This dissertation is ordered as follows. In Chapter 1, the general setting is introduced along with the rudiments of Riemannian geometry. In Chapter 2, the geodesic Lagrangian Monte Carlo algorithm is presented and used for Bayesian inference over the space of Hermitian positive definite matrices to learn the spectral densities of multivariate time series arising from local field potentials in a rodent brain. In Chapter 4, an alternative, conceptually simpler version of the geodesic Monte Carlo is developed, but the new algorithm requires differentiating the pseudo-determinant, the derivative of which is derived in Chapter 3. In Chapter 5, the geometry of the infinite-dimensional sphere is leveraged for Bayesian nonparametric density estimation. In Chapter 6, high-dimensional spike trains and local field potentials in a rodent brain are used to predict environmental stimuli. This Bayesian 'neural decoding' is facilitated by both geometric and non-geometric models. Chapter 7 charts the frontiers of Bayesian inference on infinite manifolds.

The hippocampus plays a crucial role in organizing the memory of daily events, yet unraveling its mechanisms poses challenges. Decoding the information encoded in the hippocampus is particularly challenging due to sparse neuron activity in non-spatial tasks. However, accurate decoding is crucial for understanding how the hippocampus represents and processes information. This work focuses on decoding neural activity using multivariate point processes, specifically Poisson and Hawkes processes. The first two chapters concentrate on Hawkes processes, which are well-suited for modeling history-dependent phenomena characterized by clustered events, such as the spiking activity in the brain. The study introduces several self-exciting/inhibiting Hawkes process models that effectively capture the interactions among an ensemble of neurons in the hippocampus of rats. The study demonstrates that this approach enables more accurate decoding with significantly lower computational complexity. In the final chapter, we present two novel models for encoding and decoding neural activity using non-homogeneous Poisson processes. The encoding model captures neuronal dynamics in response to stimuli, revealing temporal variations in neuronal activity. The decoding model, in turn, recovers the underlying stimulus patterns from the spike train data, leveraging the estimated firing rates from the encoding model. These models demonstrate improved decoding accuracy, emphasizing the importance of incorporating temporal dynamics in decoding stimulus patterns from neural activity.
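To make the notion of a self-exciting process concrete, here is a minimal sketch of a univariate Hawkes process with an exponential kernel, simulated by thinning. It is an illustration only, not one of the models from the dissertation: the kernel form, the parameter names `mu`, `alpha`, and `beta`, and the restriction to a single excitatory neuron (`alpha >= 0`) are assumptions made here.

```python
import math
import random


def hawkes_intensity(t, events, mu, alpha, beta):
    # Conditional intensity: baseline rate plus an exponentially decaying
    # contribution from every past event (assumed exponential kernel).
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti <= t)


def simulate_hawkes(horizon, mu, alpha, beta, rng):
    # Thinning: between events the intensity only decays when alpha >= 0,
    # so its current value is a valid upper bound for the next candidate.
    if mu <= 0 or beta <= 0 or alpha < 0:
        raise ValueError("this sketch assumes mu > 0, beta > 0 and alpha >= 0")
    t = 0.0
    events = []
    while True:
        lam_bar = hawkes_intensity(t, events, mu, alpha, beta)
        t += rng.expovariate(lam_bar)  # candidate time from the bounding rate
        if t > horizon:
            break
        # Accept the candidate with probability intensity(t) / lam_bar.
        if rng.random() * lam_bar <= hawkes_intensity(t, events, mu, alpha, beta):
            events.append(t)
    return events


if __name__ == "__main__":
    rng = random.Random(1)
    spikes = simulate_hawkes(horizon=50.0, mu=0.5, alpha=0.8, beta=1.2, rng=rng)
    print(len(spikes), "events simulated")
```

Each accepted event raises the intensity by `alpha`, which produces the clustered firing described above; keeping `alpha / beta` below 1 keeps the process from exploding. Inhibition (a negative `alpha`) would need a rectified intensity and a different bound, which this sketch does not cover.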