eScholarship
Open Access Publications from the University of California

UC Santa Cruz

UC Santa Cruz Electronic Theses and Dissertations

Inference and Uncertainty Quantification for High-Dimensional Tensor Regression with Tensor Decompositions and Bayesian Methods

Creative Commons 'BY' version 4.0 license
Abstract

The recent emergence of complex datasets in various disciplines presents a pressing need for regression models that have tensors either as a response or as a covariate, often under assumptions of sparsity in the corresponding tensor-valued coefficients. Tensor-valued variables require special treatment in modeling because of their potentially large size and the common assumption that their associations with covariates are sparse. Importantly, scenarios with small sample sizes benefit from Bayesian methods, which allow for flexible model conditions while also rigorously quantifying the uncertainty in any conclusions drawn from a model.

We begin with general introductions to Bayesian analysis, neuroimaging, and tensor notations in the first chapter. The goal in these short overviews is not to provide a comprehensive background, but to inform the casual reader about key concepts that will be referenced throughout this dissertation. Afterwards, we proceed through new methods in Bayesian modeling of tensor-valued variables, covering three different analysis scenarios.

The goal in the second chapter is to develop a Bayesian tensor response regression in order to identify contiguous spatial regions that are associated with a given covariate. The motivating application is the detection of neuronal activation in functional magnetic resonance imaging (fMRI) experiments, where the data for a single subject consist of tensor-valued brain images and a scalar predictor. We propose to regress responses from all cells (called voxels in brain activation studies) together as a tensor response on scalar predictors, accounting for the structural information inherent in the tensor response. To estimate model parameters with proper cell-specific shrinkage, we propose a novel multiway stick-breaking shrinkage prior distribution on tensor-structured regression coefficients, enabling identification of cells that are related to the predictors. The major novelty of this chapter lies in the theoretical study of the contraction properties of the proposed shrinkage prior in the tensor response regression when the number of cells grows faster than the sample size. Specifically, estimates of tensor regression coefficients are shown to be asymptotically concentrated around the true sparse tensor in the $L_2$ sense under mild assumptions. The method is then applied to a single subject within a balloon-analog risk-taking fMRI experiment to make inferences about parts of the subject's brain that are activated by a stimulus.

In the third chapter, the Bayesian tensor response regression is expanded to compare multiple subjects with multiple tensor responses per subject. This allows for inference on a tensor-valued coefficient, as well as on correlations between the different tensor response groups. These two types of inference are referred to in neuroimaging as activation and connectivity, respectively. Brain activation and connectivity analyses in task-based fMRI experiments with multiple subjects are currently at the forefront of data-driven neuroscience. In such experiments, interest often lies in understanding activation of brain voxels due to external stimuli and strong association or connectivity between the measurements on a set of pre-specified groups of brain voxels, also known as regions of interest (ROIs). This chapter proposes a joint Bayesian additive mixed modeling framework that simultaneously assesses brain activation and connectivity patterns from multiple subjects. In particular, fMRI measurements from each individual, obtained in the form of a multi-dimensional array/tensor across time, are regressed on functions of the stimuli. A low-rank parallel factorization (PARAFAC) decomposition is imposed on the tensor regression coefficients corresponding to the stimuli to achieve parsimony. The multiway stick-breaking shrinkage priors that were developed in the second chapter are employed to infer activation patterns and associated uncertainties in each cell within the tensor responses. Further, the model introduces region-specific random effects, which are jointly modeled with a Bayesian Gaussian graphical prior to account for the connectivity among pairs of ROIs. Empirical investigations under various simulation studies demonstrate the effectiveness of the method as a tool to simultaneously assess brain activation and connectivity. The method is then applied to the balloon-analog risk-taking fMRI experiment across multiple subjects in order to make inferences about how the brain processes risk.
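As a rough illustration of why a low-rank PARAFAC structure yields parsimony, the sketch below builds a three-way coefficient tensor from rank-R factor matrices and compares parameter counts. The dimensions, the rank, and the names `parafac_tensor`, `factors`, and `coef` are assumptions chosen for illustration; they are not taken from the dissertation, and no prior or sampler is implemented here.

```python
import numpy as np

def parafac_tensor(factors):
    """Build a 3-way tensor as a sum of R outer products.

    factors: three matrices of shapes (p1, R), (p2, R), (p3, R).
    Entry (i, j, k) equals sum_r A[i, r] * B[j, r] * C[k, r].
    """
    A, B, C = factors
    return np.einsum("ir,jr,kr->ijk", A, B, C)

# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
dims, rank = (4, 5, 6), 2
factors = [rng.normal(size=(p, rank)) for p in dims]

coef = parafac_tensor(factors)

# Parameters in the factor matrices versus an unstructured tensor.
n_parafac = rank * sum(dims)
n_full = int(np.prod(dims))
print(coef.shape, n_parafac, n_full)
```

With these sizes the factor matrices hold 30 entries, against 120 cells in the unstructured tensor; the gap widens quickly as the tensor dimensions grow.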

In the fourth chapter, we propose a method to parsimoniously model a scalar response with a tensor-valued covariate using the Tucker tensor decomposition. This method retains the spatial relationships within a tensor-valued covariate while reducing the number of free parameters in the model. Simulated data are analyzed to demonstrate model effectiveness, with comparisons made to both classical and Bayesian methods. The method is then applied to data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) to make inferences about the effects of Alzheimer's disease on the brain and to provide a more quantitative framework on which to make diagnoses.
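A minimal sketch of a Tucker-structured coefficient in a scalar-on-tensor regression follows, assuming a three-way covariate and the mean structure given by the inner product of each covariate tensor with the coefficient tensor. The sizes, the Tucker ranks, and names such as `tucker_tensor` and `y_mean` are hypothetical and only meant to show the structure, not the estimation method used in the dissertation.

```python
import numpy as np

def tucker_tensor(core, factors):
    """Build a 3-way tensor from a Tucker core and factor matrices.

    core: array of shape (r1, r2, r3).
    factors: three matrices of shapes (p1, r1), (p2, r2), (p3, r3).
    """
    U1, U2, U3 = factors
    return np.einsum("abc,ia,jb,kc->ijk", core, U1, U2, U3)

# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(1)
dims, ranks = (6, 7, 8), (2, 2, 3)
core = rng.normal(size=ranks)
factors = [rng.normal(size=(p, r)) for p, r in zip(dims, ranks)]
coef = tucker_tensor(core, factors)

# Ten simulated tensor covariates; the regression mean for each
# observation is the inner product <X_n, coef>.
X = rng.normal(size=(10, *dims))
y_mean = np.einsum("nijk,ijk->n", X, coef)

# Parameters in the core and factors versus an unstructured tensor.
n_tucker = int(np.prod(ranks)) + sum(p * r for p, r in zip(dims, ranks))
n_full = int(np.prod(dims))
print(y_mean.shape, n_tucker, n_full)
```

Unlike PARAFAC, the Tucker form allows a different rank along each mode, so the amount of compression can vary by dimension of the covariate.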

Finally, we conclude with a brief review of the topics covered, research in progress, and future directions for scholastic pursuit.
