Regression and optimal transport models for functional and surface-valued data

• Author(s): Liu, Xi
In Chapter 1, the effect of a smooth curve on a binary response is analyzed through a functional generalized linear model. The proposed method assumes that the coefficient function $\beta(t)$ is truncated, i.e., the curve predictor loses its influence on the response after some time point in its domain. To obtain an estimate of $\beta(t)$ that is simultaneously smooth and truncated, a structured variable selection method and a localized B-spline expansion of $\beta(t)$ are combined to formulate a penalized log-likelihood function, in which a nested group lasso penalty guarantees that the B-splines enter the model sequentially and hence induces truncation in $\beta(t)$. Unlike previous methods, which either directly penalized the value of the truncation point or resulted in a nonconvex optimization problem, the proposed approach leads to a convex optimization problem. Computationally, an optimization scheme is developed to compute the entire solution path efficiently as the truncation tuning parameter varies from $\infty$ to 0. Expressing the nonsmooth lasso penalty in its dual formulation allows it to be smoothed, so that the objective function can be optimized by an accelerated gradient descent algorithm. Theoretically, the convergence rate of the estimate and the consistency of the truncation point estimate are derived under suitable smoothness assumptions. The proposed method is demonstrated with an application involving the effects of blood pressure curves in patients who suffered a spontaneous intracerebral hemorrhage.
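As an illustrative sketch only, and not necessarily the exact formulation used in the dissertation, a nested group lasso penalty of the kind described above can be written as follows. The symbols $b_k$, $B_k$, $K$, $w_k$, $\lambda$, and $\mu$ are assumptions introduced here for illustration. Suppose $\beta(t) = \sum_{k=1}^{K} b_k B_k(t)$, with the B-splines $B_k$ ordered from left to right by their supports, and let $b_{k:K} = (b_k, \dots, b_K)^\top$ denote the tail of the coefficient vector. A nested penalty with weights $w_k > 0$ is then

$$P_\lambda(b) = \lambda \sum_{k=1}^{K} w_k \, \| b_{k:K} \|_2 .$$

Because the groups are nested, setting $b_{k:K} = 0$ forces every later coefficient to vanish, so $\beta(t)$ is zero beyond the support of $B_{k-1}$; this is one way truncation can arise. Each norm has the dual representation $\| v \|_2 = \max_{\| \alpha \|_2 \le 1} \alpha^\top v$, and a standard smoothing with parameter $\mu > 0$ replaces it by

$$\max_{\| \alpha \|_2 \le 1} \left\{ \alpha^\top v - \frac{\mu}{2} \| \alpha \|_2^2 \right\},$$

which is differentiable in $v$, with gradient $v / \max(\mu, \| v \|_2)$, and is therefore amenable to accelerated gradient methods.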
In Chapter 2, a set of computational tools is developed to perform inference for a regression model in which density curves appear as functional response objects with vector predictors. For such models, inference is key to understanding the importance of density-predictor relationships and the uncertainty associated with the estimated conditional mean densities, defined as conditional Fréchet means under a suitable metric. Since a density curve is positive and integrates to one, the usual $L_p$ metric is not suitable for modeling density curves. Instead, using the Wasserstein geometry of optimal transport, we consider the Fréchet regression of density curve responses and develop tests for global and partial effects, as well as simultaneous confidence bands for estimated conditional mean densities. This dissertation focuses on the computational aspects of the proposed statistical inference methods. An R package was developed to promote the use of Fréchet regression of density curve responses. The accuracy of these methods, including nominal size, power, and coverage, is assessed through simulations. Furthermore, the utility of the methodology is demonstrated via a regression analysis of post-intracerebral hemorrhage hematoma densities and their associations with a set of clinical and radiological covariates.
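To make the Wasserstein geometry mentioned above concrete, the following minimal Python sketch computes the 2-Wasserstein distance between two one-dimensional densities through their quantile functions, and an unweighted Fréchet mean by averaging quantile functions. It is not the R package described in the abstract; the function names (`quantile_function`, `wasserstein2`, `frechet_mean_quantiles`) and the grid-based discretization are assumptions made here for illustration.

```python
import numpy as np

def quantile_function(grid, density, probs):
    """Approximate the quantile function of a density sampled on a grid."""
    # Cumulative trapezoidal integral gives the CDF on the grid.
    steps = 0.5 * (density[1:] + density[:-1]) * np.diff(grid)
    cdf = np.concatenate([[0.0], np.cumsum(steps)])
    cdf = cdf / cdf[-1]  # renormalize so the CDF ends at exactly one
    # Invert the CDF by linear interpolation (assumes a positive density).
    return np.interp(probs, cdf, grid)

def wasserstein2(grid, f, g, n_probs=1000):
    """2-Wasserstein distance: the L2 distance between quantile functions."""
    probs = (np.arange(n_probs) + 0.5) / n_probs  # midpoints in (0, 1)
    qf = quantile_function(grid, f, probs)
    qg = quantile_function(grid, g, probs)
    return np.sqrt(np.mean((qf - qg) ** 2))

def frechet_mean_quantiles(grid, densities, n_probs=1000):
    """Quantile function of the unweighted Wasserstein Frechet mean."""
    probs = (np.arange(n_probs) + 0.5) / n_probs
    quantiles = [quantile_function(grid, d, probs) for d in densities]
    return probs, np.mean(quantiles, axis=0)

if __name__ == "__main__":
    # Hypothetical example: two unit-variance Gaussians with means 0 and 2.
    grid = np.linspace(-10.0, 10.0, 2001)
    f = np.exp(-0.5 * grid**2) / np.sqrt(2 * np.pi)
    g = np.exp(-0.5 * (grid - 2.0) ** 2) / np.sqrt(2 * np.pi)
    print("W2 distance:", wasserstein2(grid, f, g))
    probs, q_mean = frechet_mean_quantiles(grid, [f, g])
    print("Median of Frechet mean:", q_mean[len(probs) // 2])
```

For two densities that differ only by a location shift, the distance should be close to the size of the shift, which gives a simple sanity check. In one dimension, Fréchet regression under this metric can be viewed as replacing the plain average of quantile functions with a predictor-dependent weighted average, a step this sketch does not implement.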