Regression analysis, a cornerstone of statistical methodology, faces new challenges posed by complex, non-Euclidean data. Responses increasingly take values in a general metric space that lacks a vector space structure, so conventional regression techniques designed for Euclidean spaces do not apply directly. This dissertation addresses this gap by developing efficient and theoretically rigorous regression techniques for two prominent non-Euclidean data types: networks and probability measures.
By viewing each network as an individual data point that lies in the space of graph Laplacians, we introduce a unifying intrinsic framework for network-response regression. The proposed regression models extend both linear and local linear regression to network responses, taking the form of weighted Fréchet means. The practical utility and effectiveness of this framework are demonstrated through simulations and real-world data applications.
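For orientation, a weighted Fréchet mean of this kind is commonly written as below in the Fréchet regression literature. This is a sketch under assumptions not stated above: the symbols $X$ (covariate), $Y$ (network response), $\mu$, $\Sigma$, the weight function $s$, the space $\mathcal{L}$ of graph Laplacians, and the Frobenius distance $d_F$ are illustrative choices, and the dissertation's exact formulation may differ.

```latex
\[
  m_{\oplus}(x)
  = \operatorname*{arg\,min}_{\omega \in \mathcal{L}}
    \mathbb{E}\!\left[ s(X, x)\, d_F^{2}(Y, \omega) \right],
  \qquad
  s(X, x) = 1 + (X - \mu)^{\top} \Sigma^{-1} (x - \mu),
\]
\[
  \text{where } \mu = \mathbb{E}[X] \text{ and } \Sigma = \operatorname{Var}(X).
\]
```

With $s \equiv 1$ this reduces to the ordinary Fréchet mean of $Y$; a local linear analogue would replace $s$ with kernel-based weights concentrated near $x$.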
Moving beyond network-response regression, the dissertation develops regression models for probability measure responses with vector covariates, with particular attention to settings where the number of observations varies across measures and is small for some of them. Using empirical measures directly avoids the difficulties of density estimation, allowing the proposed models to perform well despite variable sample sizes and limited data.
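To illustrate how empirical measures can be used directly, the following minimal sketch computes a weighted Fréchet mean of one-dimensional empirical measures in the 2-Wasserstein space by averaging quantile functions, with no density estimation. It is not the dissertation's estimator: the function names, the quantile grid, the restriction to one dimension, and the restriction to nonnegative weights are all assumptions made for illustration. Regression weights can be negative, in which case the averaged quantile function would need to be projected back onto nondecreasing functions, which this sketch does not do.

```python
import numpy as np


def empirical_quantiles(sample, grid):
    """Evaluate an interpolated empirical quantile function on `grid`.

    np.quantile's default linear interpolation is used, which is a
    smoothed version of the step-function empirical quantile.
    """
    return np.quantile(np.asarray(sample, dtype=float), grid)


def weighted_wasserstein_mean(samples, weights, grid_size=99):
    """Weighted Frechet mean of 1-D empirical measures (hypothetical helper).

    In one dimension, the 2-Wasserstein distance between two measures is
    the L2 distance between their quantile functions, so for nonnegative
    weights the weighted Frechet mean has the weighted average of the
    quantile functions as its quantile function.
    Samples may have different sizes.
    """
    grid = np.linspace(0.01, 0.99, grid_size)  # probability levels
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0):
        raise ValueError("this sketch only handles nonnegative weights")
    w = w / w.sum()  # normalize so the weights sum to one
    quantiles = np.vstack([empirical_quantiles(s, grid) for s in samples])
    return grid, w @ quantiles


# Usage: three measures observed through samples of very different sizes.
rng = np.random.default_rng(0)
samples = [rng.normal(0.0, 1.0, 200), rng.normal(2.0, 1.0, 15), rng.normal(4.0, 1.0, 5)]
grid, mean_quantiles = weighted_wasserstein_mean(samples, weights=[0.5, 0.3, 0.2])
```

Working on a common grid of probability levels is what lets samples of different sizes be combined: each sample, however small, contributes a quantile function evaluated at the same points.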
Finally, we extend the proposed regression framework for probability measures to survival analysis. Specifically, we introduce a regression model for right-censored survival data across diverse populations. This approach treats the underlying probability measure of each subgroup, together with its nonparametric Kaplan-Meier estimator, as an element of the Wasserstein space of probability measures. Its demonstrated superior finite-sample performance makes it a strong alternative to the Cox proportional hazards model.