UC Santa Barbara
Joint Modeling of Longitudinal and Survival Data via Multivariate Mixed Effects State Space Model
- Author(s): Luo, Ya
- Advisor(s): Wang, Yuedong
- et al.
State space models are powerful tools for modeling dynamic processes while retaining clear interpretations. Because of this flexibility and interpretability, mixed effects state space models have been studied in the literature for modeling multivariate longitudinal data.
In a multivariate mixed effects state space model, the population effects and subject random deviations of any variable can be modeled by different stochastic processes. These processes can differ between variables, allowing great flexibility in the modeling. In addition, the model provides multiple ways to characterize interactions between the variables. However, high computational cost is a major hindrance to applying the mixed effects state space model to data with large numbers of individuals. Let $m$ be the number of individuals. The most efficient existing version of the Kalman filter, the univariate treatment, has time complexity $O(m^3)$ and space complexity $O(m^2)$. In practice, the univariate treatment can handle only a few hundred individuals, and only at high computational cost.
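To make the scaling concrete, the following is a minimal sketch of a standard (naive) Kalman filter step applied to a stacked mixed effects model. It is neither the univariate treatment nor the algorithm developed in this work. The model is an assumption chosen for illustration: a single longitudinal variable, with a random-walk population effect and a random-walk deviation for each of the $m$ subjects. The function names and the variance parameters `q_pop`, `q_subj`, and `r` are hypothetical.

```python
import numpy as np

def build_stacked_model(m, q_pop=0.1, q_subj=0.05, r=1.0):
    """Stack one population state and m subject deviations (assumed toy model)."""
    n = m + 1                                      # state dimension grows with m
    T = np.eye(n)                                  # random-walk transition
    Q = np.diag(np.r_[q_pop, np.full(m, q_subj)])  # state noise covariance
    Z = np.hstack([np.ones((m, 1)), np.eye(m)])    # y_i = population + deviation_i
    H = r * np.eye(m)                              # observation noise covariance
    return T, Q, Z, H

def kalman_step(a, P, y, T, Q, Z, H):
    """One predict/update step of the standard Kalman filter."""
    # Prediction
    a_pred = T @ a
    P_pred = T @ P @ T.T + Q
    # Update
    v = y - Z @ a_pred                             # innovation
    F = Z @ P_pred @ Z.T + H                       # m x m innovation covariance
    K = np.linalg.solve(F, Z @ P_pred).T           # gain: P_pred Z' F^{-1}
    a_new = a_pred + K @ v
    P_new = P_pred - K @ Z @ P_pred
    return a_new, P_new

m = 5
T, Q, Z, H = build_stacked_model(m)
a0, P0 = np.zeros(m + 1), np.eye(m + 1)
y = np.ones(m)                                     # one observation per subject
a1, P1 = kalman_step(a0, P0, y, T, Q, Z, H)
```

In this sketch the state covariance `P` is $(m+1) \times (m+1)$, so storage grows quadratically in $m$, and the dense matrix products and the solve against `F` grow cubically, which is why stacking all subjects into one state vector becomes impractical for large $m$.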
We discover special structures in the Kalman filter of the mixed effects state space model and develop a new algorithm that exploits them. This reduces both time and space complexity to $O(m)$ and makes it practical to model hundreds of thousands of individuals without parallel computing; the algorithm is also highly parallelizable.
We further extend the mixed effects state space model to a joint modeling framework, in which a mixed effects state space model characterizes longitudinal data and a logistic regression models the survival probability. The true values of the longitudinal variables, modeled by the latent state of the state space model, are used as predictors in the logistic regression.
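As an illustration of how the latent state could enter the survival component, here is a minimal sketch of a logistic link. The exact linear predictor is not specified here, so the form below (an intercept, a linear term in the latent state, and a linear term in external covariates) is an assumption, and the names `survival_probability`, `beta0`, `beta`, and `gamma` are hypothetical.

```python
import numpy as np

def survival_probability(latent_state, covariates, beta0, beta, gamma):
    """Logistic model for survival given latent longitudinal values.

    latent_state : (n_subjects, n_longitudinal_vars) array of latent true values
    covariates   : (n_subjects, n_covariates) array of external covariates
    The additive linear predictor is an assumed form, for illustration only.
    """
    eta = beta0 + latent_state @ beta + covariates @ gamma
    return 1.0 / (1.0 + np.exp(-eta))              # inverse logit

# Made-up inputs: 3 subjects, 2 longitudinal variables, 1 covariate
alpha = np.array([[0.0, 0.0], [1.0, -1.0], [2.0, 0.5]])
x = np.array([[0.0], [1.0], [0.0]])
p = survival_probability(alpha, x, beta0=0.0,
                         beta=np.array([0.5, 0.5]), gamma=np.array([-1.0]))
```

In a fitted joint model, the latent values would come from the state space component (for example, filtered or predicted states) rather than being supplied directly as they are in this sketch.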
Our joint model can (i) characterize the evolution of longitudinal variables and interactions between them, (ii) model the relationship between the survival probability and longitudinal variables/external covariates, and (iii) perform online predictions for longitudinal variables and survival probability.
We develop another efficient algorithm for the computation of the maximum likelihood estimates of parameters in the joint model with time and space complexity both linear in $m$.