UC San Diego
Modern Statistical Methods for Complex Survival Data
- Author(s): Hou, Jue
- Advisor(s): Xu, Ronghui
- et al.
With the boom in big, complex data, various statistical methods and data science techniques have been developed to extract valuable information from it.
Progress has been slower for survival data because of the additional difficulty posed by censoring and truncation. Apart from a few straightforward extensions, most modern learning methods remained absent from survival analysis for years after their invention. The theory for the survival versions of those methods lags even further behind. There is strong demand for computationally efficient and theoretically reliable methods for big, complex survival data in the various health-related fields into which immense resources have been poured.
This thesis is devoted to incorporating censoring and truncation into state-of-the-art statistical methodology and theory, to advance survival analysis and support medical research with up-to-date tools. In Chapter 1, I study the mixture cure-rate model with left truncation and right censoring. We propose a nonparametric maximum likelihood estimation (NPMLE) approach to handle the truncation effectively. We adopt an efficient and stable EM algorithm. We give a closed-form variance estimator that yields valid inference. In Chapter 2, I study estimation and inference for the Fine-Gray competing risks model with high-dimensional covariates. We develop confidence intervals based on a one-step bias correction of an initial regularized estimator. We lay down a methodological and theoretical framework for the one-step bias-corrected estimator with the partial likelihood. In Chapter 3, I study inference on the treatment effect with a censored time-to-event outcome while adjusting for high-dimensional covariates. We propose an orthogonal score method to construct honest confidence intervals for the treatment effect. With a slight modification, we obtain a doubly robust estimator that is extremely tolerant of both estimation inconsistency and volatility. All the methods in the aforementioned chapters are tested through extensive numerical experiments and applied to real data of genuine medical interest.