Scalable Methods for Big Time-To-Event Data
Computational advances and improved cost efficiency in recent years have made big data readily available to researchers. In the biomedical and public health fields, the analysis of time-to-event data, where the outcome of interest is the time until an event occurs, is of particular interest. However, big time-to-event data poses many challenges to currently available statistical methods because of the large number of covariates and/or observations involved. In this dissertation, we propose scalable sparse regression methods for both big right-censored and competing risks time-to-event data. We extend the recently introduced broken adaptive ridge (BAR) regression procedure to both the Cox (1972) proportional hazards model for right-censored data and the Fine and Gray (1999) proportional subdistribution hazards model for competing risks data, establish its large-sample properties under diverging dimension, and develop computational software that is scalable to big time-to-event data.