Skip to main content
eScholarship
Open Access Publications from the University of California

UC Irvine

UC Irvine Previously Published Works bannerUC Irvine

A Gradient Boosting Algorithm for Survival Analysis via Direct Optimization of Concordance Index

Abstract

Survival analysis focuses on modeling and predicting the time to an event of interest. Many statistical models have been proposed for survival analysis. They often impose strong assumptions on hazard functions, which describe how the risk of an event changes over time depending on covariates associated with each individual. In particular, the prevalent proportional hazards model assumes that covariates are multiplicatively related to the hazard. Here we propose a nonparametric model for survival analysis that does not explicitly assume particular forms of hazard functions. Our nonparametric model utilizes an ensemble of regression trees to determine how the hazard function varies according to the associated covariates. The ensemble model is trained using a gradient boosting method to optimize a smoothed approximation of the concordance index, which is one of the most widely used metrics in survival model performance evaluation. We implemented our model in a software package called GBMCI (gradient boosting machine for concordance index) and benchmarked the performance of our model against other popular survival models with a large-scale breast cancer prognosis dataset. Our experiment shows that GBMCI consistently outperforms other methods based on a number of covariate settings. GBMCI is implemented in R and is freely available online.

Many UC-authored scholarly publications are freely available on this site because of the UC's open access policies. Let us know how this access is important for you.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View