eScholarship
Open Access Publications from the University of California

UC Irvine

UC Irvine Electronic Theses and Dissertations

On Semi-Parametric Regression for Time-to-Event Analyses in Electronic Health Records Studies

Creative Commons 'BY-NC-ND' version 4.0 license
Abstract

Electronic health records (EHRs) have become a powerful resource for studying health outcomes. In time-to-event settings, the true time of the event in EHR data is usually subject to interval censoring, and it is common to take the time of diagnosis as the outcome. Standard survival analysis tools for right-censored data, such as the Cox proportional hazards model, are commonly used to estimate covariate associations with the time-to-event in such settings. Patients may, however, have access to multiple health care providers across different systems. If patients seek care from external health systems (a phenomenon we call system migration), the diagnosis times within the observed system may be erroneously prolonged. No prior work has considered the performance of the Cox model under system migration. In this dissertation, we show that system migration related to the outcome of interest results in biased estimates of hazard ratios from the Cox model. We develop an extension to the Cox model that adjusts for system migration by 1) estimating the probability of system migration for each patient and 2) using multiple imputation to adjust diagnosis times for patients identified as migrating across systems. A vital part of this method involves developing a prediction model from patient-specific system usage patterns for estimating the probability of system migration. To improve prediction assessment, we develop an estimator for time-dependent sensitivity and specificity in the recurrent event setting with unbalanced data across subpopulations. Finally, we consider the choice of time scale for assessing the relative risk of disease diagnosis. We compare the appropriateness of two commonly assumed time scales that define risk sets in the Cox model: the age time scale and the time-on-study time scale. Previous research has suggested that the age time scale, corresponding to birth as the time origin, is most appropriate for epidemiological studies. However, simulation studies have suggested that the time-on-study time scale with covariate adjustment for baseline age is more robust to misspecification of the time scale. We investigate the performance of the Cox model on each time scale under varying degrees of model misspecification and further assess the robustness of each approach when modeling time-varying covariates.
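To make the time-scale comparison concrete, the following minimal sketch shows how the two scales lead to different risk sets at the same event. It is an illustration only and not the dissertation's implementation: the `Subject` class, both function names, and the four-person cohort are hypothetical, and the age-scale version assumes the usual left-truncation convention in which a subject is at risk at age a when entry age < a <= exit age.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    """Hypothetical record for one patient (illustrative only)."""
    sid: str
    entry_age: float  # age at study entry, in years
    followup: float   # years from study entry to diagnosis or censoring
    event: bool       # True if diagnosed, False if censored

    @property
    def exit_age(self) -> float:
        return self.entry_age + self.followup


def risk_set_time_on_study(subjects, t):
    """Time-on-study scale: at risk if still under follow-up t years after entry."""
    return {s.sid for s in subjects if s.followup >= t}


def risk_set_age(subjects, age):
    """Age scale with delayed entry: at risk if entry_age < age <= exit_age."""
    return {s.sid for s in subjects if s.entry_age < age <= s.exit_age}


# Made-up cohort, not data from the dissertation.
cohort = [
    Subject("A", entry_age=50.0, followup=5.0, event=True),
    Subject("B", entry_age=60.0, followup=3.0, event=True),
    Subject("C", entry_age=52.0, followup=8.0, event=False),
    Subject("D", entry_age=62.0, followup=6.0, event=False),
]

for s in cohort:
    if s.event:
        print(
            s.sid,
            sorted(risk_set_time_on_study(cohort, s.followup)),
            sorted(risk_set_age(cohort, s.exit_age)),
        )
```

In this toy cohort, subject A's diagnosis is compared against everyone followed for at least five years on the time-on-study scale, but only against subjects who were under observation at age 55 on the age scale. Because the Cox partial likelihood is built from these comparisons, the two scales can yield different hazard ratio estimates from the same data.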
