Open Access Publications from the University of California

Department of Biostatistics

UCLA


With over 25 faculty and 90 students, the Department of Biostatistics today is a leader in statistical training for academia, government and industry. Faculty members collaborate with investigators in an extremely large number of diverse disciplines, and as a result biostatistics students have ample research opportunities. Our research programs in Bayesian methods, causal inference, genetics, hierarchical models, HIV/AIDS, longitudinal data analysis, phylogeny, spatial statistics and geographical information systems, survival analysis, and optimal design are well-respected nationally and internationally. We continue to grow in terms of our faculty, students and programs to meet current and future needs.

There are 919 publications in this collection, published between 1999 and 2024.
Research Reports (55)

High-Dimensional Bayesian Geostatistics

With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models infeasible for large data sets. This article offers a focused review of two methods for constructing well-defined, highly scalable spatiotemporal stochastic processes. Both these processes can be used as “priors” for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity has ∼ n floating point operations (flops) per iteration, where n is the number of spatial locations. We compare these methods and provide some insight into their methodological underpinnings.

52 more works
Open Access Policy Deposits (893)

Reproducible variability: assessing investigator discordance across 9 research teams attempting to reproduce the same observational study.

OBJECTIVE: Observational studies can impact patient care but must be robust and reproducible. Nonreproducibility is primarily caused by unclear reporting of design choices and analytic procedures. This study aimed to: (1) assess how the study logic described in an observational study could be interpreted by independent researchers and (2) quantify the impact of variability in interpretation on patient characteristics. MATERIALS AND METHODS: Nine teams of highly qualified researchers reproduced a cohort from a study by Albogami et al. The teams were provided the clinical codes and access to the tools to create cohort definitions such that the only variable part was their logic choices. We executed the teams' cohort definitions against the database and compared the number of subjects, patient overlap, and patient characteristics. RESULTS: On average, the teams' interpretations fully aligned with the master implementation in 4 out of 10 inclusion criteria, with at least 4 deviations per team. Cohort size varied from one-third of the master cohort size to 10 times the cohort size (2159-63 619 subjects compared to 6196 subjects). Median agreement was 9.4% (interquartile range 15.3-16.2%). The teams' cohorts significantly differed from the master implementation by at least 2 baseline characteristics, and most of the teams differed by at least 5. CONCLUSIONS: Independent research teams attempting to reproduce the study based on its free-text description alone produce different implementations that vary in population size and composition. Sharing analytical code supported by a common data model and open-source tools allows a study to be reproduced unambiguously, thereby preserving the initial design choices.

Accelerated epigenetic aging in Werner syndrome

Individuals suffering from Werner syndrome (WS) exhibit many clinical signs of accelerated aging. While the underlying constitutional mutation leads to accelerated rates of DNA damage, it is not yet known whether WS is also associated with an increased epigenetic age according to a DNA methylation-based biomarker of aging (the "Epigenetic Clock"). Using whole blood methylation data from 18 WS cases and 18 age-matched controls, we find that WS is associated with increased extrinsic epigenetic age acceleration (p=0.0072) and intrinsic epigenetic age acceleration (p=0.04), the latter of which is independent of age-related changes in the composition of peripheral blood cells. A multivariate model analysis reveals that WS is associated with an increase in DNA methylation age (on average 6.4 years, p=0.011) even after adjusting for chronological age, gender, and blood cell counts. Further, WS might be associated with a reduction in naïve CD8+ T cells (p=0.025) according to imputed measures of blood cell counts. Overall, this study shows that WS is associated with an increased epigenetic age of blood cells, which is independent of changes in blood cell composition. The extent to which this alteration is a cause or effect of WS disease phenotypes remains unknown.

890 more works