eScholarship
Open Access Publications from the University of California

Department of Statistics, UCLA

Open Access Policy Deposits

This series is automatically populated with publications deposited by UCLA Department of Statistics researchers in accordance with the University of California’s open access policies. For more information see Open Access Policy Deposits and the UC Publication Management System.

Introduction to Special Edition: The Future of the Textbook

(2013)

A brief overview of the papers and commentaries in this special edition.

A Statistical Analysis of Santa Barbara Ambulance Response in 2006: Performance Under Load

(2009)

Ambulance response times in Santa Barbara County for 2006 are analyzed using point process techniques, including kernel intensity estimates and K-functions. Clusters of calls result in significantly higher response times, and this effect is quantified. In particular, calls preceded by other calls within 20 km and within the previous hour are significantly more likely to result in violations. This effect appears to be especially pronounced within semi-rural neighborhoods.

[WestJEM. 2009;10:42-47.]
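
As a rough illustration of the "under load" condition described in this abstract (a call preceded by another call within 20 km and within the previous hour), the following minimal Python sketch flags such calls. It is not the authors' code: the `Call` record, the function names, the use of great-circle (haversine) distance, and the sample coordinates are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Hypothetical call record: time in hours, location in decimal degrees."""
    t_hours: float
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (assumed distance measure, Earth radius 6371 km)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def under_load(calls, max_km=20.0, max_hours=1.0):
    """Return one flag per call, in time order: True if an earlier call
    occurred within max_hours before it and within max_km of it."""
    ordered = sorted(calls, key=lambda c: c.t_hours)
    flags = []
    for i, cur in enumerate(ordered):
        flagged = False
        # Walk backwards in time; stop once earlier calls fall outside the window.
        for j in range(i - 1, -1, -1):
            prev = ordered[j]
            if cur.t_hours - prev.t_hours > max_hours:
                break
            if haversine_km(cur.lat, cur.lon, prev.lat, prev.lon) <= max_km:
                flagged = True
                break
        flags.append(flagged)
    return flags


# Made-up example data, not taken from the study.
sample_calls = [
    Call(0.0, 34.42, -119.70),
    Call(0.5, 34.43, -119.71),   # close in space and time to the first call
    Call(3.0, 34.42, -119.70),   # nearby, but hours later
    Call(3.2, 34.95, -120.43),   # recent, but far away
]
sample_flags = under_load(sample_calls)
```

The resulting flags could then be compared against response-time violations; the point process analysis in the paper (kernel intensity estimates, K-functions) goes well beyond this simple indicator.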

A Bayesian Model for 20th Century Antarctic Sea Ice Extent Reconstruction

(2024)

Abstract: Antarctic sea ice, a key component in the complex Antarctic climate system, is an important driver and indicator of the global climate. In the relatively short satellite‐observed period from 1979 to 2022 the sea ice extent has continuously increased (contrasting a major decrease in Arctic sea ice) up to a dramatic decrease between 2014 and 2017. Recent years have seen record sea ice lows in February 2022–February 2023. We use a statistical ensemble reconstruction of Antarctic sea ice to put the observed changes into the historical context of the entire 20th century. We propose a seasonal Vector Auto‐Regressive Moving Average (VARMA) model fit in a Bayesian framework using regularized horseshoe priors on the regression coefficients to create a stochastic ensemble reconstruction of monthly Antarctic sea ice extent from 1900 to 1979. This novel model produces a set of 2,500 plausible sea ice extent reconstructions for the sea ice by sector that incorporate the autocorrelation structure of sea ice over time as well as the dependence of sea ice between the sectors. These fully observed reconstructions exhibit plausible month‐to‐month changes in reconstructed sea ice as well as plausible interactions between the sectors and the total. We reconstruct an overall higher sea ice extent earlier in the 20th century with a relatively sharp decline in the 1970s. These trends agree well with previous reconstructions of Antarctic sea ice based on ice core data, whaling locations, and climatological data, as well as early satellite observations in the reconstruction period.

Multivariate spatiotemporal functional principal component analysis for modeling hospitalization and mortality rates in the dialysis population.

(2024)

Dialysis patients experience frequent hospitalizations and a higher mortality rate compared to other Medicare populations, in whom hospitalizations are a major contributor to morbidity, mortality, and healthcare costs. Patients also typically remain on dialysis for the duration of their lives or until kidney transplantation. Hence, there is growing interest in studying the spatiotemporal trends in the correlated outcomes of hospitalization and mortality among dialysis patients as a function of time starting from transition to dialysis across the United States. Utilizing national data from the United States Renal Data System (USRDS), we propose a novel multivariate spatiotemporal functional principal component analysis model to study the joint spatiotemporal patterns of hospitalization and mortality rates among dialysis patients. The proposal is based on a multivariate Karhunen-Loève expansion that describes leading directions of variation across time and induces spatial correlations among region-specific scores. An efficient estimation procedure is proposed using only univariate principal components decompositions and a Markov Chain Monte Carlo framework for targeting the spatial correlations. The finite sample performance of the proposed method is studied through simulations. Novel applications to the USRDS data highlight hot spots across the United States with higher hospitalization and/or mortality rates and time periods of elevated risk.
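
For orientation, a truncated multivariate Karhunen-Loève expansion of the kind this abstract refers to can be written in a generic form as below. The notation is an assumption made here for illustration and is not taken from the paper: $i$ indexes regions, $j \in \{1, 2\}$ indexes the two outcomes (hospitalization and mortality), $t$ is time since transition to dialysis, and $K$ is the number of retained components.

```latex
% Generic truncated multivariate Karhunen-Loeve expansion (illustrative notation).
X_i^{(j)}(t) \;\approx\; \mu^{(j)}(t) \;+\; \sum_{k=1}^{K} \xi_{ik}\, \psi_k^{(j)}(t),
\qquad j = 1, 2,
```

Here $\mu^{(j)}$ is the mean function of outcome $j$, $\psi_k = (\psi_k^{(1)}, \psi_k^{(2)})$ is the $k$-th multivariate eigenfunction (a leading direction of variation across time shared by both outcomes), and $\xi_{ik}$ is the score of region $i$ on component $k$. In a spatiotemporal model of this type, the spatial correlation would enter through the dependence of the scores $\xi_{ik}$ across neighboring regions $i$.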

Anthropogenic Intensification of Cool‐Season Precipitation Is Not Yet Detectable Across the Western United States

(2024)

Abstract: The cool season (November–March) of 2022–2023 was exceptional in the western United States (US), with the highest precipitation totals in ≥128 years in some areas. Recent precipitation extremes and expectations based on thermodynamics motivate us to evaluate the evidence for an anthropogenic intensification of western US cool‐season precipitation to date. Over cool seasons 1951–2023, trends in precipitation totals on the wettest cool‐season days were neutral or negative across the western US, and significantly negative in northern California and parts of the Pacific Northwest, counter to the expected net intensification effect from anthropogenic forcing. Multiple reanalysis data sets indicate a corresponding lack of increase in moisture transports into the western US, suggesting that atmospheric circulation trends over the North Pacific have counteracted the increases in atmospheric moisture expected from warming alone. The lack of precipitation intensification to date is generally consistent with climate model simulations. A large ensemble of 648 simulations from 35 climate models suggests it is too soon to detect anthropogenic intensification of precipitation across much of the western US. In California, the 35‐model median time of emergence for intensification of the wettest days is 2080 under a mid‐level emissions scenario. On the other hand, observed reductions of precipitation extremes in California and the Pacific Northwest are near the lower edge of the large ensemble of simulated trends, calling into question model representation of western US precipitation variability.

scCDC: a computational method for gene-specific contamination detection and correction in single-cell and single-nucleus RNA-seq data

(2024)

In droplet-based single-cell and single-nucleus RNA-seq assays, systematic contamination of ambient RNA molecules biases the quantification of gene expression levels. Existing methods correct the contamination for all genes globally. However, a specific evaluation of correction efficacy across varying contamination levels is lacking. Here, we show that DecontX and CellBender under-correct highly contaminating genes, while SoupX and scAR over-correct lowly/non-contaminating genes. We develop scCDC as the first method to detect the contamination-causing genes and only correct expression levels of these genes, some of which are cell-type markers. Compared with existing decontamination methods, scCDC excels in decontaminating highly contaminating genes while avoiding over-correction of other genes.

A genome-wide spectrum of tandem repeat expansions in 338,963 humans

(2024)

The Genome Aggregation Database (gnomAD), widely recognized as the gold-standard reference map of human genetic variation, has largely overlooked tandem repeat (TR) expansions, despite the fact that TRs constitute ∼6% of our genome and are linked to over 50 human diseases. Here, we introduce the TR-gnomAD (https://wlcb.oit.uci.edu/TRgnomAD), a biobank-scale reference of 0.86 million TRs derived from 338,963 whole-genome sequencing (WGS) samples of diverse ancestries (39.5% non-European samples). TR-gnomAD offers critical insights into ancestry-specific disease prevalence using disparities in TR unit number frequencies among ancestries. Moreover, TR-gnomAD is able to differentiate common, presumably benign TR expansions, which are prevalent in TR-gnomAD, from potentially pathogenic TR expansions, which are found more frequently in disease groups than within TR-gnomAD. Together, TR-gnomAD is an invaluable resource for researchers and physicians to interpret TR expansions in individuals with genetic diseases.