Bayesian Covariance Modeling for Longitudinal Zero-Inflated Count Data
Open Access Publications from the University of California


UCLA Electronic Theses and Dissertations



We develop models for longitudinal count data with a large number of zeros, a feature known as zero-inflation. Familiar distributions for modeling count data (Poisson, binomial, negative binomial) often do not account for the observed frequency of zeros. Further, in longitudinal data, the same subjects are repeatedly measured over time, inducing correlation between sets of measurements on the same individual. Modeling of longitudinal data that does not account for this correlation can give rise to misleading inferences. This dissertation develops three classes of models for longitudinal count data: (i) a Bayesian longitudinal hurdle model for data with prespecified measurement times, (ii) a Bayesian longitudinal hurdle model for data with varying measurement times, and (iii) a multivariate longitudinal zero-inflated Poisson model. Approach (i) is developed for an analysis of the number of days of heavy drinking in a study of screening, brief intervention, and referral to treatment (SBIRT), and approaches (ii) and (iii) are motivated by analyses of the Linking Inmates to Care (LINK LA) study. Building on two-part models that predict non-zero versus zero outcomes while incorporating assumptions about the distribution of non-zero outcomes, the newly developed methods use mixed-effect modeling strategies to account for irregular measurement times and correlated patterns in count data beyond those reflected in random intercept models. The superiority of the proposed methods over random intercept models is established using goodness-of-fit metrics that consider the number of model parameters, and the appeal of modeling multiple count outcomes simultaneously is reflected in Bayesian credible intervals that point to non-zero correlations among the respective count outcomes.
We build upon previous longitudinal zero-inflated and hurdle models by introducing time-varying random effects in the count models, with random effects distributed a priori as multivariate normal with a parameterized covariance matrix. We propose several covariance models, which improve fit over random intercept models in both the SBIRT and LINK LA data. We introduce latent time-varying main and random effects to allow count rates and zero probabilities to change with time since intervention and include exposure offsets to account for varying times over which counts are recorded. Finally, for use with multivariate data, we propose a multivariate longitudinal zero-inflated Poisson model for observations with varying exposure, which we use to simultaneously model three different kinds of doctor visits recorded in the LINK LA study.
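
To make the distinction between the two model families concrete, the following is a minimal sketch of the single-observation log-probabilities for a zero-inflated Poisson model and a hurdle model, each with an exposure offset. It is not the dissertation's implementation: the function names, the parameter names (`rate`, `exposure`, `p_zero`), and the choice of a zero-truncated Poisson for the positive part of the hurdle model are all assumptions made for illustration, and the random effects and covariance structure described above are omitted.

```python
import math


def zip_logpmf(y, rate, exposure, p_zero):
    """Log-probability of count y under a zero-inflated Poisson (illustrative).

    rate     -- expected events per unit of exposure (assumed name)
    exposure -- length of the window over which the count was recorded
    p_zero   -- probability of a structural zero, strictly between 0 and 1
    """
    # Exposure offset: log(mu) = log(rate) + log(exposure)
    mu = rate * exposure
    if y == 0:
        # A zero arises either structurally or from the Poisson component
        return math.log(p_zero + (1.0 - p_zero) * math.exp(-mu))
    log_pois = -mu + y * math.log(mu) - math.lgamma(y + 1)
    return math.log(1.0 - p_zero) + log_pois


def hurdle_logpmf(y, rate, exposure, p_zero):
    """Log-probability of count y under a hurdle model (illustrative).

    Assumes a zero-truncated Poisson for positive counts; the source does
    not specify the count distribution used in its hurdle models.
    """
    mu = rate * exposure
    if y == 0:
        # In a hurdle model every zero comes from the binary part
        return math.log(p_zero)
    log_pois = -mu + y * math.log(mu) - math.lgamma(y + 1)
    # Renormalize the Poisson over y >= 1: divide by P(Y > 0) = 1 - exp(-mu)
    log_positive = math.log1p(-math.exp(-mu))
    return math.log(1.0 - p_zero) + log_pois - log_positive
```

In this sketch the two models differ only in how zeros are generated: the zero-inflated model mixes structural zeros with zeros from the count distribution, while the hurdle model assigns all zeros to the binary part and truncates the count distribution at one.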
