Hierarchical and semi-parametric Bayesian models for the study of longitudinal HIV behavior and tuberculosis incidence data
- Zhu, Yuda
- Advisor(s): Weiss, Robert E
Abstract
We propose and discuss two distinct Bayesian models. In the first, we propose a replacement for standard statistical methodologies for longitudinal sexual behavior data. HIV intervention trials generally collect sexual behavior data repeatedly over time and involve multiple outcomes, including the number of partners, which is nested within subjects, and the numbers of protected and unprotected sex acts with each partner, which are inherently nested within partners. The data are further complicated by characteristics of both partners and acts: partners can be HIV$^+$ or HIV$^-$, while sex acts can be protected or unprotected. Properly modeling these outcomes and distinguishing these characteristics is critical. Here we use a multilevel multivariate Bayesian model for sexual behavior outcomes. The proposed model accounts for the full complexity of sexual behavior: it simultaneously models the number of partners and the number of sex acts with each partner, differentiates behavior by partner serostatus, accounts for study eligibility criteria associated with the outcome of interest, and captures correlations between observations on the same subject, observations on the same partner, and observations across time. We further show that the proposed model can be used to quantify and draw inference on seroadaptive behaviors, that is, behaviors that vary with the HIV status of partners with the goal of reducing the risk of transmission. The model is used to analyze data from the Healthy Living Project.
In the second half of this thesis, we explore a novel extension of the Dirichlet process mixture (DPM) model to accommodate longitudinal data. Longitudinal data are characterized by two features. First, the data are a function of time, implying dependence between sampling densities across time. Second, the \emph{same} subjects are repeatedly measured over time. The standard DPM model is a nonparametric Bayesian model that naturally clusters similar observations together and assigns a single value to each cluster. It can be used to model an unknown density but addresses neither of these two features of longitudinal data. A number of current extensions of the DPM model accommodate dependent distributions, which could be used to model the sampling distributions at each time point and so address the first feature. However, the assumptions in these extensions mean that they do not take advantage of the second feature, that the \emph{same} subjects are followed over time. To account for both features, we propose the cluster memory Dirichlet process mixture (cmDPM) model, which extends the DPM model to properly accommodate longitudinal data. In the cmDPM model, subjects are modeled with a DPM model at baseline. Cluster assignments at future time points depend on where the subject was previously clustered: each subject may retain their cluster from the previous time point with some nonzero probability. This implies that at later times subjects are no longer exchangeable, and their observed values depend on their previous clustering history. Clusters that are retained over time evolve through a time-dependent process. The cmDPM model thus extends the DPM to use both the information on where the subject was previously clustered and the value assigned to that cluster to model subject data at the current time point.
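The cluster memory idea can be illustrated with a small forward simulation. The following is only a minimal sketch under stated assumptions, not the thesis's actual specification: the retention probability `rho`, the concentration parameter `alpha`, the Gaussian random walk with standard deviation `drift_sd` for retained clusters, the standard normal draw for new cluster values, and the function names are all hypothetical choices made for illustration. Subjects are seated by a Chinese restaurant process at baseline; at each later time point a subject keeps the previous cluster with probability `rho` and is otherwise re-seated among the clusters present at that time.

```python
import random


def crp_draw(counts, alpha, rng):
    """Pick an existing label with probability proportional to its size,
    or return None (open a new cluster) with probability proportional to alpha."""
    u = rng.random() * (sum(counts.values()) + alpha)
    acc = 0.0
    for label, n in counts.items():
        acc += n
        if u < acc:
            return label
    return None


def simulate_cluster_memory(n_subjects, n_times, alpha=1.0, rho=0.8,
                            drift_sd=0.1, seed=0):
    """Simulate cluster labels and cluster values over time.

    Assumed (illustrative) dynamics: retention with probability rho,
    Gaussian random-walk evolution of retained cluster values, and
    N(0, 1) values for newly opened clusters.
    """
    rng = random.Random(seed)
    next_label = 0                 # labels are never reused across time
    labels = [None] * n_subjects
    values = {}
    label_history, value_history = [], []
    for t in range(n_times):
        # At baseline nobody has a cluster to keep; later, keep with prob. rho.
        keep = [t > 0 and rng.random() < rho for _ in range(n_subjects)]
        counts = {}
        for i in range(n_subjects):
            if keep[i]:
                counts[labels[i]] = counts.get(labels[i], 0) + 1
        # Clusters retained by at least one subject evolve over time.
        values = {k: values[k] + rng.gauss(0.0, drift_sd) for k in counts}
        # Everyone else is re-seated by a CRP over the current clusters.
        for i in range(n_subjects):
            if keep[i]:
                continue
            k = crp_draw(counts, alpha, rng)
            if k is None:
                k = next_label
                next_label += 1
                values[k] = rng.gauss(0.0, 1.0)
            counts[k] = counts.get(k, 0) + 1
            labels[i] = k
        label_history.append(list(labels))
        value_history.append(dict(values))
    return label_history, value_history


if __name__ == "__main__":
    label_history, value_history = simulate_cluster_memory(10, 3, seed=42)
    for t, row in enumerate(label_history):
        print(t, row)
```

Setting `rho` to 1 freezes the baseline clustering, while setting it to 0 re-seats every subject at each time point, so no cluster identity is carried forward by any subject; intermediate values give the partial dependence on clustering history described above.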