Worldwide, particularly in areas without access to treatment or antenatal programs, approximately 1600 children are diagnosed with human immunodeficiency virus (HIV) every day,(5) and over 300,000 children die from acquired immune deficiency syndrome (AIDS) each year.(6) The Centers for Disease Control and Prevention (CDC) estimated that 142 children less than 13 years old were infected with HIV perinatally in 2005,(7) while the World Health Organization (WHO) estimates that 2 million children (0-14 years) are living with HIV globally (1.8 million in Sub-Saharan Africa alone).(8) Epidemiologists and biostatisticians are actively trying to estimate the causal effects of highly active antiretroviral therapy (HAART) in order to establish which treatments are best and when they should be initiated. This is a challenging task for several reasons, including the unique dynamics of pediatric HIV populations and the lack of randomized evidence. However, given the abundance of observational data, analytical approaches have been developed to help researchers establish causal effects from observational studies; these are referred to in the present studies as causal inference techniques. In this dissertation, I perform a systematic review of studies that used so-called causal inference methods (i.e., propensity scores, instrumental variables, marginal structural models, and structural equation models) in the context of HIV/AIDS research and assess the interpretability and content of the identified studies. I then present empirical examples of a marginal structural model (MSM) analysis and an instrumental variable (IV) analysis using Pediatric Spectrum of Disease (PSD) surveillance program data. Specifically, I estimate the causal effect of triple therapy (e.g., HAART) on time to C diagnosis and on time to C diagnosis or death among HIV-infected children, and I perform an adapted instrumental variable analysis to estimate the causal effect of HAART on the hazard of AIDS events or death.
The systematic review revealed that approximately 43% of all studies using causal inference methods on HIV/AIDS data were published in 2007 and 2008. Studies using MSMs were less likely to discuss specific model selection than studies using any other causal inference method (OR = 0.26; 95% CI 0.08, 0.72). Using a g-computation approach, where I define Ψ1(p0)(tk) ≡ P(T1 > tk) as the survival probability had all children been treated and Ψ0(p0)(tk) ≡ P(T0 > tk) as the survival probability had all children been untreated, the estimated effect of HAART in delaying a C diagnosis appeared stronger among children who initiated therapy within 6 months of birth, ΨHZ(p0)(tk) = -0.466 (95% CI -1.20, 0.565), than among children who initiated therapy within 12 months of birth, ΨHZ(p0)(tk) = -0.321 (95% CI -1.151, 0.300). Additionally, though not statistically significant, the effect of triple therapy initiated within the first 12 months of life on time to C diagnosis is potentially greater among symptomatic children (12 months: ΨHZ,symptomatic(p0)(t36) = -0.587; 95% CI -1.217, 0.480) than among asymptomatic children (12 months: ΨHZ,asymptomatic(p0)(t36) = -0.106; 95% CI -1.054, 0.739). In the instrumental variable analysis, with calendar era defined by an early (1997) cut-off serving as the instrument, the naïve rate ratio comparing the non-HAART era with the HAART era was RRITT = 2.17 (95% CI 1.34, 3.52). Because calendar era misclassifies actual HAART use, an instrumental variable estimator was used, yielding RRIV = 3.91 (95% CI 2.41, 6.34), 80% higher than the naïve result.
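As an illustration of the g-computation idea behind Ψ1 and Ψ0, the following minimal sketch standardizes stratum-specific survival at a single time point tk over one binary baseline covariate W. It is not the analysis performed in this dissertation: the counts are hypothetical (not PSD data), the function name is invented for illustration, and the sketch assumes no censoring and that W is the only confounder.

```python
# Toy illustration of the g-computation (standardization) formula at one
# time point t_k. All counts below are hypothetical, not PSD data.

# Each row: (W, A, n_children, n_event_free_at_tk)
# W = 1 if symptomatic at baseline (assumed sole confounder), A = 1 if treated.
strata = [
    (0, 0, 100, 70),
    (0, 1, 50, 45),
    (1, 0, 60, 24),
    (1, 1, 90, 63),
]


def g_computation_survival(strata, a):
    """Return sum over w of P(T > t_k | A = a, W = w) * P(W = w)."""
    total = sum(n for _, _, n, _ in strata)
    psi = 0.0
    for w in sorted({row[0] for row in strata}):
        # Marginal weight of stratum w, pooled over both treatment groups.
        n_w = sum(n for w_i, _, n, _ in strata if w_i == w)
        # Observed survival in the (A = a, W = w) cell.
        n_aw, surv_aw = next(
            (n, s) for w_i, a_i, n, s in strata if w_i == w and a_i == a
        )
        psi += (surv_aw / n_aw) * (n_w / total)
    return psi


psi_1 = g_computation_survival(strata, a=1)  # survival had all been treated
psi_0 = g_computation_survival(strata, a=0)  # survival had all been untreated
risk_difference = psi_1 - psi_0

print(f"psi_1 = {psi_1:.3f}, psi_0 = {psi_0:.3f}, "
      f"difference = {risk_difference:.3f}")
```

In this toy example the standardized contrast differs from the crude comparison of treated and untreated children because treatment is more common among symptomatic children; the standardized quantities have a causal interpretation only under the identifiability assumptions (consistency, no unmeasured confounding, and positivity).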
Regardless of year of publication, all of the reviewed HIV studies were deficient to varying degrees in all assessed areas. Researchers using causal inference methods should describe their methods in a more transparent and interpretable way so that the results may reach a wider audience. Together, Chapters 3 and 4 use causal inference methods both to help establish the effectiveness of HAART in preventing advanced disease and/or mortality and to address the need to establish the optimal timing of treatment initiation for treatment guidelines. The overarching benefits of these methods are that they define a parameter of interest that does not depend on a particular model assumption (semi-parametric) and that they state explicit identifiability assumptions under which the estimators produce estimates of so-called causal associations, which are related to distributions of counterfactuals.