When Quality Matters: Linking the Reliability of Demographic and Health Survey Data to Biases in International Mortality, Fertility, and Family Planning Estimates
In countries without reliable vital registration systems – the majority of low- and middle-income countries – most vital statistics estimates are based on nationally representative household survey data. Such surveys are usually implemented under the USAID-funded Demographic and Health Surveys (DHS) project. Because DHS data are so widely used, their quality is paramount if countries are to monitor population growth and health and track progress toward international development goals. This dissertation provides a detailed interrogation of DHS data quality in the areas of fertility, child mortality, and contraceptive use.
The first chapter examines linkages between questionnaire length and data quality. I analyze 238 DHS surveys to ascertain whether changes in the DHS survey instrument – predominantly increases in the length and complexity of the core questionnaire over time – have led to poorer data quality and thus to biased fertility and child mortality rates. I explain the likely causes and consequences of one measure of data quality: birth displacement, disaggregated by child survival status. I examine differences in displacement by DHS survey characteristics, including the average number of non-missing variables per woman interviewed in each survey (a proxy measure of questionnaire length) and the presence of modules including HIV biomarker testing. Results indicate substantial birth displacement in the majority of DHS surveys, and disproportionate displacement of deceased children compared to surviving children. Increases in birth displacement, and in the differential displacement of deceased children, are associated with increases in questionnaire length. This differential displacement likely biases recent estimates of infant and under-five mortality rates downward, which in turn leads to overestimates of recent declines in these indicators.
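As a rough illustration of how displacement around a cutoff year can be quantified, the sketch below computes a birth-year ratio of the kind reported in DHS data quality tables, 100 × 2·B_x / (B_{x-1} + B_{x+1}), where B_x is the number of births reported for calendar year x. A ratio well below 100 in the first year inside the cutoff, paired with a ratio well above 100 in the year just before it, is consistent with births having been shifted backward. The function name and all counts are hypothetical, and this is not necessarily the displacement measure used in the chapter.

```python
def birth_year_ratio(births_by_year, year):
    """Return 100 * 2*B_x / (B_{x-1} + B_{x+1}) for calendar year x.

    births_by_year maps calendar year -> number of reported births.
    A value near 100 suggests no heaping or deficit in that year.
    """
    neighbours = births_by_year[year - 1] + births_by_year[year + 1]
    if neighbours == 0:
        raise ValueError("no births reported in the adjacent years")
    return 100.0 * 2 * births_by_year[year] / neighbours


# Hypothetical counts for a survey whose cutoff year is 2010
# (illustrative numbers only, not taken from any DHS survey).
surviving = {2008: 1000, 2009: 1100, 2010: 900, 2011: 1000}
deceased = {2008: 100, 2009: 130, 2010: 70, 2011: 100}

for label, counts in (("surviving", surviving), ("deceased", deceased)):
    before = birth_year_ratio(counts, 2009)  # year just before the cutoff
    after = birth_year_ratio(counts, 2010)   # first year inside the cutoff
    print(f"{label}: ratio before cutoff {before:.1f}, after cutoff {after:.1f}")
```

Computing the ratio separately for surviving and deceased children, as here, is one simple way to check whether displacement differs by survival status.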
The second chapter focuses on the quality of data acquired through one section of the DHS questionnaire: the reproductive calendar, in which women are asked to recall their births, pregnancies, terminations, and episodes of contraceptive use over the previous 5-7 years. I compare retrospective contraceptive prevalence rates (CPR) tabulated from the calendar to independently estimated current status CPR from a prior survey for the same point in time among women in the same age groups. The chapter compares estimates of the total CPR as well as the prevalence of each specific contraceptive method for 106 pairs of surveys conducted in 37 countries. I find that calendar data appear to underestimate contraceptive use in most comparisons, often substantially. Total contraceptive prevalence is reported at statistically significantly different levels in 74 percent of survey pairs analyzed. The average difference in CPR was 4.1 percentage points, resulting in an average discrepancy of 15 percent between the current use CPR and that estimated from retrospective calendar data for the same point in time.
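The distinction between a difference in percentage points and a relative discrepancy in percent can be made concrete with a minimal sketch for one survey pair. It assumes unweighted counts and simple random sampling, so the z statistic ignores the sampling weights and design effects that a real DHS analysis would need, and it assumes the relative discrepancy is expressed against the current status CPR. The function name and the counts are hypothetical.

```python
import math


def compare_cpr(users_current, n_current, users_calendar, n_calendar):
    """Compare a current-status CPR with a calendar-based CPR.

    Returns (difference in percentage points,
             relative discrepancy in percent of the current-status CPR,
             two-proportion z statistic under simple random sampling).
    """
    p_current = users_current / n_current
    p_calendar = users_calendar / n_calendar
    diff_pp = 100.0 * (p_current - p_calendar)
    relative_pct = 100.0 * (p_current - p_calendar) / p_current
    # Pooled standard error; an assumption that ignores survey design.
    pooled = (users_current + users_calendar) / (n_current + n_calendar)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_current + 1 / n_calendar))
    return diff_pp, relative_pct, (p_current - p_calendar) / se


# Hypothetical pair: 30% current-status CPR versus 26% from the calendar.
diff_pp, relative_pct, z = compare_cpr(1500, 5000, 1300, 5000)
print(f"difference: {diff_pp:.1f} points, relative: {relative_pct:.1f}%, z = {z:.2f}")
```

A positive difference here means the calendar reports lower use than the earlier current status estimate for the same point in time.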
The third chapter builds on the findings from Chapter 2, using the comparisons between retrospective calendar data and current status data, together with other data quality indicators, to select 16 surveys in which reproductive calendar data appear to be reliable. Contraceptive use data from these 16 surveys were pooled to form a sample of 140,529 episodes of contraceptive use collected from 97,094 women’s reproductive histories. I use this pooled dataset to estimate cumulative 12-month contraceptive failure rates for each of the most widely used contraceptive methods. Correlates of contraceptive failure are examined using multilevel survival models. I find that contraceptive failure rates are generally higher when calculated from surveys with reliable data than the median estimates across all DHS surveys, suggesting that surveys with unreliable calendars underestimate contraceptive failure rates. Contraceptive failure rates vary widely by age, with adolescent women experiencing the highest failure rates. Failure also appears to be associated with socio-economic status, suggesting that the youngest and poorest women are at highest risk of experiencing unintended pregnancy.
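A cumulative 12-month failure rate of this kind is commonly obtained from a single-decrement life table over episodes of use. The sketch below shows that calculation under simplifying assumptions: episodes are unweighted, durations are in completed months, and an episode that ends for any reason other than failure is treated as censored at the end of its final month. The function name and the episode data are hypothetical, and the sketch does not reproduce the multilevel survival models used in the chapter.

```python
def cumulative_failure_rate(episodes, months=12):
    """Life-table cumulative failure probability over the first `months`.

    episodes: iterable of (duration_in_months, failed) pairs, where
    `failed` is True if the episode ended in contraceptive failure
    during its final month and False if it was censored or ended
    for another reason.
    """
    episodes = list(episodes)
    survival = 1.0
    for month in range(1, months + 1):
        # Episodes still in use at the start of this month.
        at_risk = sum(1 for duration, _ in episodes if duration >= month)
        if at_risk == 0:
            break
        failures = sum(
            1 for duration, failed in episodes if failed and duration == month
        )
        # Multiply by the probability of not failing in this month.
        survival *= 1.0 - failures / at_risk
    return 1.0 - survival


# Hypothetical episodes: one failure in month 3, three episodes censored.
episodes = [(3, True), (6, False), (12, False), (12, False)]
print(f"12-month failure rate: {cumulative_failure_rate(episodes):.3f}")
```

Failures after month 12 do not enter the rate, but those episodes still count toward the population at risk during the first 12 months.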