Bayesian Estimation of Finite Population Quantities from Spatially Correlated Data under Ignorable and Nonignorable Survey Designs
- Author(s): Chan-Golston, Alec Michael
- Advisor(s): Banerjee, Sudipto
- et al.
Geographically referenced data has become increasingly common in many fields of study, such as public health, education, forestry, medicine, and agriculture. When data is sampled from a population, there is often knowledge about the unsampled units, such as a total count and simple demographics. This knowledge can be leveraged to estimate finite population quantities, such as the population total or mean, using design-based or model-based estimators. However, it is unknown how these estimators perform in the presence of spatial correlation, that is, when the sampled outcome is assumed to be a partial realization of a spatial process. This dissertation first presents an analysis predicting store patronage and fruit and vegetable expenditures during a corner store intervention using Bayesian spatial techniques, and then presents a brief example of finite population estimation in an ignorable sampling setting. Next, a general Bayesian framework is presented that accounts for both study design and spatial association. Under this framework, posterior samples of finite population quantities can be obtained. The framework is first given under the assumption of an ignorable sampling design and is used to construct four models that account for two-stage designs with spatial dependence. These models are first applied to simulated data and then used in an analysis of nitrate levels in California groundwater. We find that models accounting for both study design and spatial association perform best. The general framework is then extended to allow for a nonignorable sampling design, specifically to account for missing data patterns seen in reported annual household income in the corner store data. Through this extension, we are able to construct finite population estimates of the percent of income spent on fruits and vegetables. Such a framework provides a flexible way to account for spatial association and complex study designs in finite populations.