This dissertation focuses on developing statistical models and methods to address data-analytic challenges in astrostatistics---a growing interdisciplinary field fostering collaborations between statisticians and astrophysicists. The astrostatistics projects we tackle can be divided into two main categories: modeling solar activity and Bayesian analysis of stellar evolution. These categories from Part I and Part II of this dissertation, respectively.
The first line of research we pursue involves classification and modeling of evolving solar features. Advances in space-based observatories are increasing both the quality and quantity of solar data, primarily in the form of high-resolution images. To analyze massive streams of solar image data, we develop a science-driven dimension reduction methodology to extract scientifically meaningful features from images. This methodology utilizes mathematical morphology to produce a concise numerical summary of the magnetic flux distribution in solar ``active regions'' that (i) is far easier to work with than the source images, (ii) encapsulates scientifically relevant information in a more informative manner than existing schemes (i.e., manual classification schemes), and (iii) is amenable to sophisticated statistical analyses.
In a related line of research, we perform a Bayesian analysis of the solar cycle using multiple proxy variables, such as sunspot numbers. We take advantage of patterns and correlations among the proxy variables to model solar activity using data from proxies that have become available more recently, while also taking advantage of the long history of observations of sunspot numbers. This model is an extension of the Yu et al. (2012) Bayesian hierarchical model for the solar cycle that used the sunspot numbers alone. Since proxies have different temporal coverage, we devise a multiple imputation scheme to account for missing data. We find that incorporating multiple proxies reveals important features of the solar cycle that are missed when the model is fit using only the sunspot numbers.
In Part II of this dissertation we focus on two related lines of research involving Bayesian analysis of stellar evolution. We first focus on modeling multiple stellar populations in star clusters. It has long been assumed that all star clusters are comprised of single stellar populations---stars that formed at roughly the same time from a common molecular cloud. However, recent studies have produced evidence that some clusters host multiple populations, which has far-reaching scientific implications. We develop a Bayesian hierarchical model for multiple-population star clusters, extending earlier statistical models of stellar evolution (e.g., van Dyk et al. 2009, Stein et al. 2013). We also devise an adaptive Markov chain Monte Carlo algorithm to explore the complex posterior distribution. We use numerical studies to demonstrate that our method can recover parameters of multiple-population clusters, and also show how model misspecification can be diagnosed. Our model and computational tools are incorporated into an open-source software suite known as BASE-9. We also explore statistical properties of the estimators and determine that the influence of the prior distribution does not diminish with larger sample sizes, leading to non-standard asymptotics.
In a final line of research, we present the first-ever attempt to estimate the carbon fraction of white dwarfs. This quantity has important implications for both astrophysics and fundamental nuclear physics, but is currently unknown. We use a numerical study to demonstrate that assuming an incorrect value for the carbon fraction leads to incorrect white-dwarf ages of star clusters. Finally, we present our attempt to estimate the carbon fraction of the white dwarfs in the well-studied star cluster 47 Tucanae.