The most basic approach to causal inference measures the response of a system or population to different exposures, or treatments, and compares one or more summaries of the responses. Thinking formally about causal inference requires that we consider what the potential outcomes would have been under a set of alternative exposures. Though it is only possible in practice to observe a single outcome for each unit, the principles of good experimental design, including random treatment assignment, allow a valid comparison of average responses to each exposure because the influence of extraneous factors is balanced, on average, across exposure groups. In observational settings, where exposures or treatments arise "naturally", i.e., without experimental manipulation, a common strategy for estimating causal effects is to find units that are similar on a set of covariates but received different exposures, and then compare their outcomes. This strategy is challenging when there are many covariates. Balancing scores, low-dimensional summaries of the relevant covariate space, can facilitate causal inference from observational data in such settings. Propensity scores, which measure the probability of receiving a particular exposure or treatment, are one example of a balancing score. To estimate treatment effects, balancing scores are used to group individuals from different exposure groups so that their response levels can be compared, or functions of balancing scores are used to re-weight the sample. This thesis explores novel methods for obtaining covariate balance in observational studies through the use of balancing scores and weighting methodology.

The dissertation begins by providing an overview of the potential outcome framework for causal inference in observational studies and the background knowledge required for the methods developed. The first methodological contribution is the optimally balanced Gaussian process propensity score approach, which applies a binary regression framework using Gaussian processes to estimate the propensity score. The hyperparameters of the process are selected to minimize a metric of covariate imbalance. The next methodological contribution is the development of targeted balancing weights for both binary and multi-treatment settings. Here, a covariate imbalance metric is defined with respect to a covariate density of interest (for example, the distribution within the full population under study or within a specific subpopulation of interest), and unit weights are selected to minimize this metric without an explicit assumption on the functional form of the weights. Each method is evaluated against competing methods from the causal inference literature through a series of simulations and on a benchmark causal inference data set. The dissertation concludes with suggestions for future work. A contribution to measurement in observational studies is included as an appendix.