Causal Inference and Model Selection in Complex Settings
Propensity score methods have become a part of the standard toolkit for applied researchers who wish to ascertain causal effects from observational data. While they were originally developed for binary treatments, several researchers have proposed generalizations of the
propensity score methodology for non-binary treatment regimes. In this article, we first review three main methods that generalize propensity scores in this direction, namely, inverse propensity weighting (IPW), the propensity function (P-FUNCTION), and the generalized propensity score (GPS), along with recent extensions of the GPS that aim to improve its
robustness. We compare the assumptions, theoretical properties, and empirical performance of these methods. We propose three new methods that provide robust causal estimation based on the P-FUNCTION and GPS. While our proposed P-FUNCTION-based estimator performs well, we generally advise caution, since all available methods can be biased by model misspecification and extrapolation.
In a related line of research, we consider adjustment for posttreatment covariates in causal inference. Even in a randomized experiment, units may exhibit different compliance behavior under treatment and control assignment. This posttreatment covariate cannot be adjusted for using standard statistical methods. We review the principal stratification framework, which allows this effect to be modeled within a Bayesian hierarchical model. We generalize the current model to allow adjustment for
pretreatment covariates. We also propose a new estimator of the average treatment effect over the entire population.
In a third line of research, we discuss the spectral line detection problem in high-energy astrophysics. We carefully review how this problem can be statistically formulated as a precise hypothesis test with a point null hypothesis, why the usual likelihood ratio test does not apply to problems of this nature, and a practical fix that correctly quantifies the p-value using the
likelihood ratio test statistic via posterior predictive p-values. However, because p-values (including posterior predictive p-values) tend to overstate the evidence for the alternative hypothesis in precise hypothesis testing, we review a Bayesian alternative for the line detection problem based on the Bayes factor. Although Bayes factors are generally criticized
for being sensitive to the choice of prior distributions, we show that such prior dependence can reflect different scientific questions and thus be sensible. In fact, p-values have a similar "subjective influence" in that testing for the existence of a line at a fixed location or anywhere within a broad range can lead to very different conclusions. This is usually known as the
look-elsewhere effect in astrophysics.