Case Diagnostics in Categorical Factor Analysis
- Author(s): Mansolf, Maxwell Armand
- Advisor(s): Reise, Steven P
- et al.
Case diagnostics in categorical factor analysis include Mahalanobis distance-based statistics, which measure residual and leverage, and adaptations of existing influence diagnostics such as individual contribution to chi-square and generalized Cook’s distance, which measure each case’s influence on statistical results. This dissertation uses two simulation studies to explore issues related to the use of case diagnostics in categorical factor analysis in order to assess the feasibility and utility of an iteratively reweighted least squares estimator for categorical factor analysis and structural equation modeling. In the first simulation, I used large data sets simulated according to a hypothesized model structure to examine the null distributions of Mahalanobis distance-based measures of residual and leverage in categorical factor analysis. Specifically, this study examined the validity of statistical cut-off values derived from continuous distributions in categorical factor analysis and assessed the differences between theoretical and empirical critical values in these models. In most conditions, the distributions of leverage and residual diagnostics in polytomous data, and of leverage diagnostics in dichotomous data, were similar enough to those in continuous data that existing critical values can safely be used to identify high-leverage cases. In contrast, residual diagnostics in dichotomous data had severely truncated distributions, a result that complicates the choice of critical value for identifying high-residual cases in residual analysis or down-weighting cases in robust estimation. In the second simulation, I examined the relationships between leverage, residual, and influence in categorical and continuous factor analysis and compared those relationships across continuous, polytomous, and dichotomous test conditions.
Results were largely consistent between continuous and polytomous data but differed markedly in dichotomous data, with high variability across dichotomous test conditions. Together, these findings reveal that, while categorical case diagnostics are well-behaved in polytomous tests under ideal conditions, these diagnostics can behave unpredictably in dichotomous data. Caution is therefore warranted when interpreting their values directly in dichotomous tests, whether as a means of screening for outliers or of down-weighting cases in robust estimation.
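As a rough illustration of what a "critical value derived from a continuous distribution" means here, the sketch below computes a squared Mahalanobis distance for a two-dimensional score and compares it to a chi-square cut-off. It is not the dissertation's implementation: the function names (`mahalanobis_sq`, `chi2_critical_df2`, `flag_cases`), the two-dimensional setup, and the `alpha` level are assumptions chosen for illustration. It restricts itself to 2 degrees of freedom because the chi-square critical value then has a closed form.

```python
import math


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point (hypothetical helper).

    Uses the closed-form inverse of a 2x2 covariance matrix.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    # inverse(cov) = (1 / det) * [[d, -b], [-c, a]]
    return (dx0 * (d * dx0 - b * dx1) + dx1 * (-c * dx0 + a * dx1)) / det


def chi2_critical_df2(alpha):
    """Upper-tail critical value of a chi-square with 2 df.

    For 2 df the CDF is 1 - exp(-x / 2), so the critical value is -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)


def flag_cases(points, mean, cov, alpha=0.01):
    """Flag cases whose squared distance exceeds the theoretical cut-off.

    Assumes the distances are approximately chi-square distributed, which is
    the continuous-data assumption whose validity the abstract questions for
    residual diagnostics in dichotomous data.
    """
    cutoff = chi2_critical_df2(alpha)
    return [mahalanobis_sq(p, mean, cov) > cutoff for p in points]
```

If the empirical distribution of a diagnostic is truncated, as the abstract reports for residual diagnostics in dichotomous data, a theoretical cut-off of this kind may rarely or never be exceeded, which is one way to read the difficulty of choosing a critical value in that setting.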