The coefficient of variation (CV) measures variability relative to the mean, and can be useful when increases in the mean correspond with systematic increases in variability. The univariate CV has been studied and applied extensively, but until recently it was not possible to study the structure of relative covariation occurring in multivariate data. However, Boik and Shirvani (2009) demonstrated that relative covariation could be modeled using estimators they developed to describe the sampling distribution of the CV matrix. This matrix, denoted Ψ, is defined as
Ψ=D_μ^(-1) Σ D_μ^(-1),
where D_μ is a diagonal matrix containing the variable means and Σ is the covariance matrix. The present research builds on this previous work by considering a more general class of structure models of the CV matrix.
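As an illustration of the definition above, the sample analogue of Ψ can be computed by substituting the sample means and the sample covariance matrix. The following is a minimal sketch rather than the estimation procedure used in this research; the function name `cv_matrix` and the simulated log-normal data are assumptions introduced only for the example.

```python
import numpy as np


def cv_matrix(X):
    """Sample CV matrix: D^{-1} S D^{-1}, with D = diag(sample means).

    Hypothetical helper for illustration; X is an (n, p) data array.
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)               # sample means (diagonal of D_mu)
    if np.any(mu == 0):
        raise ValueError("CV matrix is undefined when a mean is zero")
    S = np.cov(X, rowvar=False)       # unbiased sample covariance matrix
    D_inv = np.diag(1.0 / mu)         # inverse of the diagonal mean matrix
    return D_inv @ S @ D_inv


# Simulated positive-valued data (an assumption for the example only).
rng = np.random.default_rng(0)
X = rng.lognormal(mean=1.0, sigma=0.5, size=(200, 3))
psi = cv_matrix(X)
```

The diagonal of the resulting matrix holds the squared univariate CVs of the variables, and the off-diagonal elements hold the covariances scaled by the product of the two means.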
Specifically, we investigated how structural equation models of the CV matrix could be estimated and applied. First, a statistical theory for the estimation and evaluation of structural equation models of CV matrices was developed for both normally and arbitrarily distributed variables using generalized least squares. Computational algorithms were then written to implement the theory and to allow CV models to be estimated. Using these algorithms, a series of simulation studies was conducted to determine the quality of the estimators proposed by Boik and Shirvani (2009) and the quality of the resulting model parameters, standard errors, and test statistics, which rely on those estimators. The simulations considered a range of sample sizes, normal and log-normal data, different numbers of variables, and models with either one or two factors. It was found that Boik and Shirvani's theoretical estimators of the variance of the sampling distribution of Ψ converged very slowly to their expected values and that they were particularly unreliable for log-normal data. That said, the estimation methods relying on these estimators were generally able to estimate factor loadings accurately across conditions, and when the sample sizes were fairly large and the number of variables was small, they also produced reasonably accurate estimates of the variance parameters, standard errors, and test statistics. However, in small samples with large numbers of variables, the variance estimates and the model fit statistics tended to be too low and the standard errors were typically overly conservative. In addition, when the data were log-normal, the model fit statistics were problematic regardless of whether the estimator relied on normal theory. This general pattern of results was observed in both one-factor and two-factor models.
The discussions below address some possible explanations for the estimation problems noted here and propose future work that should be done to better understand and potentially correct these problems.
Some of this work was initiated here and included in a short series of follow-up studies. Specifically, we addressed the numerical stability of the CV matrix and of the covariance matrix of its sampling distribution in terms of condition numbers. It was found that both matrices were typically less stable than their counterparts in structural covariance modeling. In addition, we observed that Winsorizing the data used for the estimation produced modest improvements in numerical stability. It remains to be seen how this might affect model estimation.
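The kind of comparison described above can be sketched as follows. This is an illustrative sketch only: it covers the CV matrix and the covariance matrix but not the covariance matrix of the sampling distribution, and the helper names, the simulated data, and the 5th/95th percentile Winsorizing limits are all assumptions, not the settings used in the follow-up studies.

```python
import numpy as np


def cv_matrix(X):
    """Sample CV matrix D^{-1} S D^{-1} (hypothetical helper)."""
    mu = X.mean(axis=0)
    D_inv = np.diag(1.0 / mu)
    return D_inv @ np.cov(X, rowvar=False) @ D_inv


def winsorize_columns(X, lower=5.0, upper=95.0):
    """Clip each column to its own lower/upper percentiles.

    The percentile limits are assumed values chosen for illustration.
    """
    lo, hi = np.percentile(X, [lower, upper], axis=0)
    return np.clip(X, lo, hi)


# Simulated skewed data (an assumption for the example only).
rng = np.random.default_rng(1)
X = rng.lognormal(mean=1.0, sigma=0.8, size=(300, 4))
X_w = winsorize_columns(X)

# 2-norm condition numbers: larger values indicate less numerical stability.
cond_sigma = np.linalg.cond(np.cov(X, rowvar=False))
cond_psi = np.linalg.cond(cv_matrix(X))
cond_psi_w = np.linalg.cond(cv_matrix(X_w))
```

Comparing `cond_psi` with `cond_sigma` shows how the stability of the two matrices differs for a given sample, and comparing `cond_psi_w` with `cond_psi` shows the effect of Winsorizing; neither comparison is guaranteed to go in a particular direction for any single dataset.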
Finally, a one-factor CV model was fit to a longitudinal dataset assessing alcohol use over four years. Although each estimation method seemed to be able to reproduce the sample CV matrix with some accuracy, the model fit statistics indicated that the model should be rejected. Given the non-normal distributions of the variables in the model, the appropriate interpretation of these results is ambiguous; possible interpretations and their implications are discussed.