Estimating the Nominal Response Model Under Nonnormal Conditions

The nominal response model (NRM), a much understudied polytomous item response theory (IRT) model, provides researchers the unique opportunity to evaluate within-item category distinctions. Polytomous IRT models, such as the NRM, are frequently applied to psychological assessments representing constructs that are unlikely to be normally distributed in the population. Unfortunately, estimation software using the MML/EM algorithm frequently employs a set of normal quadrature points, effectively ignoring the true shape of the latent trait distribution. To address this problem, the current research implements an alternative estimation approach, Ramsay Curve Item Response Theory (RC-IRT), to provide more accurate item parameter estimates under the NRM for ordered polytomous items with normal, skewed, and bimodal latent trait distributions. Based on the improved item parameter recovery under RC-IRT, it is recommended that RC-IRT estimation be implemented whenever a researcher suspects that the construct being measured may be nonnormally distributed.

assumes that (a) the examinees are independent; (b) item responses are independent, conditional on the latent trait, θ; and (c) the probability distribution of the population of examinees must be specified prior to estimation of the item parameters (Bock & Aitkin, 1981; Bock & Lieberman, 1970). For this third assumption, any shape of the latent trait distribution can be specified; however, current computer software overwhelmingly implements a normally distributed set of quadrature points, thus creating a normal latent trait distribution. Unfortunately, the true latent trait distribution in the population is unobservable, and frequently studied psychological constructs such as depression and pain may be particularly unlikely to have a normal latent trait distribution.
Furthermore, many psychological constructs fall along a continuum, which is arguably better assessed by Likert-type response formats than by dichotomous scoring, with response options such as strongly disagree to strongly agree, not at all like me to very much like me, or never occurs to always occurs. Polytomous IRT models allow estimation of multipoint Likert-type response formats and allow scale constructors to conduct a more informed analysis of polytomous items. Additionally, scale constructors have the opportunity to evaluate the functioning of each within-item category distinction by using a largely understudied polytomous IRT model, the nominal response model (NRM; Bock, 1972, 1997). Fortunately, recent developments have been made in IRT estimation procedures, such as Ramsay curve item response theory (RC-IRT; Woods & Thissen, 2006), promising accurate item parameter recovery. This estimation methodology is implemented in RCLOG (Woods & Thissen, 2006) and EQSIRT (Multivariate Software, 2010); however, only EQSIRT allows RC-IRT estimation under the NRM. Further advancements in user-accessible IRT software, such as EQSIRT, permit easy and accurate IRT model estimation of nonnormal data. However, it remains undetermined whether analyzing nonnormal data under the NRM with EQSIRT using RC-IRT estimation actually produces more accurate item parameter estimates. It is, therefore, the purpose of this study to evaluate the recovery of category boundary discrimination (CBD) parameters using RC-IRT as estimated using EQSIRT for ordered categorical data under the NRM.
De Ayala and Sava-Bolesta (1999) manipulated a number of factors to investigate the relationship between the ratio of the sample size to the number of parameters to be estimated, latent trait distribution, maximum item information, and item parameter estimation with the NRM. Results of their simulations showed that manipulating the information and the sample size to parameter ratio had similar, but inverse, effects on both the CBD and intersection parameter recovery. Specifically, bias in item parameters increased as the maximum item information increased and as the sample size to parameter ratio decreased. Manipulating the distribution of the latent trait (i.e., skew and kurtosis) also affected the estimation accuracy of CBD parameters. They recommended increasing the sample size as one means of increasing the number of examinees responding to an unattractive option as determined by the degree of nonnormality, and therefore increasing the accuracy of the item parameter estimates.
DeMars (2003) expanded on the research conducted by de Ayala and Sava-Bolesta (1999) by evaluating item parameter recovery in the NRM under variations in the ratio of total number of item parameters to sample size as determined by test length, number of response options, and sample size. DeMars also manipulated the shape of the latent trait distribution and average within-item category boundary discrimination. Error variance of the item parameter estimates increased with increases in the number of response options and average within-item category discrimination. Additionally, biases in item parameters were identified when the shape of the latent trait distribution was skewed. DeMars (2003) suggested that the ratio of sample size to the number of response categories, as opposed to the ratio of sample size to total number of item parameters, is the greatest determinant of item parameter estimation accuracy.

Nominal Response Model
The NRM has the unique ability to evaluate the functioning of each within-item category distinction, termed CBD parameters. The size of a CBD indicates the amount of relative information provided by adjacent response categories (e.g., the degree to which a response in Category 3 vs. Category 2 differentiates among people on the latent trait). Also estimated under the NRM are intersection parameters, which identify the point on the latent trait where an individual is equally likely to respond in either of two adjacent categories. The NRM is considered the most general divide-by-total polytomous model, because this model allows for the most flexibility in parameter estimation, and all other divide-by-total models are constrained versions of the NRM (de Ayala, 2009; Ostini & Nering, 2006; Thissen & Steinberg, 1986). Preston and Reise (2013) illustrate several potentially useful applications of the NRM including exploring whether (a) CBD parameters vary within an item, (b) an item contains too many response options, and (c) response options are ordered.
The NRM can be used to model completely nominal items, multiple choice items, partially ordered items, and completely ordered items. In the NRM, the conditional probability of an individual with trait level θ responding in category x (x = 0, . . . , $m_i$) on item i can be written as

$$P(X_i = x \mid \theta) = \frac{\exp(a_{ix}\theta + c_{ix})}{\sum_{z=0}^{m_i} \exp(a_{iz}\theta + c_{iz})} \tag{1}$$

where, for identification, $\sum a_{ix} = \sum c_{ix} = 0$ (or, in some cases, that the parameters for the lowest response category $a_{i1} = c_{i1} = 0$). This constraint forces one response option to have a monotonically increasing category response curve (CRC; the one with the most positive a), and one response option to have a monotonically decreasing CRC (the one with the lowest a).
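As an illustration of the divide-by-total form, the following minimal sketch computes category probabilities for a single item, assuming the usual NRM expression exp(a_x θ + c_x) / Σ_z exp(a_z θ + c_z). The function name `nrm_probs` and the item parameter values are hypothetical and chosen only so that the slopes and intercepts each sum to zero.

```python
import math

def nrm_probs(theta, a, c):
    """Category probabilities for one item under the divide-by-total NRM form."""
    z = [a_x * theta + c_x for a_x, c_x in zip(a, c)]
    z_max = max(z)  # subtract the maximum before exponentiating for stability
    e = [math.exp(v - z_max) for v in z]
    total = sum(e)
    return [v / total for v in e]

# Hypothetical 4-category item (not taken from the study); sum(a) = sum(c) = 0.
a = [-2.25, -0.75, 0.75, 2.25]
c = [-0.75, 0.75, 0.75, -0.75]

probs_mid = nrm_probs(0.0, a, c)    # middle of the trait range
probs_high = nrm_probs(5.0, a, c)   # highest-slope category dominates
probs_low = nrm_probs(-5.0, a, c)   # lowest-slope category dominates
```

At extreme trait levels the category with the most positive slope (or the lowest slope) absorbs nearly all of the probability, which is the monotonic behavior described above.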
The category slope, $a_{iz}$, should not be confused with the CBD parameter, $a^*_{ix}$, which determines the amount of relative information provided by a response in category x versus responding in adjacent category x − 1. The category intercept parameters, $c_{ix}$, in Equation 1 are not intersection parameters; rather, they reflect the relative popularity of the response option. Specifically, the intersection parameters, $c^*_{j}$, identify the point on the latent trait where an individual is equally likely to respond in adjacent categories.
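The relationship between the two parameterizations can be sketched as follows. The sketch assumes the standard conversions, in which a CBD is the difference between adjacent category slopes, a_x − a_(x−1), and an intersection is the trait value at which two adjacent category probabilities are equal, (c_(x−1) − c_x) / (a_x − a_(x−1)). The function name and the item values are hypothetical.

```python
def cbd_and_intersections(a, c):
    """Convert NRM slopes and intercepts to CBDs and category intersections.

    Assumes adjacent categories have different slopes (nonzero CBDs).
    """
    cbd = [a[x] - a[x - 1] for x in range(1, len(a))]
    intersections = [(c[x - 1] - c[x]) / cbd[x - 1] for x in range(1, len(a))]
    return cbd, intersections

# Hypothetical item whose CBDs all equal 1.5, with intersections at -1, 0, and 1.
a = [-2.25, -0.75, 0.75, 2.25]
c = [-0.75, 0.75, 0.75, -0.75]
cbd, intersections = cbd_and_intersections(a, c)
```

Because the intersections are computed from the CBDs, any misestimation of a CBD carries over to the corresponding intersection.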

Ramsay Curve IRT
RC-IRT (Woods & Thissen, 2006) is a new method for fitting IRT models for which the latent trait distribution is assumed differentiable and strictly positive, but not necessarily normal. This approach combines MML/EM item parameter estimation (Bock & Aitkin, 1981; Bock & Lieberman, 1970) with a spline-based density approximation procedure described by Ramsay (2000). Specifically, in RC-IRT the distribution of the latent trait is estimated, instead of being fixed to the default normal distribution as in traditional MML. In the E step of the EM algorithm (Bock & Aitkin, 1981), the total number of people at each quadrature point is estimated using the current characterization of the latent trait distribution. The normal distribution is used to start. In the M step, after the likelihoods for each item are maximized, the likelihood of the current characterization of the latent trait distribution, approximated as a Ramsay curve using basis spline (B-spline) functions (de Boor, 2001), is maximized. The shape of the Ramsay curve is determined by a combination of B-splines through the specification of the order of each B-spline polynomial function joined together at a specified number of knots. As implemented in EQSIRT, the order of the polynomial, termed degree, and the number of knots must be chosen by the computer user, and they determine the number of B-splines, m, used during latent trait estimation: m = d + k − 1, where d is the degree of the polynomial function producing each spline (e.g., d = 2 produces splines with one bend; d = 3 produces splines with two bends; and d = 4 produces splines with three bends), k is the number of knots, and a knot is the point on the latent trait where separate B-splines are joined together. Woods and Thissen (2006) determined that larger order and/or more than two knots produce increasingly nonnormal Ramsay curves.
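The E-step computation described above can be sketched as follows. This is a simplified illustration rather than the EQSIRT implementation: the quadrature grid, the item parameters, the response patterns, and all function names are hypothetical, and the starting weights are a normal density normalized over the grid. In RC-IRT, these weights would be replaced by the current Ramsay-curve estimate on later cycles.

```python
import math

def nrm_probs(theta, a, c):
    """Category probabilities for one item under the NRM."""
    z = [a_x * theta + c_x for a_x, c_x in zip(a, c)]
    z_max = max(z)
    e = [math.exp(v - z_max) for v in z]
    total = sum(e)
    return [v / total for v in e]

def normal_weights(points):
    """Starting characterization: normal density normalized over the grid."""
    dens = [math.exp(-0.5 * q * q) for q in points]
    total = sum(dens)
    return [d / total for d in dens]

def expected_counts(patterns, items, points, weights):
    """Expected number of people at each quadrature point (E step)."""
    counts = [0.0] * len(points)
    for pattern in patterns:
        joint = []
        for q, w in zip(points, weights):
            like = w  # prior weight times the likelihood of the pattern at q
            for (a, c), x in zip(items, pattern):
                like *= nrm_probs(q, a, c)[x]
            joint.append(like)
        total = sum(joint)
        for k, value in enumerate(joint):
            counts[k] += value / total  # posterior weight of this person at point k
    return counts

points = [-4.0 + 0.5 * k for k in range(17)]  # hypothetical grid from -4 to 4
weights = normal_weights(points)
items = [([-2.25, -0.75, 0.75, 2.25], [-0.75, 0.75, 0.75, -0.75])] * 3
patterns = [(0, 0, 1), (1, 2, 1), (3, 3, 2), (3, 3, 3)]
counts = expected_counts(patterns, items, points, weights)
```

The expected counts sum to the number of respondents, and they are the quantities to which the latent density is refit in the M step.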

Demonstration
Here, a brief example is provided to demonstrate the severity of NRM parameter estimation bias caused by nonnormality when data are analyzed with MML implementing a normal prior distribution, and the improvement in item parameter recovery when RC-IRT is implemented. For this demonstration, responses to ten 4-category items were simulated for 1,000 simulees. The CBD parameters were simulated as highly discriminating, 1.5, or moderately discriminating, 0.75, and were constrained to equality within an item. The intersection parameters were simulated as symmetric and fixed to −1.0, 0, and 1.0 across conditions. Figure 1 displays the true shape of the latent trait distribution that was simulated as normal, skewed (skew = 1.75, kurtosis = 6.75), and bimodal (described in detail below). Each of the six conditions was replicated 1,000 times. As seen in Table 1, the CBD parameters, highly discriminating and moderately discriminating, are recovered with excellent accuracy when the true shape of the latent trait distribution is normal. However, as the shape of the distribution deviates further from normality, the CBD parameter estimates become increasingly downwardly biased. This bias is more pronounced in items with more highly discriminating CBD parameters. The estimated highly discriminating CBD parameters under the normal, skew, and bimodal conditions are also depicted graphically in Figure 2A, B, and C, respectively. Each plot in Figure 2 displays the 95% confidence interval around the CBD estimates for each estimated CBD parameter per item. Specifically, three CBD parameters were estimated for each 4-category item, so each item is represented by three tick-marks along the x-axis. A horizontal line is drawn across each plot to represent the true CBD parameter.
Within each plot, the sampling distribution of the estimated CBD parameters for each item is displayed as a vertical 95% confidence band with the average estimated CBD parameter for the item plotted as a dot along the confidence bar. Visual inspection of this plot makes it obvious that nonnormality greatly affects item parameter recovery under the NRM. In Figure 2A, the average estimated CBD parameters are plotted very close to the true CBD parameter and the sampling distribution is quite narrow. However, as the distribution of the latent trait increases in skew and kurtosis, the sampling distribution of the recovered CBD parameters becomes more variable and the average estimated CBD parameters drift further from the simulated value. Specifically, a pattern manifested under both the skew and bimodal distribution conditions such that the first CBD parameter for each item is underestimated, the second CBD parameter is fairly accurately estimated, and the third CBD parameter is consistently overestimated. Unfortunately, since the CBD parameters were constrained to equality within the item, when the estimates were averaged, the parameters appeared as though they were accurately estimated. This pattern, which was more pronounced in the skewed distribution condition, is expected considering that when the distribution of the latent trait is incorrect, the expectation of the probability distribution for each of these categories is wrong.
In an effort to estimate CBD parameters more accurately, the high skew condition with highly discriminating CBD parameters was estimated under the NRM again, but using the RC-IRT estimation algorithm with the EQSIRT program defaults, order of 3 and 4 knots. The recovered CBD parameters and 95% confidence band, displayed in Figure 2D, are more accurately estimated, with an average estimated CBD parameter of 1.373 (.209), an average bias of the CBD parameters of only −.127, and RMSE = .273. Using RC-IRT to account for and model the nonnormality of the distribution provides a marked improvement over imposing a normal prior distribution with MML estimation. Specifically, the overestimation of the third CBD parameter is greatly reduced, but the first CBD parameter is still considerably underestimated. The results of this demonstration support the importance of evaluating the recovery of CBD parameters using RC-IRT as estimated using EQSIRT for ordered categorical data under the NRM.

Method
Design
Manipulated variables were determined by prior research suggesting factors that influence item parameter recovery, producing 144 simulation conditions. The manipulated variables were (a) within-item CBD variation; (b) sample size; (c) distribution of the latent trait, θ; (d) order of the polynomial; and (e) number of knots. All conditions were simulated with 10 items containing 4 response options each, and CBD parameters were manipulated consistently across all conditions. The distances between intersection parameters were manipulated according to the distribution of θ, as described below. The conditions with a normally distributed latent trait, θ, estimated with order of 2 and 2 knots (Woods, 2006) are considered the null conditions because this specification is equivalent to assuming a normal distribution; all results are compared with those conditions.
Factors Influencing Item Parameter Recovery
CBD parameters. Adapting Preston and Reise (2013), the average size of the CBD parameter was set to 1.00, indicating moderate discrimination. To create items with variation in the CBD parameters, values were drawn randomly from a uniform distribution. The minimum and maximum values of the distribution originated at 1.00, a moderate discrimination, which constrained the within-item CBD parameters to equality, creating data under the generalized partial credit model. The maximum value of the distribution was then increased by .25 and the minimum value decreased by .25, creating variation in the within-item CBD parameters that increased in increments of .5 for each condition. The range of the distribution increased by increments of .5 until the minimum value of the distribution reached .25.
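The crossing of the five manipulated factors, and the way the uniform CBD range widens across conditions, can be sketched as follows. The factor levels are those reported in this section; the variable names, the helper `draw_cbds`, and the random seed are hypothetical and are not the authors' simulation code.

```python
from itertools import product
import random

# Uniform ranges for the CBD draws: 1.00-1.00, 0.75-1.25, 0.50-1.50, 0.25-1.75.
cbd_ranges = [(1.00 - 0.25 * k, 1.00 + 0.25 * k) for k in range(4)]
sample_sizes = [500, 2000]
distributions = ["normal", "skewed", "bimodal"]
orders = [2, 3]
knots = [2, 4, 10]

# Fully crossed design: 4 x 2 x 3 x 2 x 3 = 144 conditions.
conditions = list(product(cbd_ranges, sample_sizes, distributions, orders, knots))

def draw_cbds(low, high, n_items=10, n_cbd=3, rng=None):
    """Draw within-item CBDs from a uniform distribution (hypothetical helper)."""
    rng = rng or random.Random(0)
    return [[rng.uniform(low, high) for _ in range(n_cbd)] for _ in range(n_items)]

equal_cbds = draw_cbds(*cbd_ranges[0])   # every CBD equals 1.0
varied_cbds = draw_cbds(*cbd_ranges[3])  # CBDs spread over 0.25 to 1.75
```

With the degenerate range (1.00 to 1.00), every draw equals 1.0, which corresponds to the equality-constrained condition described above.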
Intersection parameters. DeMars (2003) showed the distance between the intersection parameters influences item parameter recovery under the NRM. Taking into consideration how the shape of the latent trait distribution partly determines the response distribution, the category intersection parameters were generated based on the shape of the latent trait. Specifically, under the normal and bimodal distributions intersections were determined by the 20th, 50th, and 70th percentiles of the simulated distribution. However, it is reasonable to assume that, under a skewed distribution, the responses would follow the nonnormality of the latent trait distribution. Therefore, for the skewed distribution conditions, the intersections were determined by the 50th, 80th, and 95th percentiles of the simulated distribution.
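Placing intersections at percentiles of the simulated trait distribution can be sketched as follows. The percentile values are those stated above; the linear-interpolation percentile function, the seed, and the sample of 100,000 standard normal draws are assumptions made for illustration.

```python
import random

def percentile(sorted_values, p):
    """Percentile (0-100) of pre-sorted data using linear interpolation."""
    pos = (len(sorted_values) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

rng = random.Random(1)
theta = sorted(rng.gauss(0.0, 1.0) for _ in range(100_000))

# Normal (and bimodal) conditions: 20th, 50th, and 70th percentiles.
normal_intersections = [percentile(theta, p) for p in (20, 50, 70)]

# Skewed conditions would apply (50, 80, 95) to draws from the skewed distribution.
```

For a standard normal trait, these percentiles fall at roughly −0.84, 0.00, and 0.52 on the latent trait scale.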
Sample size. De Ayala and Sava-Bolesta (1999) and DeMars (2003) consistently found that the sample size to response option ratio is an important factor in the accuracy of item parameter recovery in the nominal response model. Therefore, two sample size conditions, N = 500 and N = 2,000, were considered, because sample size is well known to affect the accuracy, but not necessarily the bias, of item parameter estimates (de Ayala & Sava-Bolesta, 1999; DeMars, 2003). The smaller sample size (N = 500) represents the general number of participants frequently included in a psychological study, and the larger sample size (N = 2,000) is representative of larger sets of individuals responding to educational tests.
Latent trait distribution. Three latent trait distributions were considered because the focus of the present research is to determine the degree to which the latent trait distribution affects item parameter recovery in the NRM (see Figure 1). The shape of the normal and skewed distributions was determined by Fleishman's (1978) power method weights. In this article, Fleishman presents a table of power method weights (a, b, c, and d) that correspond to a particular skew and kurtosis of the desired distribution to be used in simulating nonnormal distributions. These values are applied to the polynomial transformation equation $Y = a + bX + cX^2 + dX^3$. Figure 1A and B displays the shape of the true latent trait distribution, θ, as standard normal (skew = 0.0, kurtosis = 3.0, a = 0.0, b = 1.0, c = 0.0, d = 0.0), and skewed (skew = 1.75, kurtosis = 6.75, a = −0.39949667453766, b = 0.92966052480111, c = 0.39949667453766, d = −0.03646699281275). The highly skewed condition uses coefficients that are within the range of the latent distribution estimated by Woods and Thissen (2006) for dichotomous items related to panic disorder with skew of 1.04 and kurtosis of 6.53. Fleishman's (1978) power method weights do not address bimodal distributions, so the bimodal distribution was determined by a mixture of normal distributions (where μ = mean, σ = standard deviation, and mp = mixing proportion). Figure 1C displays the shape of the true latent trait distribution, θ, as bimodal ($\mu_1$ = −1.5, $\mu_2$ = 3.0, $\sigma_1$ = 0.7, $\sigma_2$ = 1.5, $mp_1$ = 1.0, $mp_2$ = 0.7). The bimodal latent trait distribution represents a scenario typically seen in health outcomes research (e.g., Hanger, Fogarty, Wilkinson, & Sainsbury, 2000), where there is a large group of individuals who are considered "normal" and score low, and there is a small group of "patients" who score high on the latent trait but are normally distributed within that group.
The distribution was standardized (μ = 0.0, σ = 1.0) prior to data simulation per direction from Sam He (personal communication, November 17, 2010).
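The power method transformation for the skewed condition can be sketched as follows, using the coefficients reported above. The function name, the seed, and the number of draws are hypothetical; the sketch only checks that standard normal draws passed through the polynomial have approximately zero mean, unit variance, and strong positive skew.

```python
import random
import statistics

# Fleishman (1978) power method weights for the skewed condition reported above.
A = -0.39949667453766
B = 0.92966052480111
C = 0.39949667453766
D = -0.03646699281275

def fleishman(x, a=A, b=B, c=C, d=D):
    """Apply Y = a + bX + cX^2 + dX^3 to a standard normal draw X."""
    return a + b * x + c * x ** 2 + d * x ** 3

rng = random.Random(2010)
skewed = [fleishman(rng.gauss(0.0, 1.0)) for _ in range(200_000)]

mean = statistics.fmean(skewed)
sd = statistics.pstdev(skewed)
# Sample skewness: the average cubed standardized deviation.
skew = sum(((y - mean) / sd) ** 3 for y in skewed) / len(skewed)
```

Because a = −c, the transformed variable has mean zero, and the weights are scaled so that its variance is one, which is why no further standardization is needed for this condition in the sketch.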
Order of the polynomial and knots. Since the implementation of the NRM with EQSIRT using RC-IRT has never been evaluated, this research manipulates predetermined combinations of polynomial order and knots to evaluate the consistency and accuracy of the CBD parameter estimates. In EQSIRT, RC-IRT is adapted as a type of MML/EM parameter estimation method. This implementation borrows the idea of de Boor's B-spline curve fitting algorithm, which requires two pieces of information to normalize the curve to the population density: (a) order of the polynomial and (b) number of knots. The degree, as termed in EQSIRT, is the order of the polynomial of the B-spline curve that is computed during the parameter estimation. The knots are the joints of the B-spline curve. These joints are presented in the form of parameters that are estimated simultaneously during parameter estimation. As implemented in EQSIRT, the user may choose either order 2 or 3 for the order of the polynomial and between 2 and 10 knots. Because the order of the polynomial and knots are related to the shape of the latent trait distribution, which influences the recovery of item parameters, data were estimated with order 2 and 3, exhausting all EQSIRT implemented options. Data were further estimated with 2 knots, the minimum option in EQSIRT; 4 knots, the default option in EQSIRT; or 10 knots, the maximum option in EQSIRT.

Data Generation
For each condition, 1,000 data sets were generated. All data sets were generated under the NRM and item parameters were estimated under the NRM via RC-IRT as implemented in EQSIRT (Multivariate Software, 2010).

Model Fitting
The newly developed EQSIRT (Multivariate Software, 2010) was used for all conditions estimated under RC-IRT. Each data set was fitted with the NRM using mostly the program defaults, except the maximum number of EM cycles was set to 999. The output of EQSIRT contains a and c parameters as calculated in Equation 1; therefore, the a and c parameters were converted into the CBD parameters and category intersections using the formulas described above. As mentioned previously, estimation of RC-IRT in EQSIRT requires the user to specify the order of the polynomial and the number of knots or joints for the B-spline curve. The defaults are order = 3 (range = 2-3) and knots = 4 (range = 2-10).

Outcome Measure
This study focuses exclusively on the evaluation of CBD parameter recovery, which is considered the more useful parameter. Additionally, the intersection parameters are a function of the CBD parameters, so the recovery of the intersections was not explicitly evaluated. CBD parameter recovery was evaluated in several ways (de Ayala & Sava-Bolesta, 1999; Woods, 2006): (a) mean and standard deviation of the recovered parameters, (b) average absolute bias, and (c) a computed index of the absolute difference between the true and estimated test characteristic curves (TCCs). First, each item's characteristic curve (ICC) was computed based on the CRC and ICC relationships discussed in Embretson and Reise (2000): the CRCs computed from Equation 1 for the four categories of an item are multiplied by 0, 1, 2, and 3, respectively, and summed, creating an ICC. Finally, the ICCs within a test were summed to create a TCC for each test. The expected value of the true and estimated TCC was evaluated at 60 points evenly spaced between −3 and 3. The difference index was computed by averaging over the sum of the absolute differences between the true and estimated expected values at each of the 60 points.
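The TCC difference index can be sketched as follows. This sketch reads the index as the mean absolute difference between the true and estimated TCCs over the 60 grid points, which is one interpretation of the description above; the function names and both sets of item parameters are hypothetical.

```python
import math

def nrm_probs(theta, a, c):
    """Category response curves for one item at a given trait level."""
    z = [a_x * theta + c_x for a_x, c_x in zip(a, c)]
    z_max = max(z)
    e = [math.exp(v - z_max) for v in z]
    total = sum(e)
    return [v / total for v in e]

def tcc(theta, items):
    """Sum over items of the expected item score (categories scored 0, 1, 2, ...)."""
    return sum(
        sum(x * p for x, p in enumerate(nrm_probs(theta, a, c)))
        for a, c in items
    )

def tcc_difference(true_items, est_items, n_points=60, low=-3.0, high=3.0):
    """Mean absolute TCC difference over an evenly spaced grid (assumed reading)."""
    step = (high - low) / (n_points - 1)
    grid = [low + k * step for k in range(n_points)]
    diffs = [abs(tcc(t, true_items) - tcc(t, est_items)) for t in grid]
    return sum(diffs) / n_points

# Hypothetical 10-item test: true CBDs of 1.5 versus attenuated estimates of 1.2,
# with intersections held at -1, 0, and 1 in both parameter sets.
true_items = [([-2.25, -0.75, 0.75, 2.25], [-0.75, 0.75, 0.75, -0.75])] * 10
est_items = [([-1.8, -0.6, 0.6, 1.8], [-0.6, 0.6, 0.6, -0.6])] * 10
index = tcc_difference(true_items, est_items)
```

An index of zero means the two curves coincide at every grid point; attenuated slope estimates flatten the estimated TCC and yield a positive index.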

Results
The results of the simulation study are presented in Tables 2 through 6, where each table is devoted to presenting results for all conditions as summarized by one of the outcome measures. The first three columns of each of these tables list order of the polynomial, number of knots, and range of CBD, respectively. The remaining columns list the outcome measure under the normal distribution, skewed distribution, and bimodal distribution for each sample size condition (500, 2,000), respectively. Each outcome measure is discussed separately. The first four rows of each table provide results for conditions estimated using order 2 and 2 knots, specifying a normal distribution under RC-IRT estimation, which produces results equivalent to those obtained without implementing RC-IRT estimation. Therefore, under the normal distribution conditions, these first four rows provide information about the accuracy of item parameter recovery when the estimated latent trait distribution matches the true latent trait distribution, the null condition, producing ideal outcome measures. The remaining rows of each table under the normal distribution demonstrate the consequence to CBD parameter estimation when using RC-IRT estimation unnecessarily. However, under the skewed and bimodal distributions, these first four rows produce results when the shape of the latent trait distribution is assumed normal, demonstrating the effect of ignoring nonnormality on the outcome measures. The remaining rows of each table under the skewed and bimodal latent trait distributions demonstrate the improvement in CBD parameter and θ score recovery when the shape of the latent trait distribution is estimated using RC-IRT.
Table 2 contains the means and standard deviations of the recovered CBD parameters. As expected, the standard deviations of the CBD parameters increased as the range of the CBD parameters increased.
Specifically, when CBD parameters were all constrained to 1.0, producing the generalized partial credit model (Muraki, 1992), the standard deviation averaged over all distribution, sample size, order of the polynomial, and number of knots conditions was 0.170, which increased as the CBD parameter variation increased to SD = 0.231 with CBD parameters ranging from 0.75 to 1.25, to SD = 0.360 with CBD parameters ranging from 0.50 to 1.50, and to SD = 0.539 with CBD parameters ranging from 0.25 to 1.75.
Table 3 displays the proportion of converged replications for each condition, which was generally excellent. The proportion of converged replications was high over all conditions, with 0.90 being the minimum proportion, occurring under the skewed distribution estimated using order of 2 and 10 knots with CBD parameters varying from 0.50 to 1.50. Overall, the conditions estimated with 10 knots consistently resulted in the lowest proportion converged, which was magnified under the small sample size condition (N = 500).

CBD Parameter Recovery
Absolute bias. Table 4 displays the absolute bias in CBD parameter recovery for each condition, with values near zero indicating no bias. As expected, parameters are less biased under the larger sample size condition (mean bias = −0.001) than the small sample size condition (mean bias = 0.039). The CBD parameters were recovered accurately under the normal distribution, the null condition, with order of 2 and 2 knots with an average bias of 0.013; however, the bias is reduced even further to 0.006 when estimated with order of 3 and 2 knots. Ignoring nonnormality under the bimodal distribution produces considerable positive bias in the CBD estimates (mean bias = 0.064), which is even more pronounced under the skewed distribution (mean bias = 0.114). Focusing on the skewed distribution, the average bias in CBD parameter estimates is greatly reduced to 0.007 when estimated using order of 3 and 4 knots, and even further reduced to 0.006 with order of 2 and 4 knots averaging over sample size and CBD variation. RC-IRT estimation using order of 2 and 3 with 10 knots reduced the bias to −0.042 and −0.041, respectively, but the parameter estimates became downwardly biased. The bias in the CBD parameter estimates is smallest for the bimodal distribution condition when estimated using order of 2 and 4 knots. RC-IRT estimation using order of 2 and 3 with 10 knots reduced the bias when compared with ignoring the shape of the distribution to −0.042 and −0.041, but as with the normal distribution, the parameter estimates became downwardly biased. Across nonnormal distribution conditions, RC-IRT estimation with order of 2 or 3 and 4 knots produced the most accurate CBD parameter estimates in terms of absolute bias.
Overall, estimation with either order of 2 or 3 and 10 knots produced CBD parameter estimates that were slightly downwardly biased (overall mean bias = −0.018), indicating that estimating distributions with 10 knots may overcharacterize the distribution, focusing on unimportant nuances, when the distributions are a more conventional, yet nonnormal, shape. The size of the absolute bias did not change as a function of CBD range.
In evaluating the absolute bias values averaged over all CBD values, some misestimation may be missed for the most extreme CBD values. Therefore, to further examine potential misestimation in extreme CBD values, the true CBD parameters and the corresponding estimates from the condition with the most widely varying CBD values were grouped into low, moderate, and high CBD ranges. Table 5 presents these low, moderate, and high absolute bias values. Overall, the estimation of the CBD parameters is best for the low range at 0.004, and worsens as the category becomes more discriminating to 0.016 for moderate, and 0.050 for high CBD parameters. These findings are consistent with those from the demonstration study, which compared moderate CBD values of 0.75 to high CBD values of 1.5, and found inaccuracies in estimation were more pronounced for the high CBD values. As mentioned above, RC-IRT estimation using order of 2 and 4 knots produced the most accurate estimates as measured by absolute bias for the skewed distribution. The optimal RC-IRT combination of polynomial order and knots appears to be mainly determined by the accuracy of the estimation of the high CBD values because the corresponding absolute bias value was at the minimum of 0.077 for the small sample size condition and −0.022 for the large sample size condition. These findings suggest that the more highly discriminating category distinctions are more important in determining the optimal RC-IRT polynomial order and knots combination for estimation.
TCC difference index. Table 6 presents the absolute difference between the true and estimated TCCs. For this difference index, values close to zero indicate the estimated and true TCCs overlap entirely. Overall, CBD variation did not affect the magnitude of the difference index. However, the differences in sample size condition were quite pronounced, with the small sample size condition producing an average difference index value of .0419 and the large sample size condition producing a value of 0.289.
Figure 3A, B, and C show the true TCC for the normal, skewed, and bimodal distribution conditions, respectively, and the TCC estimated under RC-IRT using order of 2 and 2 knots, effectively ignoring the true shape of the latent trait distribution. Figure 3B and C also display the TCC estimated under RC-IRT using the optimal order and knots as determined by the TCC difference index, which is order of 3 and 2 knots for Figure 3B and order of 3 and 10 knots for Figure 3C. The solid line represents the true TCC, the dashed line represents the estimated TCC under order of 2 and 2 knots, and the dotted line represents the estimated TCC under the optimal order and knots. As can be seen in Figure 3A, the true and estimated TCCs overlap almost entirely, which indicates excellent parameter recovery at the test level. Consequently, under the normal distribution condition, the TCC difference index was at the minimum value for all conditions of 0.221 averaged over sample size conditions, indicating that true item parameters and estimated item parameters did not differ in difficulty when estimated using a normal distribution. The TCC difference index remained consistently small, but inflated slightly when estimated using any other combination of order and knots, indicating that overparameterizing the distribution worsens the accuracy of the model at the test level.
As mentioned above, Figure 3B displays the true, estimated ignoring nonnormality, and optimal RC-IRT recovered TCCs under the skewed distribution. As can be seen, the TCC was underestimated at the negative extreme end of θ and was overestimated at the positive extreme end of θ. Accordingly, the TCC difference index was extremely large at 0.990, averaged over sample size and CBD variation conditions. Using RC-IRT with any combination of order and knots to account for the nonnormality of the distribution improved the accuracy of the estimated TCC, but RC-IRT estimation with order of 3 and 2 knots reduced the TCC difference index the most to 0.309. Visual inspection of Figure 3B shows that the true TCC and the TCC estimated under RC-IRT using order of 3 and 2 knots are nearly indistinguishable. Figure 3C displays the true, estimated ignoring nonnormality, and optimal RC-IRT recovered TCCs under the bimodal distribution. Similar to the skewed distribution condition, the TCC was underestimated at the negative extreme end of θ and was overestimated at the positive extreme end of θ. However, in the bimodal distribution condition, the over- and underestimation was noticeably smaller, resulting in moderate TCC difference index values of 0.455. As in the skewed distribution condition, estimation with any combination of order and knots under RC-IRT reduced the TCC difference index, but estimation with order of 2 and 4 knots produced the most accurate TCCs with a TCC difference index value of 0.241.

Figure 3. True, estimated, and recovered test characteristic curves (TCCs) averaged over sample size and CBD variation conditions under the (A) normal distribution condition, (B) skewed distribution condition, and (C) bimodal distribution condition. True TCC is represented by the solid line, the estimated TCC using order of 2 and 2 knots is represented by the dashed line, and the optimum polynomial order and knot combination recovered TCC is represented by the dotted line.
Figures 4A through 4F display the true versus recovered TCCs as estimated under RC-IRT using all combinations of order and knots for the skewed distribution condition. The true TCC is represented by the solid line and the recovered TCC is represented by the dashed line. Comparing Figure 4A with Figures 4B through 4F illustrates that merely estimating under RC-IRT is more critical to accurate item parameters than determining the appropriate combination of order and knots. As can be seen in Figure 4A, estimating a test while ignoring the nonnormality results in under- and overestimation of the item parameters, especially at the extremes of the trait level. In Figures 4B through 4F, the TCCs overlap almost entirely, regardless of the combination of order and knots, showing how important it is to use RC-IRT estimation when the measured construct is nonnormally distributed in the population. However, careful visual inspection of Figures 4E and 4F supports the general finding that estimation using 10 knots decreases the accuracy of the item parameters slightly, although the estimates are still considerably more accurate than those obtained by ignoring the nonnormality in the distribution.
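To make the comparison of true and recovered TCCs concrete, the following is a minimal Python sketch of how a TCC can be computed under the NRM for ordered items and how two TCCs can be compared on a grid of trait values. The item parameters are hypothetical, and the mean absolute gap used here is only an illustrative stand-in; it is not claimed to be the TCC difference index used in this study, whose exact definition is not restated in this section.

```python
import numpy as np

def nrm_probs(theta, a, c):
    """NRM category probabilities for one item.

    theta: (n,) trait values; a, c: (K,) category slopes and intercepts.
    Returns an (n, K) array whose rows sum to 1.
    """
    z = np.outer(theta, a) + c
    z -= z.max(axis=1, keepdims=True)  # stabilize the softmax numerically
    ez = np.exp(z)
    return ez / ez.sum(axis=1, keepdims=True)

def tcc(theta, slopes, intercepts):
    """Expected total score at each theta, scoring categories 0..K-1."""
    total = np.zeros(len(theta), dtype=float)
    for a, c in zip(slopes, intercepts):
        p = nrm_probs(theta, np.asarray(a, dtype=float), np.asarray(c, dtype=float))
        total += p @ np.arange(p.shape[1])
    return total

def tcc_gap(theta, items_a, items_b):
    """Mean absolute gap between two TCCs (illustrative measure only)."""
    return float(np.mean(np.abs(tcc(theta, *items_a) - tcc(theta, *items_b))))

# Hypothetical two-item test with four ordered categories per item.
theta = np.linspace(-4.0, 4.0, 81)
true_slopes = np.array([[0.0, 1.0, 2.0, 3.0], [0.0, 0.8, 1.6, 2.4]])
true_intercepts = np.array([[0.0, 0.5, 0.5, 0.0], [0.0, 0.3, 0.2, -0.4]])
true_items = (true_slopes, true_intercepts)

# Hypothetical "estimates" with upwardly biased slopes.
est_items = (1.2 * true_slopes, true_intercepts)

print(tcc_gap(theta, true_items, est_items))
```

In this sketch, inflating the slopes steepens the estimated TCC around the center of the trait range, so the two curves separate away from the point where they cross; a gap of zero would indicate identical curves on the chosen grid.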

Discussion
Polytomous IRT models are becoming increasingly popular as a method of constructing and evaluating educational and psychological measurement instruments. With this increasing popularity, polytomous IRT models are being used to study psychological and educational constructs that are likely not normally distributed in the population. Moreover, these assessments are being used to make high-stakes decisions about individuals, such as a clinical diagnosis or whether a child qualifies for special education. Unfortunately, the inherent nonnormality of these constructs is frequently ignored during estimation using the MML/EM estimation algorithm, potentially leading to an incorrect diagnosis or incorrectly placing a child in a special education classroom.
This research provided a comprehensive demonstration of the effects of ignoring nonnormality in the distribution of the latent trait under the NRM. The NRM was chosen for this demonstration because it has the unique ability to evaluate response category functioning, allowing researchers to conduct in-depth item analyses to identify items containing too many response options, unordered response options, or poorly functioning categories that do not provide useful psychometric information.
To determine the effects of nonnormality on CBD parameter estimation and latent trait distribution recovery, data were generated under normal, skewed, and bimodal distributions. These data were estimated under RC-IRT in EQSIRT using order of 2 and 2 knots, effectively ignoring the shape of the true latent trait distribution. Results showed that CBD parameters were grossly inaccurate and upwardly biased, leading to large differences between estimated and true item parameters. Generally, the consequences of ignoring the nonnormality of the latent trait distribution were greater under a skewed distribution and slightly less so under a bimodal distribution. Inaccuracies in CBD parameter estimates were magnified under a small sample size. Overall, the consequences of ignoring nonnormality for the CBD parameter estimates were severe.
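As an illustration of the three kinds of latent trait distributions described above, the following Python sketch draws standardized trait values from a normal, a skewed, and a bimodal population. The specific generating choices (a chi-square variate with 4 degrees of freedom for skew, and an equal mixture of two normals centered at -1.5 and 1.5 for bimodality) are assumptions made only for illustration and are not the generating distributions used in this study.

```python
import numpy as np

def standardize(x):
    """Rescale to mean 0 and standard deviation 1."""
    return (x - x.mean()) / x.std()

def draw_theta(shape, n, rng):
    """Draw n standardized trait values from an illustrative distribution."""
    if shape == "normal":
        x = rng.standard_normal(n)
    elif shape == "skewed":
        # Assumed choice: chi-square(4) is positively skewed.
        x = rng.chisquare(df=4, size=n)
    elif shape == "bimodal":
        # Assumed choice: equal mixture of N(-1.5, 0.6^2) and N(1.5, 0.6^2).
        centers = rng.choice([-1.5, 1.5], size=n)
        x = centers + 0.6 * rng.standard_normal(n)
    else:
        raise ValueError(f"unknown shape: {shape}")
    return standardize(x)

def skewness(x):
    """Sample skewness (third standardized moment)."""
    return float(np.mean(standardize(x) ** 3))

def kurtosis(x):
    """Sample kurtosis (fourth standardized moment; 3 for a normal)."""
    return float(np.mean(standardize(x) ** 4))

rng = np.random.default_rng(0)
samples = {s: draw_theta(s, 5000, rng) for s in ("normal", "skewed", "bimodal")}
for name, x in samples.items():
    print(name, round(skewness(x), 2), round(kurtosis(x), 2))
```

Standardizing every sample keeps the trait on a common scale across conditions, so differences in the summaries reflect shape rather than location or spread.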
Fortunately, recent developments in IRT estimation procedures, such as RC-IRT (Woods & Thissen, 2006), and user-friendly software, such as EQSIRT (Multivariate Software, 2010), improve the estimation of CBD parameters. The primary focus of this research was to conduct an extensive evaluation of item parameter recovery using EQSIRT's implementation of RC-IRT for ordered categorical data under the NRM. To study the improvement in estimation when accounting for the shape of the latent trait distribution, the normal, skewed, and bimodal distributions were estimated using a variety of combinations of polynomial order and knots. Generally, CBD parameter recovery was greatly improved when the nonnormality of the distributions was estimated using any of the studied combinations of order and knots. Specifically, using either 2 or 4 knots improved CBD recovery the most when compared with estimation using 10 knots. These results indicate that, under RC-IRT, estimating more parameters than necessary to account for the nonnormality in the latent trait distribution actually worsens the accuracy of the estimation. Under a skewed distribution, the most accurate CBD parameter estimates were generally obtained using order of 2 and 4 knots, whereas CBD parameters under the bimodal distribution were most accurately recovered when estimated using order of 3 and 4 knots. These findings are consistent with Woods (2006) in showing that knowledge about the shape of the distribution can enhance estimation precision in the recovery of the CBD parameter estimates. The current results demonstrate consistently improved CBD parameter accuracy under the NRM while using predetermined combinations of polynomial order and knots; therefore, future research will use a model selection algorithm to determine the optimal combination of polynomial order and knots for each generated dataset and to improve the external validity of the research.
Additionally, future research will evaluate the influence of complementary factors known to affect the accuracy of item parameter recovery, such as the number of items, the number of response options, and additional nonnormal latent trait distributions.
Based on these results, it is recommended that RC-IRT estimation be implemented whenever a researcher believes that the construct being measured may be nonnormally distributed in the studied population. As demonstrated, ignoring the true shape of the latent trait distribution creates gross inaccuracies in the estimation of the item parameters. Moreover, the gains in item parameter accuracy from implementing RC-IRT estimation are substantial, and there is very little cost to implementing RC-IRT estimation when the latent trait is actually normally distributed. Finally, it is recommended that, unless theory suggests otherwise, researchers use combinations of order of 2 or 3 and 2 or 4 knots in applications because these combinations resulted in the most consistently accurate estimates.