Showing all 4 results
Peer reviewed
Sean Joo; Montserrat Valdivia; Dubravka Svetina Valdivia; Leslie Rutkowski – Journal of Educational and Behavioral Statistics, 2024
Evaluating scale comparability in international large-scale assessments depends on measurement invariance (MI). The root mean square deviation (RMSD) is a standard method for establishing MI in several programs, such as the Programme for International Student Assessment and the Programme for the International Assessment of Adult Competencies.…
Descriptors: International Assessment, Monte Carlo Methods, Statistical Studies, Error of Measurement
Jinjin Huang – ProQuest LLC, 2020
Measurement invariance is crucial for an effective and valid measure of a construct. Invariance holds when the latent trait varies consistently across subgroups; in other words, the mean differences among subgroups are only due to true latent ability differences. Differential item functioning (DIF) occurs when measurement invariance is violated.…
Descriptors: Robustness (Statistics), Item Response Theory, Test Items, Item Analysis
Cai, Li; Monroe, Scott – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2014
We propose a new limited-information goodness-of-fit test statistic C₂ for ordinal IRT models. The construction of the new statistic lies formally between the M₂ statistic of Maydeu-Olivares and Joe (2006), which utilizes first- and second-order marginal probabilities, and the M₂* statistic of Cai and Hansen…
Descriptors: Item Response Theory, Models, Goodness of Fit, Probability
Morrison, Carol A.; Fitzpatrick, Steven J. – 1992
An attempt was made to determine which item response theory (IRT) equating method results in the least amount of equating error or "scale drift" when equating scores across one or more test forms. An internal anchor test design was employed with five different test forms, each consisting of 30 items, 10 in common with the base test and 5…
Descriptors: Comparative Analysis, Computer Simulation, Equated Scores, Error of Measurement