Chunyan Liu; Raja Subhiyah; Richard A. Feinberg – Applied Measurement in Education, 2024
Mixed-format tests that include both multiple-choice (MC) and constructed-response (CR) items have become widely used in many large-scale assessments. When an item response theory (IRT) model is used to score a mixed-format test, the unidimensionality assumption may be violated if the CR items measure a different construct from that measured by MC…
Descriptors: Test Format, Response Style (Tests), Multiple Choice Tests, Item Response Theory
Rutkowski, Leslie; Svetina, Dubravka – Applied Measurement in Education, 2017
In spite of the challenges inherent in making dozens of comparisons across heterogeneous populations, a relatively recent interest in scale-score equivalence for non-achievement measures in an international context has emerged. Until recently, operational procedures for establishing measurement invariance using multiple-groups analyses were…
Descriptors: International Assessment, Goodness of Fit, Statistical Analysis, Teacher Surveys
Sinharay, Sandip – Applied Measurement in Education, 2017
Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristic curves and found the "H^T" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…
Descriptors: Nonparametric Statistics, Goodness of Fit, Simulation, Comparative Analysis
Wells, Craig S.; Bolt, Daniel M. – Applied Measurement in Education, 2008
Tests of model misfit are often performed to validate the use of a particular model in item response theory. Douglas and Cohen (2001) introduced a general nonparametric approach for detecting misfit under the two-parameter logistic model. However, the statistical properties of their approach, and empirical comparisons to other methods, have not…
Descriptors: Test Length, Test Items, Monte Carlo Methods, Nonparametric Statistics