Publication Date
  In 2025: 1
  Since 2024: 1
  Since 2021 (last 5 years): 1
  Since 2016 (last 10 years): 3
  Since 2006 (last 20 years): 3
Descriptor
  Error of Measurement: 4
  Evaluators: 4
  Item Response Theory: 2
  Models: 2
  Adults: 1
  Answer Keys: 1
  Bias: 1
  Cutting Scores: 1
  Data Analysis: 1
  Decision Making: 1
  Difficulty Level: 1
Source
  Journal of Educational Measurement: 4
Author
  Boyer, Michelle: 1
  Clauser, Brian E.: 1
  Clauser, Jerome C.: 1
  Kane, Michael: 1
  Kim, Stella Y.: 1
  Norcini, John J.: 1
  Sebok-Syer, Stefanie S.: 1
  Westine, Carl: 1
  Wind, Stefanie A.: 1
  Wu, Tong: 1
Publication Type
  Journal Articles: 4
  Reports - Research: 3
  Reports - Evaluative: 1
Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025
While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…
Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity
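(The equating approach the article proposes is cut off above; purely as background, and not a description of the authors' method, classical linear observed-score equating places a score x on form X onto the scale of form Y as
y(x) = \mu_Y + \frac{\sigma_Y}{\sigma_X}\,(x - \mu_X),
and rater-mediated assessments complicate this because rater error inflates the observed-score variances \sigma_X^2 and \sigma_Y^2 on which the transformation rests.)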
Wind, Stefanie A.; Sebok-Syer, Stefanie S. – Journal of Educational Measurement, 2019
When practitioners use modern measurement models to evaluate rating quality, they commonly examine rater fit statistics that summarize how well each rater's ratings fit the expectations of the measurement model. Essentially, this approach involves examining the unexpected ratings that each misfitting rater assigned (i.e., carrying out analyses of…
Descriptors: Measurement, Models, Evaluators, Simulation
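(General background on the fit statistics the abstract refers to, not taken from the article itself: in Rasch-family models a rating x_{ni} has a model expectation E_{ni} and variance W_{ni}, and a common rater-level infit statistic is an information-weighted average of squared standardized residuals,
z_{ni} = \frac{x_{ni} - E_{ni}}{\sqrt{W_{ni}}}, \qquad \mathrm{Infit}_r = \frac{\sum W_{ni}\, z_{ni}^2}{\sum W_{ni}},
with values far from 1 flagging a misfitting rater whose unexpected ratings would then be examined.)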
Clauser, Brian E.; Kane, Michael; Clauser, Jerome C. – Journal of Educational Measurement, 2020
An Angoff standard setting study generally yields judgments on a number of items by a number of judges (who may or may not be nested in panels). Variability associated with judges (and possibly panels) contributes error to the resulting cut score. The variability associated with items plays a more complicated role. To the extent that the mean item…
Descriptors: Cutting Scores, Generalization, Decision Making, Standard Setting
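(A sketch of the error structure the abstract alludes to, framed in standard generalizability-theory terms rather than the article's own notation: with n_j judges fully crossed with n_i items, the error variance of the mean Angoff cut score is approximately
\sigma^2_{\mathrm{cut}} \approx \frac{\sigma^2_{j}}{n_j} + \frac{\sigma^2_{ji}}{n_j\, n_i},
and whether the item main-effect variance \sigma^2_{i} also counts as error depends on whether the items are treated as the fixed operational test or as a sample from a larger domain.)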

Norcini, John J. – Journal of Educational Measurement, 1987
Answer keys for physician and teacher licensing examinations were studied. The impact of key variability on total error of measurement was examined for answer keys constructed using the aggregate method. Results indicated that, in some cases, scorers contributed to a sizable reduction in measurement error. (Author/GDC)
Descriptors: Adults, Answer Keys, Error of Measurement, Evaluators