Xu, Yangmeng; Wind, Stefanie A. – Educational Measurement: Issues and Practice, 2025
Double-scoring constructed-response items is a common but costly practice in mixed-format assessments. This study explored the impacts of Targeted Double-Scoring (TDS) and random double-scoring procedures on the quality of psychometric outcomes, including student achievement estimates, person fit, and student classifications under various…
Descriptors: Academic Achievement, Psychometrics, Scoring, Evaluation Methods
Burkhardt, Amy; Lottridge, Susan; Woolf, Sherri – Educational Measurement: Issues and Practice, 2021
For some students, standardized tests serve as a conduit to disclose sensitive issues of harm or distress that may otherwise go unreported. By detecting this writing, known as "crisis papers," testing programs have a unique opportunity to assist in mitigating the risk of harm to these students. The use of machine learning to…
Descriptors: Scoring Rubrics, Identification, At Risk Students, Standardized Tests
Solano-Flores, Guillermo – Educational Measurement: Issues and Practice, 2021
This article proposes a Boolean approach to representing and analyzing interobserver agreement in dichotomous coding. Building on the notion that observations are samples of a universe of observations, it submits that coding can be viewed as a process in which observers sample pieces of evidence on constructs. It distinguishes between formal and…
Descriptors: Online Searching, Coding, Interrater Reliability, Evidence
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making