Showing all 7 results
Peer reviewed
Burkhardt, Amy; Lottridge, Susan; Woolf, Sherri – Educational Measurement: Issues and Practice, 2021
For some students, standardized tests serve as a conduit to disclose sensitive issues of harm or distress that may otherwise go unreported. By detecting this writing, known as "crisis papers," testing programs have a unique opportunity to assist in mitigating the risk of harm to these students. The use of machine learning to…
Descriptors: Scoring Rubrics, Identification, At Risk Students, Standardized Tests
Peer reviewed
Yangmeng Xu; Stefanie A. Wind – Educational Measurement: Issues and Practice, 2025
Double-scoring constructed-response items is a common but costly practice in mixed-format assessments. This study explored the impacts of Targeted Double-Scoring (TDS) and random double-scoring procedures on the quality of psychometric outcomes, including student achievement estimates, person fit, and student classifications under various…
Descriptors: Academic Achievement, Psychometrics, Scoring, Evaluation Methods
Peer reviewed
Zesch, Torsten; Horbach, Andrea; Zehner, Fabian – Educational Measurement: Issues and Practice, 2023
In this article, we systematize the factors influencing performance and feasibility of automatic content scoring methods for short text responses. We argue that performance (i.e., how well an automatic system agrees with human judgments) mainly depends on the linguistic variance seen in the responses and that this variance is indirectly influenced…
Descriptors: Influences, Academic Achievement, Feasibility Studies, Automation
Peer reviewed
Firoozi, Tahereh; Mohammadi, Hamid; Gierl, Mark J. – Educational Measurement: Issues and Practice, 2023
Research on Automated Essay Scoring has become increasingly important because it serves as a method for evaluating students' written responses at scale. Scalable methods for scoring written responses are needed as students migrate to online learning environments resulting in the need to evaluate large numbers of written-response assessments. The…
Descriptors: Active Learning, Automation, Scoring, Essays
Peer reviewed
Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022
Setting cut scores on multistage tests (MSTs) is difficult, particularly when the test spans several grade levels, and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…
Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis
Peer reviewed
Baldwin, Peter – Educational Measurement: Issues and Practice, 2021
In the Bookmark standard-setting procedure, panelists are instructed to consider what examinees know rather than what they might attain by guessing; however, because examinees sometimes do guess, the procedure includes a correction for guessing. Like other corrections for guessing, the Bookmark's correction assumes that examinees either know the…
Descriptors: Guessing (Tests), Student Evaluation, Evaluation Methods, Standard Setting (Scoring)
Peer reviewed
Xiong, Jiawei; Li, Feiming – Educational Measurement: Issues and Practice, 2023
Multidimensional scoring evaluates each constructed-response answer from more than one rating dimension and/or trait such as lexicon, organization, and supporting ideas instead of only one holistic score, to help students distinguish between various dimensions of writing quality. In this work, we present a bilevel learning model for combining two…
Descriptors: Scoring, Models, Task Analysis, Learning Processes