ERIC - Search Results

Showing all 7 results

A Rubric for the Detection of Students in Crisis

Peer reviewed

Burkhardt, Amy; Lottridge, Susan; Woolf, Sherri – Educational Measurement: Issues and Practice, 2021

For some students, standardized tests serve as a conduit to disclose sensitive issues of harm or distress that may otherwise go unreported. By detecting this writing, known as "crisis papers," testing programs have a unique opportunity to assist in mitigating the risk of harm to these students. The use of machine learning to…

Descriptors: Scoring Rubrics, Identification, At Risk Students, Standardized Tests

Examining the Psychometric Impact of Targeted and Random Double-Scoring in Mixed-Format Assessments

Peer reviewed

Xu, Yangmeng; Wind, Stefanie A. – Educational Measurement: Issues and Practice, 2025

Double-scoring constructed-response items is a common but costly practice in mixed-format assessments. This study explored the impacts of Targeted Double-Scoring (TDS) and random double-scoring procedures on the quality of psychometric outcomes, including student achievement estimates, person fit, and student classifications under various…

Descriptors: Academic Achievement, Psychometrics, Scoring, Evaluation Methods

To Score or Not to Score: Factors Influencing Performance and Feasibility of Automatic Content Scoring of Text Responses

Peer reviewed

Zesch, Torsten; Horbach, Andrea; Zehner, Fabian – Educational Measurement: Issues and Practice, 2023

In this article, we systematize the factors influencing performance and feasibility of automatic content scoring methods for short text responses. We argue that performance (i.e., how well an automatic system agrees with human judgments) mainly depends on the linguistic variance seen in the responses and that this variance is indirectly influenced…

Descriptors: Influences, Academic Achievement, Feasibility Studies, Automation

Using Active Learning Methods to Strategically Select Essays for Automated Scoring

Peer reviewed

Firoozi, Tahereh; Mohammadi, Hamid; Gierl, Mark J. – Educational Measurement: Issues and Practice, 2023

Research on Automated Essay Scoring has become increasingly important because it serves as a method for evaluating students' written responses at scale. Scalable methods for scoring written responses are needed as students migrate to online learning environments, resulting in the need to evaluate large numbers of written-response assessments. The…

Descriptors: Active Learning, Automation, Scoring, Essays

Setting and Validating Multiple Standards on a Multistage-Adaptive Test

Peer reviewed

Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022

Setting cut scores on multistage-adaptive tests (MSTs) is difficult, particularly when the test spans several grade levels, and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…

Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis

A Problem with the Bookmark Procedure's Correction for Guessing

Peer reviewed

Baldwin, Peter – Educational Measurement: Issues and Practice, 2021

In the Bookmark standard-setting procedure, panelists are instructed to consider what examinees know rather than what they might attain by guessing; however, because examinees sometimes do guess, the procedure includes a correction for guessing. Like other corrections for guessing, the Bookmark's correction assumes that examinees either know the…

Descriptors: Guessing (Tests), Student Evaluation, Evaluation Methods, Standard Setting (Scoring)

Bilevel Topic Model-Based Multitask Learning for Constructed-Responses Multidimensional Automated Scoring and Interpretation

Peer reviewed

Xiong, Jiawei; Li, Feiming – Educational Measurement: Issues and Practice, 2023

Multidimensional scoring evaluates each constructed-response answer from more than one rating dimension and/or trait such as lexicon, organization, and supporting ideas instead of only one holistic score, to help students distinguish between various dimensions of writing quality. In this work, we present a bilevel learning model for combining two…

Descriptors: Scoring, Models, Task Analysis, Learning Processes
