ERIC - Search Results

Showing all 5 results

The Sensitivity of Value-Added Estimates to Test Scoring Decisions. EdWorkingPaper No. 25-1226

Joshua B. Gilbert; James G. Soland; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2025

Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) may affect VA estimates is less studied. We examine the…

Descriptors: Value Added Models, Tests, Testing, Scoring

Evaluating the Consistency and Reliability of Attribution Methods in Automated Short Answer Grading (ASAG) Systems: Toward an Explainable Scoring System

Peer reviewed

Wallace N. Pinto Jr.; Jinnie Shin – Journal of Educational Measurement, 2025

In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates…

Descriptors: Automation, Grading, Computer Assisted Testing, Scoring

Scoring Difficulty in Summary Writing Assessment: Toward the Reconstruction of Analytic Rubric

Peer reviewed

Makiko Kato – Journal of Education and Learning, 2025

This study aims to examine whether differences exist in the factors influencing the difficulty of scoring English summaries and determining scores based on the raters' attributes, and to collect candid opinions, considerations, and tentative suggestions for future improvements to the analytic rubric of summary writing for English learners. In this…

Descriptors: Writing Evaluation, Scoring, Writing Skills, English (Second Language)

Comparative Judgement for Evaluating Young Learners' EFL Writing Performances: Reliability and Teacher Perceptions of Holistic and Dimension-Based Judgements

Peer reviewed

Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025

Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

Decoding Student Insights: Analyzing Response Change in NAEP Mathematics Constructed Response Items

Peer reviewed

Congning Ni; Bhashithe Abeysinghe; Juanita Hicks – International Electronic Journal of Elementary Education, 2025

The National Assessment of Educational Progress (NAEP), often referred to as The Nation's Report Card, offers a window into the state of the U.S. K-12 education system. Since 2017, NAEP has transitioned to digital assessments, opening new research opportunities that were previously impossible. Process data tracks students' interactions with the…

Descriptors: Reaction Time, Multiple Choice Tests, Behavior Change, National Competency Tests
