ERIC - Search Results

Showing all 4 results

Grading Exams Using Large Language Models: A Comparison between Human and AI Grading of Exams in Higher Education Using ChatGPT

Peer reviewed

Jonas Flodén – British Educational Research Journal, 2025

This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…

Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring

The Sensitivity of Value-Added Estimates to Test Scoring Decisions. EdWorkingPaper No. 25-1226

Joshua B. Gilbert; James G. Soland; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2025

Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) may affect VA estimates is less studied. We examine the…

Descriptors: Value Added Models, Tests, Testing, Scoring

Statistically Guided Grading Judgements: Contextualisation or Contamination?

Peer reviewed

Louise Badham – Oxford Review of Education, 2025

Different sources of assessment evidence are reviewed during International Baccalaureate (IB) grade awarding to convert marks into grades and ensure fair results for students. Qualitative and quantitative evidence are analysed to determine grade boundaries, with statistical evidence weighed against examiner judgement and teachers' feedback on…

Descriptors: Advanced Placement Programs, Grading, Interrater Reliability, Evaluative Thinking

Initial Evidence Supporting Interpretations of Scores from the Enhanced ACT Test. ACT Research. Research Report. R2425

Jeff Allen; Ty Cruce – ACT Education Corp., 2025

This report summarizes some of the evidence supporting interpretations of scores from the enhanced ACT, focusing on reliability, concurrent validity, predictive validity, and score comparability. The authors argue that the evidence presented in this report supports the interpretation of scores from the enhanced ACT as measures of high school…

Descriptors: College Entrance Examinations, Testing, Change, Scores
