ERIC - Search Results

Showing all 3 results

Grading Exams Using Large Language Models: A Comparison between Human and AI Grading of Exams in Higher Education Using ChatGPT

Peer reviewed

Jonas Flodén – British Educational Research Journal, 2025

This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…

Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring

Evidence-Based Evaluation of Student and Marker Performances in Assessment and Examination

Peer reviewed

Ole J. Kemi – Advances in Physiology Education, 2025

Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…

Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards

Impact of Sample Size and Variability on the Power and Type I Error Rates of Equivalence Tests: A Simulation Study

Peer reviewed
PDF on ERIC

Rusticus, Shayna A.; Lovato, Chris Y. – Practical Assessment, Research & Evaluation, 2014

The question of equivalence between two or more groups is frequently of interest to many applied researchers. Equivalence testing is a statistical method designed to provide evidence that groups are comparable by demonstrating that the mean differences found between groups are small enough that they are considered practically unimportant. Few…

Descriptors: Sample Size, Equivalency Tests, Simulation, Error of Measurement