ERIC - Search Results

Showing all 9 results

A Review of Automatic Item Generation Techniques Leveraging Large Language Models

Peer reviewed
Bin Tan; Nour Armoush; Elisabetta Mazzullo; Okan Bulut; Mark J. Gierl – International Journal of Assessment Tools in Education, 2025

This study reviews existing research on the use of large language models (LLMs) for automatic item generation (AIG). We performed a comprehensive literature search across seven research databases, selected studies based on predefined criteria, and summarized 60 relevant studies that employed LLMs in the AIG process. We identified the most commonly…

Descriptors: Artificial Intelligence, Test Items, Automation, Test Format

How Can Valid and Reliable Automatic Formative Assessment Predict the Acquisition of Learning Outcomes?

Peer reviewed

Blaženka Divjak; Barbi Svetec; Damir Horvat – Journal of Computer Assisted Learning, 2024

Background: Sound learning design should be based on the constructive alignment of intended learning outcomes (LOs), teaching and learning activities and formative and summative assessment. Assessment validity strongly relies on its alignment with LOs. Valid and reliable formative assessment can be analysed as a predictor of students' academic…

Descriptors: Automation, Formative Evaluation, Test Validity, Test Reliability

Automatic Modelling of Perceptual Judges in the Context of Head and Neck Cancer Speech Intelligibility

Peer reviewed

Sebastião Quintas; Mathieu Balaguer; Julie Mauclair; Virginie Woisard; Julien Pinquier – International Journal of Language & Communication Disorders, 2024

Background: Perceptual measures such as speech intelligibility are known to be biased, variant and subjective, to which an automatic approach has been seen as a more reliable alternative. On the other hand, automatic approaches tend to lack explainability, an aspect that can prevent the widespread usage of these technologies clinically. Aims: In…

Descriptors: Speech Communication, Cancer, Human Body, Intelligibility

Evaluating the Consistency and Reliability of Attribution Methods in Automated Short Answer Grading (ASAG) Systems: Toward an Explainable Scoring System

Peer reviewed

Wallace N. Pinto Jr.; Jinnie Shin – Journal of Educational Measurement, 2025

In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates…

Descriptors: Automation, Grading, Computer Assisted Testing, Scoring

Development of a New Measure of Cognitive Ability Using Automatic Item Generation and Its Psychometric Properties

Peer reviewed

Ryoo, Ji Hoon; Park, Sunhee; Suh, Hongwook; Choi, Jaehwa; Kwon, Jongkyum – SAGE Open, 2022

In the development of cognitive science understanding human intelligence and mind, measurement of cognitive ability has played a key role. To address the development in data scientific point of views related to cognitive neuroscience, there has been a demand of creating a measurement to capture cognition in short and repeated time periods. This…

Descriptors: Cognitive Ability, Psychometrics, Test Validity, Test Construction

A Novel Automated Essay Scoring Approach for Reliable Higher Educational Assessments

Peer reviewed

Beseiso, Majdi; Alzubi, Omar A.; Rashaideh, Hasan – Journal of Computing in Higher Education, 2021

E-learning is gradually gaining prominence in higher education, with universities enlarging provision and more students getting enrolled. The effectiveness of automated essay scoring (AES) is thus holding a strong appeal to universities for managing an increasing learning interest and reducing costs associated with human raters. The growth in…

Descriptors: Automation, Scoring, Essays, Writing Tests

Digital-First Assessments: A Security Framework

Peer reviewed

LaFlair, Geoffrey T.; Langenfeld, Thomas; Baig, Basim; Horie, André Kenji; Attali, Yigal; von Davier, Alina A. – Journal of Computer Assisted Learning, 2022

Background: Digital-first assessments leverage the affordances of technology in all elements of the assessment process--from design and development to score reporting and evaluation to create test taker-centric assessments. Objectives: The goal of this paper is to describe the engineering, machine learning, and psychometric processes and…

Descriptors: Computer Assisted Testing, Affordances, Scoring, Engineering

Complementary Strengths? Evaluation of a Hybrid Human-Machine Scoring Approach for a Test of Oral Academic English

Peer reviewed

Davis, Larry; Papageorgiou, Spiros – Assessment in Education: Principles, Policy & Practice, 2021

Human raters and machine scoring systems potentially have complementary strengths in evaluating language ability; specifically, it has been suggested that automated systems might be used to make consistent measurements of specific linguistic phenomena, whilst humans evaluate more global aspects of performance. We report on an empirical study that…

Descriptors: Scoring, English for Academic Purposes, Oral English, Speech Tests

Developing an Online Test to Measure Writing and Speaking Skills Automatically

Peer reviewed

Bateson, Gordon – International Journal of Computer-Assisted Language Learning and Teaching, 2021

As a result of the Japanese Ministry of Education's recent edict that students' written and spoken English should be assessed in university entrance exams, there is an urgent need for tools to help teachers and students prepare for these exams. Although some commercial tools already exist, they are generally expensive and inflexible. To address…

Descriptors: Test Construction, Computer Assisted Testing, Internet, Writing Tests
