ERIC - Search Results

Showing all 3 results

The Vulnerability of AI-Based Scoring Systems to Gaming Strategies: A Case Study

Peer reviewed

Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025

Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…

Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy

Anchoring Validity Evidence for Automated Essay Scoring

Peer reviewed

Shermis, Mark D. – Journal of Educational Measurement, 2022

One of the challenges of discussing validity arguments for machine scoring of essays centers on the absence of a commonly held definition and theory of good writing. At best, the algorithms attempt to measure select attributes of writing and calibrate them against human ratings with the goal of accurate prediction of scores for new essays.…

Descriptors: Scoring, Essays, Validity, Writing Evaluation

MicroCAT Testing System Version 3.0.

Peer reviewed

Patience, Wayne – Journal of Educational Measurement, 1990

The four main subsystems of the MicroCAT Testing System for developing, administering, scoring, and analyzing computerized tests using conventional or item response theory methods are described. Judgments of three users of the system are included in the evaluation of this software. (SLD)

Descriptors: Adaptive Testing, Computer Assisted Testing, Computer Software, Computer Software Reviews