ERIC - Search Results

Showing all 5 results

The Vulnerability of AI-Based Scoring Systems to Gaming Strategies: A Case Study

Peer reviewed

Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025

Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…

Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy

Statistical Feature Training Improves Fingerprint-Matching Accuracy in Novices and Professional Fingerprint Examiners

Peer reviewed

Growns, Bethany; Towler, Alice; Dunn, James D.; Salerno, Jessica M.; Schweitzer, N. J.; Dror, Itiel E. – Cognitive Research: Principles and Implications, 2022

Forensic science practitioners compare visual evidence samples (e.g. fingerprints) and decide if they originate from the same person or different people (i.e. fingerprint 'matching'). These tasks are perceptually and cognitively complex--even practising professionals can make errors--and what limited research exists suggests that existing…

Descriptors: Crime, Evidence, Sampling, Statistics Education

Towards More Valid Scoring Criteria for Integrated Reading-Writing and Listening-Writing Summary Tasks

Peer reviewed

Chan, Sathena; May, Lyn – Language Testing, 2023

Despite the increased use of integrated tasks in high-stakes academic writing assessment, research on rating criteria which reflect the unique construct of integrated summary writing skills is comparatively rare. Using a mixed-method approach of expert judgement, text analysis, and statistical analysis, this study examines writing features that…

Descriptors: Scoring, Writing Evaluation, Reading Tests, Listening Skills

Investigating the Impact of Rater Training on Rater Errors in the Process of Assessing Writing Skill

Peer reviewed

Sata, Mehmet; Karakaya, Ismail – International Journal of Assessment Tools in Education, 2022

In the process of measuring and assessing high-level cognitive skills, interference of rater errors in measurements brings about a constant concern and low objectivity. The main purpose of this study was to investigate the impact of rater training on rater errors in the process of assessing individual performance. The study was conducted with a…

Descriptors: Evaluators, Training, Comparative Analysis, Academic Language

Validation of an Automated Procedure for Calculating Core Lexicon from Transcripts

Peer reviewed

Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022

Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…

Descriptors: Validity, Discourse Analysis, Databases, Scoring
