ERIC - Search Results

Publication Date

In 2025	1
Since 2024	1
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	2
Since 2006 (last 20 years)	4

Descriptor

English (Second Language)	4
Essays	4
Second Language Instruction	4
Second Language Learning	4
Evaluators	3
Scores	3
Writing Evaluation	3
Computer Software	2
Decision Making	2
Grammar	2
Language Proficiency	2
Language Teachers	2
Protocol Analysis	2
Scoring	2
Writing (Composition)	2
Artificial Intelligence	1
College Students	1
Comparative Analysis	1
Computational Linguistics	1
Computer Assisted Instruction	1
Electronic Mail	1
Error Correction	1
Ethics	1
Ethnicity	1
Feedback (Response)	1

Source

Language Testing

Author

Barkaoui, Khaled	1
Chodorow, Martin	1
Gamon, Michael	1
Razi, Salim	1
Sahan, Özgür	1
Taichi Yamashita	1
Tetreault, Joel	1

Publication Type

Journal Articles	4
Reports - Research	2
Reports - Evaluative	1

Education Level

Higher Education	3
Postsecondary Education	1
Secondary Education	1

Location

Turkey

Showing all 4 results

Exploring Potential Biases in GPT-4o's Ratings of English Language Learners' Essays

Peer reviewed

Taichi Yamashita – Language Testing, 2025

With the rapid development of generative artificial intelligence (AI) frameworks (e.g., the generative pre-trained transformer [GPT]), a growing number of researchers have started to explore its potential as an automated essay scoring (AES) system. While previous studies have investigated the alignment between human ratings and GPT ratings, few…

Descriptors: Artificial Intelligence, English (Second Language), Second Language Learning, Second Language Instruction

Do Experience and Text Quality Matter for Raters' Decision-Making Behaviors?

Peer reviewed

Sahan, Özgür; Razi, Salim – Language Testing, 2020

This study examines the decision-making behaviors of raters with varying levels of experience while assessing EFL essays of distinct qualities. The data were collected from 28 raters with varying levels of rating experience and working at the English language departments of different universities in Turkey. Using a 10-point analytic rubric, each…

Descriptors: Decision Making, Essays, Writing Evaluation, Evaluators

Think-Aloud Protocols in Research on Essay Rating: An Empirical Study of Their Veridicality and Reactivity

Peer reviewed

Barkaoui, Khaled – Language Testing, 2011

Think-aloud protocols (TAPs) are frequently used in research on essay rating processes. However, there are very few empirical studies of the completeness of TAP data and the effects of this technique on rater performance (i.e., rating processes and outcomes). This study aims to start to address this research gap. As part of a larger study on rater…

Descriptors: Protocol Analysis, Rating Scales, Essays, English (Second Language)

The Utility of Article and Preposition Error Correction Systems for English Language Learners: Feedback and Assessment

Peer reviewed

Chodorow, Martin; Gamon, Michael; Tetreault, Joel – Language Testing, 2010

In this paper, we describe and evaluate two state-of-the-art systems for identifying and correcting writing errors involving English articles and prepositions. Criterion℠, developed by Educational Testing Service, and "ESL Assistant", developed by Microsoft Research, both use machine learning techniques to build models of article…

Descriptors: Grammar, Feedback (Response), Form Classes (Languages), Second Language Learning