Publication Date
In 2025 (2)
Since 2024 (2)
Since 2021, last 5 years (6)
Since 2016, last 10 years (8)
Since 2006, last 20 years (12)
Descriptor
Scoring (13)
Writing Evaluation (13)
Evaluators (8)
Second Language Learning (8)
English (Second Language) (7)
Essays (6)
Scores (5)
Language Tests (4)
Writing Skills (4)
Comparative Analysis (3)
Computer Assisted Testing (3)
Source
Language Testing (13)
Author
Bachman, Lyle F. (1)
Bae, Jungok (1)
Bond, Trevor (1)
Chan, Kinnie Kin Yee (1)
Chan, Sathena (1)
Enright, Mary K. (1)
Gierl, Mark (1)
Gierl, Mark J. (1)
In'nami, Yo (1)
John Pill (1)
Koizumi, Rie (1)
Publication Type
Journal Articles (13)
Reports - Research (10)
Reports - Descriptive (2)
Information Analyses (1)
Reports - Evaluative (1)
Education Level
Secondary Education (4)
Elementary Education (2)
High Schools (1)
Higher Education (1)
Junior High Schools (1)
Middle Schools (1)
Postsecondary Education (1)
Location
Austria (1)
Netherlands (1)
Turkey (1)
Assessments and Surveys
Test of English as a Foreign… (1)
Sickinger, Rebecca; Brunfaut, Tineke; Pill, John – Language Testing, 2025
Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…
Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction
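
The abstract above describes CJ only in outline. Most CJ implementations derive scores with the Bradley-Terry model, which estimates a latent quality for each script from the pairwise outcomes; the sketch below is a generic illustration of that idea, not the procedure used in the article. The script labels, the sample judgements, and the simple iterative fitting routine are all illustrative assumptions.

    import math
    from collections import defaultdict

    def bradley_terry(judgements, iterations=200):
        """Estimate a quality score per script from (winner, loser) pairs."""
        scripts = {s for pair in judgements for s in pair}
        strength = {s: 1.0 for s in scripts}          # initial ability estimates
        wins = defaultdict(int)
        for winner, _ in judgements:
            wins[winner] += 1                          # assumes every script wins at least once
        for _ in range(iterations):                    # simple MM-style updates
            new = {}
            for s in scripts:
                denom = sum(1.0 / (strength[a] + strength[b])
                            for a, b in judgements if s in (a, b))
                new[s] = wins[s] / denom if denom else strength[s]
            mean = sum(new.values()) / len(new)
            strength = {s: v / mean for s, v in new.items()}   # keep the scale fixed
        return {s: math.log(v) for s, v in strength.items()}   # report on a logit-like scale

    # Hypothetical judgements: each tuple means the first script was preferred.
    judgements = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B"), ("C", "B")]
    print(bradley_terry(judgements))
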
Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021
Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Chan, Kinnie Kin Yee; Bond, Trevor; Yan, Zi – Language Testing, 2023
We investigated the relationship between the scores assigned by an Automated Essay Scoring (AES) system, the Intelligent Essay Assessor (IEA), and grades allocated by trained, professional human raters to English essay writing by instigating two procedures novel to written-language assessment: the logistic transformation of AES raw scores into…
Descriptors: Computer Assisted Testing, Essays, Scoring, Scores
Yamashita, Taichi – Language Testing, 2025
With the rapid development of generative artificial intelligence (AI) frameworks (e.g., the generative pre-trained transformer [GPT]), a growing number of researchers have started to explore its potential as an automated essay scoring (AES) system. While previous studies have investigated the alignment between human ratings and GPT ratings, few…
Descriptors: Artificial Intelligence, English (Second Language), Second Language Learning, Second Language Instruction
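
As context for the alignment question raised above: AES studies commonly report agreement between human and machine ratings as quadratic weighted kappa alongside an exact-agreement rate. The snippet below is a generic illustration with invented scores, not data or code from the study.

    from sklearn.metrics import cohen_kappa_score

    human = [3, 4, 2, 5, 3, 4, 1, 3]   # invented human ratings
    gpt   = [3, 4, 3, 5, 2, 4, 1, 4]   # invented machine ratings for the same essays

    qwk = cohen_kappa_score(human, gpt, weights="quadratic")
    exact = sum(h == g for h, g in zip(human, gpt)) / len(human)
    print(f"Quadratic weighted kappa: {qwk:.3f}, exact agreement: {exact:.2f}")
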
Latifi, Syed; Gierl, Mark – Language Testing, 2021
An automated essay scoring (AES) program is a software system that uses techniques from corpus and computational linguistics and machine learning to grade essays. In this study, we aimed to describe and evaluate particular language features of Coh-Metrix for a novel AES program that would score junior and senior high school students' essays from…
Descriptors: Writing Evaluation, Computer Assisted Testing, Scoring, Essays
Chan, Sathena; May, Lyn – Language Testing, 2023
Despite the increased use of integrated tasks in high-stakes academic writing assessment, research on rating criteria which reflect the unique construct of integrated summary writing skills is comparatively rare. Using a mixed-method approach of expert judgement, text analysis, and statistical analysis, this study examines writing features that…
Descriptors: Scoring, Writing Evaluation, Reading Tests, Listening Skills
Sahan, Özgür; Razi, Salim – Language Testing, 2020
This study examines the decision-making behaviors of raters with varying levels of experience while assessing EFL essays of distinct qualities. The data were collected from 28 raters with varying levels of rating experience and working at the English language departments of different universities in Turkey. Using a 10-point analytic rubric, each…
Descriptors: Decision Making, Essays, Writing Evaluation, Evaluators
In'nami, Yo; Koizumi, Rie – Language Testing, 2016
We addressed Deville and Chalhoub-Deville's (2006), Schoonen's (2012), and Xi and Mollaun's (2006) call for research into the contextual features that are considered related to person-by-task interactions in the framework of generalizability theory in two ways. First, we quantitatively synthesized the generalizability studies to determine the…
Descriptors: Evaluators, Second Language Learning, Writing Skills, Oral Language
Kuiken, Folkert; Vedder, Ineke – Language Testing, 2014
This study investigates the relationship in L2 writing between raters' judgments of communicative adequacy and linguistic complexity by means of six-point Likert scales, and general measures of linguistic performance. The participants were 39 learners of Italian and 32 of Dutch, who wrote two short argumentative essays. The same writing tasks…
Descriptors: Writing Evaluation, Second Language Learning, Evaluators, Native Language
Enright, Mary K.; Quinlan, Thomas – Language Testing, 2010
E-rater® is an automated essay scoring system that uses natural language processing techniques to extract features from essays and to model statistically human holistic ratings. Educational Testing Service has investigated the use of e-rater, in conjunction with human ratings, to score one of the two writing tasks on the TOEFL iBT® writing…
Descriptors: Second Language Learning, Scoring, Essays, Language Processing
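
To make the idea of feature-based scoring concrete (the abstract does not list e-rater's actual features or model), the sketch below fits an ordinary least-squares model to a few crude, hypothetical text features against human holistic ratings. It is a toy illustration of the general approach, not e-rater's feature set or statistical model.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    def extract_features(essay):
        """Crude, hypothetical features standing in for real AES feature engineering."""
        words = essay.split()
        sentences = [s for s in essay.replace("!", ".").replace("?", ".").split(".") if s.strip()]
        return [
            len(words),                                            # essay length
            len({w.lower() for w in words}) / max(len(words), 1),  # lexical diversity
            len(words) / max(len(sentences), 1),                   # mean sentence length
        ]

    # Invented training essays and holistic ratings from trained human raters.
    essays = [
        "Short essay about the topic.",
        "A longer essay that develops its argument across several sentences. It adds supporting detail and a conclusion.",
    ]
    ratings = [2.0, 4.0]

    model = LinearRegression().fit(np.array([extract_features(e) for e in essays]), ratings)
    new_essay = "Another essay to be scored automatically by the fitted model."
    print(model.predict(np.array([extract_features(new_essay)])))
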
Bae, Jungok; Bachman, Lyle F. – Language Testing, 2010
This study investigated the validity of four theoretically motivated traits of writing ability across English and Korean, based on elementary school students' responses to letter- and story-writing tasks. Their responses were scored analytically and analyzed using confirmatory factor analysis. The findings include the following. A model of writing…
Descriptors: Elementary School Students, Validity, Korean, English (Second Language)
Yu, Guoxing – Language Testing, 2007
Two kinds of scoring templates were empirically derived from summaries written by experts and students to evaluate the quality of summaries written by the students. This paper reports students' attitudes towards the use of the two templates and its differential statistical effects on the judgment of students' summarization performance. It was…
Descriptors: Student Evaluation, Student Attitudes, Democracy, Educational Assessment

Schoonen, Rob; And Others – Language Testing, 1997
Reports on three studies conducted in the Netherlands about the rating reliability of lay and expert readers in rating content and language usage of students' writing performances in three kinds of writing assignments. Findings reveal that expert readers are more reliable in rating usage, whereas both lay and expert readers are reliable raters of…
Descriptors: Foreign Countries, Interrater Reliability, Language Usage, Models