ERIC - Search Results

Publication Date

In 2026	0
Since 2025	1
Since 2022 (last 5 years)	2
Since 2017 (last 10 years)	6
Since 2007 (last 20 years)	7

Source

Language Testing

Author

Attali, Yigal	1
Gierl, Mark J.	1
Janssen, Gerriet	1
John Pill	1
Lewis, Will	1
Lidster, Ryan	1
Lin, Chih-Kai	1
Meier, Valerie	1
Olson, Daniel J.	1
Rebecca Sickinger	1
Shin, Jinnie	1
Shin, Sun-Young	1
Steier, Michael	1
Tineke Brunfaut	1
Trace, Jonathan	1
Publication Type

Journal Articles	7
Reports - Research	6
Reports - Evaluative	1

Education Level

Higher Education	2
Postsecondary Education	2
Secondary Education	1

Location

Austria	1
Colombia	1

Assessments and Surveys

Graduate Record Examinations

Showing all 7 results

Comparative Judgement for Evaluating Young Learners' EFL Writing Performances: Reliability and Teacher Perceptions of Holistic and Dimension-Based Judgements

Peer reviewed

Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025

Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

More Efficient Processes for Creating Automated Essay Scoring Frameworks: A Demonstration of Two Algorithms

Peer reviewed

Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021

Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…

Descriptors: Scoring, Essays, Writing Evaluation, Computer Software

Measuring Bilingual Language Dominance: An Examination of the Reliability of the Bilingual Language Profile

Peer reviewed

Olson, Daniel J. – Language Testing, 2023

Measuring language dominance, broadly defined as the relative strength of each of a bilingual's two languages, remains a crucial methodological issue in bilingualism research. While various methods have been proposed, the Bilingual Language Profile (BLP) has been one of the most widely used tools for measuring language dominance. While previous…

Descriptors: Bilingualism, Language Dominance, Native Language, Second Language Learning

Scoring with the Computer: Alternative Procedures for Improving the Reliability of Holistic Essay Scoring

Peer reviewed

Attali, Yigal; Lewis, Will; Steier, Michael – Language Testing, 2013

Automated essay scoring can produce reliable scores that are highly correlated with human scores, but is limited in its evaluation of content and other higher-order aspects of writing. The increased use of automated essay scoring in high-stakes testing underscores the need for human scoring that is focused on higher-order aspects of writing. This…

Descriptors: Scoring, Essay Tests, Reliability, High Stakes Tests

Working with Sparse Data in Rated Language Tests: Generalizability Theory Applications

Peer reviewed

Lin, Chih-Kai – Language Testing, 2017

Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…

Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy

Measuring the Impact of Rater Negotiation in Writing Performance Assessment

Peer reviewed

Trace, Jonathan; Janssen, Gerriet; Meier, Valerie – Language Testing, 2017

Previous research in second language writing has shown that when scoring performance assessments even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require…

Descriptors: Performance Based Assessment, Second Language Learning, Scoring, Evaluators

Evaluating Different Standard-Setting Methods in an ESL Placement Testing Context

Peer reviewed

Shin, Sun-Young; Lidster, Ryan – Language Testing, 2017

In language programs, it is crucial to place incoming students into appropriate levels to ensure that course curriculum and materials are well targeted to their learning needs. Deciding how and where to set cutscores on placement tests is thus of central importance to programs, but previous studies in educational measurement disagree as to which…

Descriptors: Language Tests, English (Second Language), Standard Setting (Scoring), Student Placement

Descriptor

Reliability	7
Scoring	6
Scores	5
Comparative Analysis	3
English (Second Language)	3
Evaluators	3
Language Tests	3
Second Language Learning	3
English for Academic Purposes	2
Evaluation Methods	2
Foreign Countries	2
High Stakes Tests	2
Interrater Reliability	2
Performance Based Assessment	2
Validity	2
Writing Evaluation	2
Accuracy	1
Artificial Intelligence	1
Automation	1
Bias	1
Bilingualism	1
College Entrance Examinations	1
College Second Language…	1
College Students	1
Computer Software	1