ERIC - Search Results

Showing 1 to 15 of 54 results

Comparing Two Formats of Data-Driven Rating Scales for Classroom Assessment of Pragmatic Performance with Roleplays

Peer reviewed

Yunwen Su; Sun-Young Shin – Language Testing, 2024

Rating scales that language testers design should be tailored to the specific test purpose and score use as well as reflect the target construct. Researchers have long argued for the value of data-driven scales for classroom performance assessment, because they are specific to pedagogical tasks and objectives, have rich descriptors to offer useful…

Descriptors: Rating Scales, Language Tests, Test Construction, Performance Based Assessment

Triangulating Natural Language Processing (NLP)-Based Analysis of Rater Comments and Many-Facet Rasch Measurement (MFRM): An Innovative Approach to Investigating Raters' Application of Rating Scales in Writing Assessment

Peer reviewed

Huiying Cai; Xun Yan – Language Testing, 2024

Rater comments tend to be qualitatively analyzed to indicate raters' application of rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful, behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The…

Descriptors: Natural Language Processing, Item Response Theory, Rating Scales, Writing Evaluation

Assessing Speaking through Multimodal Oral Presentations: The Case of Construct Underrepresentation in EAP Contexts

Peer reviewed

Louise Palmour – Language Testing, 2024

This article explores the nature of the construct underlying classroom-based English for academic purposes (EAP) oral presentation assessments, which are used, in part, to determine admission to programmes of study at UK universities. Through analysis of qualitative data (from questionnaires, interviews, rating discussions, and fieldnotes), the…

Descriptors: English for Academic Purposes, Public Speaking, College Students, Foreign Countries

Diagnosing Chinese EFL Learners' Writing Ability Using Polytomous Cognitive Diagnostic Models

Peer reviewed

Xiaoting Shi; Xiaomei Ma; Wenbo Du; Xuliang Gao – Language Testing, 2024

Cognitive diagnostic assessment (CDA) intends to identify learners' strengths and weaknesses in latent cognitive attributes to provide personalized remedial instructions. Previous CDA studies on English as a Foreign Language (EFL)/English as a Second Language (ESL) writing have adopted dichotomous cognitive diagnostic models (CDMs) to analyze data…

Descriptors: Writing Evaluation, Writing Tests, Diagnostic Tests, English (Second Language)

Assessing the Content Quality of Essays in Content and Language Integrated Learning: Exploring the Construct from Subject Specialists' Perspectives

Peer reviewed

Takanori Sato – Language Testing, 2024

Assessing the content of learners' compositions is a common practice in second language (L2) writing assessment. However, the construct definition of content in L2 writing assessment potentially underrepresents the target competence in content and language integrated learning (CLIL), which aims to foster not only L2 proficiency but also critical…

Descriptors: Language Tests, Content and Language Integrated Learning, Writing Evaluation, Writing Tests

Revisiting Rating Scale Development for Rater-Mediated Language Performance Assessments: Modelling Construct and Contextual Choices Made by Scale Developers

Peer reviewed

Knoch, Ute; Deygers, Bart; Khamboonruang, Apichat – Language Testing, 2021

Rating scale development in the field of language assessment is often considered in dichotomous ways: It is assumed to be guided either by expert intuition or by drawing on performance data. Even though quite a few authors have argued that rating scale development is rarely so easily classifiable, this dyadic view has dominated language testing…

Descriptors: Rating Scales, Test Construction, Language Tests, Test Use

Making Each Point Count: Revising a Local Adaptation of the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE Rubric

Peer reviewed

Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024

In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…

Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)

A New Scoring Method for Item Response Theory Analysis of C-Tests

Peer reviewed

Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025

This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…

Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

Operationalizing the Reading-into-Writing Construct in Analytic Rating Scales: Effects of Different Approaches on Rating

Peer reviewed

Lestari, Santi B.; Brunfaut, Tineke – Language Testing, 2023

Assessing integrated reading-into-writing task performances is known to be challenging, and analytic rating scales have been found to better facilitate the scoring of these performances than other common types of rating scales. However, little is known about how specific operationalizations of the reading-into-writing construct in analytic rating…

Descriptors: Reading Writing Relationship, Writing Tests, Rating Scales, Writing Processes

Setting Standards for a Diagnostic Test of Aviation English for Student Pilots

Peer reviewed

Maria Treadaway; John Read – Language Testing, 2024

Standard-setting is an essential component of test development, supporting the meaningfulness and appropriate interpretation of test scores. However, in the high-stakes testing environment of aviation, standard-setting studies are underexplored. To address this gap, we document two stages in the standard-setting procedures for the Overseas Flight…

Descriptors: Standard Setting, Diagnostic Tests, High Stakes Tests, English for Special Purposes

Validation of Rating Processes within an Argument-Based Framework

Peer reviewed

Knoch, Ute; Chapelle, Carol A. – Language Testing, 2018

Argument-based validation requires test developers and researchers to specify what is entailed in test interpretation and use. Doing so has been shown to yield advantages (Chapelle, Enright, & Jamieson, 2010), but it also requires an analysis of how the concerns of language testers can be conceptualized in the terms used to construct a…

Descriptors: Test Validity, Language Tests, Evaluation Research, Rating Scales

Towards More Valid Scoring Criteria for Integrated Reading-Writing and Listening-Writing Summary Tasks

Peer reviewed

Chan, Sathena; May, Lyn – Language Testing, 2023

Despite the increased use of integrated tasks in high-stakes academic writing assessment, research on rating criteria which reflect the unique construct of integrated summary writing skills is comparatively rare. Using a mixed-method approach of expert judgement, text analysis, and statistical analysis, this study examines writing features that…

Descriptors: Scoring, Writing Evaluation, Reading Tests, Listening Skills

Developing a Level-Specific Checklist for Assessing EFL Writing

Peer reviewed

Lukácsi, Zoltán – Language Testing, 2021

In second language writing assessment, rating scales and scores from human-mediated assessment have been criticized for a number of shortcomings including problems with adequacy, relevance, and reliability (Hamp-Lyons, 1990; McNamara, 1996; Weigle, 2002). In its testing practice, Euroexam International also detected that the rating scales for…

Descriptors: Test Construction, Test Validity, Test Items, Check Lists

Temporal Fluency and Floor/Ceiling Scoring of Intermediate and Advanced Speech on the ACTFL Spanish Oral Proficiency Interview--Computer

Peer reviewed

Cox, Troy L.; Brown, Alan V.; Thompson, Gregory L. – Language Testing, 2023

The rating of proficiency tests that use the Interagency Language Roundtable (ILR) and American Council on the Teaching of Foreign Languages (ACTFL) guidelines claims that each major level is based on hierarchical linguistic functions that require mastery of multidimensional traits in such a way that each level subsumes the levels beneath it. These…

Descriptors: Oral Language, Language Fluency, Scoring, Cues

The Effect of Response Order on Candidate Viewing Behaviour and Item Difficulty in a Multiple-Choice Listening Test

Peer reviewed

Holzknecht, Franz; McCray, Gareth; Eberharter, Kathrin; Kremmel, Benjamin; Zehentner, Matthias; Spiby, Richard; Dunlea, Jamie – Language Testing, 2021

Studies from various disciplines have reported that spatial location of options in relation to processing order impacts the ultimate choice of the option. A large number of studies have found a primacy effect, that is, the tendency to prefer the first option. In this paper we report on evidence that position of the key in four-option…

Descriptors: Language Tests, Test Items, Multiple Choice Tests, Listening Comprehension Tests
