ERIC - Search Results

Showing 1 to 15 of 37 results

Evaluating Methodological Enhancements to the Yes/No Angoff Standard-Setting Method in Language Proficiency Assessment

Peer reviewed

Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024

This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…

Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods

Do Source Use Features Impact Raters' Judgment of Argumentation? An Experimental Study

Peer reviewed

Ping-Lin Chuang – Language Testing, 2025

This experimental study explores how source use features impact raters' judgment of argumentation in a second language (L2) integrated writing test. One hundred four experienced and novice raters were recruited to complete a rating task that simulated the scoring assignment of a local English Placement Test (EPT). Sixty written responses were…

Descriptors: Interrater Reliability, Evaluators, Information Sources, Primary Sources

Triangulating Natural Language Processing (NLP)-Based Analysis of Rater Comments and Many-Facet Rasch Measurement (MFRM): An Innovative Approach to Investigating Raters' Application of Rating Scales in Writing Assessment

Peer reviewed

Huiying Cai; Xun Yan – Language Testing, 2024

Rater comments tend to be qualitatively analyzed to indicate raters' application of rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful, behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The…

Descriptors: Natural Language Processing, Item Response Theory, Rating Scales, Writing Evaluation

Comparative Judgement for Evaluating Young Learners' EFL Writing Performances: Reliability and Teacher Perceptions of Holistic and Dimension-Based Judgements

Peer reviewed

Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025

Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

But Who Trains the Language Teacher Educator Who Trains the Language Teacher? An Empirical Investigation of Chilean EFL Teacher Educators' Language Assessment Literacy

Peer reviewed

Villa Larenas, Salomé; Brunfaut, Tineke – Language Testing, 2023

Research has shown that language teachers typically feel underprepared for assessment aspects of their job. One reason may relate to how teacher education programmes prepare future teachers in this area. Research insights into how and to what extent teacher educators train future language teachers in language assessment matters are scarce,…

Descriptors: Foreign Countries, Second Language Instruction, Language Teachers, Teacher Educators

Towards More Valid Scoring Criteria for Integrated Reading-Writing and Listening-Writing Summary Tasks

Peer reviewed

Chan, Sathena; May, Lyn – Language Testing, 2023

Despite the increased use of integrated tasks in high-stakes academic writing assessment, research on rating criteria which reflect the unique construct of integrated summary writing skills is comparatively rare. Using a mixed-method approach of expert judgement, text analysis, and statistical analysis, this study examines writing features that…

Descriptors: Scoring, Writing Evaluation, Reading Tests, Listening Skills

A Systematic Review of Methods for Evaluating Rating Quality in Language Assessment

Peer reviewed

Wind, Stefanie A.; Peterson, Meghan E. – Language Testing, 2018

The use of assessments that require rater judgment (i.e., rater-mediated assessments) has become increasingly popular in high-stakes language assessments worldwide. Using a systematic literature review, the purpose of this study is to identify and explore the dominant methods for evaluating rating quality within the context of research on…

Descriptors: Language Tests, Evaluators, Evaluation Methods, Interrater Reliability

A Comparative Judgment Approach to Assessing Chinese Sign Language Interpreting

Peer reviewed

Han, Chao; Xiao, Xiaoyan – Language Testing, 2022

The quality of sign language interpreting (SLI) is a gripping construct among practitioners, educators and researchers, calling for reliable and valid assessment. There has been a diverse array of methods in the extant literature to measure SLI quality, ranging from traditional error analysis to recent rubric scoring. In this study, we want to…

Descriptors: Comparative Analysis, Sign Language, Deaf Interpreting, Evaluators

Reference to a Past Learning Event as a Practice of Informal Formative Assessment in L2 Classroom Interaction

Peer reviewed

Can Daskin, Nilüfer; Hatipoglu, Çiler – Language Testing, 2019

In this study we are concerned with the informal dimension of formative assessment (FA) in an L2 classroom. We examine those instances that are embedded into everyday learning activities and that emerge in and through classroom interaction contingently, continuously and flexibly. Drawing on the methodological underpinnings of Conversation Analysis…

Descriptors: Formative Evaluation, Classroom Communication, Second Language Learning, Evaluation Methods

Critical Language Assessment Literacy of EFL Teachers: Scale Construction and Validation

Peer reviewed

Tajeddin, Zia; Khatib, Mohammad; Mahdavi, Mohsen – Language Testing, 2022

Critical language assessment (CLA) has been addressed in numerous studies. However, the majority of the studies have overlooked the need for a practical framework to measure the CLA dimension of teachers' language assessment literacy (LAL). This gap prompted us to develop and validate a critical language assessment literacy (CLAL) scale to further…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Language Tests

Working with Sparse Data in Rated Language Tests: Generalizability Theory Applications

Peer reviewed

Lin, Chih-Kai – Language Testing, 2017

Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…

Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy

Functional Adequacy in L2 Writing: Towards a New Rating Scale

Peer reviewed

Kuiken, Folkert; Vedder, Ineke – Language Testing, 2017

The importance of functional adequacy as an essential component of L2 proficiency has been observed by several authors (Pallotti, 2009; De Jong, Steinel, Florijn, Schoonen, & Hulstijn, 2012a, b). The rationale underlying the present study is that the assessment of writing proficiency in L2 is not fully possible without taking into account the…

Descriptors: Second Language Learning, Rating Scales, Computational Linguistics, Persuasive Discourse

Validity Argument for Assessing L2 Pragmatics in Interaction Using Mixed Methods

Peer reviewed

Youn, Soo Jung – Language Testing, 2015

This study investigates the validity of assessing L2 pragmatics in interaction using mixed methods, focusing on the evaluation inference. Open role-plays that are meaningful and relevant to the stakeholders in an English for Academic Purposes context were developed for classroom assessment. For meaningful score interpretations and accurate…

Descriptors: Second Language Learning, Pragmatics, Validity, Mixed Methods Research

Partial Dictation as a Measure of EFL Listening Proficiency: Evidence from Confirmatory Factor Analysis

Peer reviewed

Cai, Hongwen – Language Testing, 2013

Partial dictation is a measure of EFL listening proficiency that can be easily constructed, administered, and scored by EFL teachers. However, it is controversial whether this form of test measures lower-order abilities exclusively or involves both lower- and higher-order abilities. In order to answer this question, a study was designed to examine…

Descriptors: Factor Analysis, Listening Comprehension Tests, English (Second Language), Foreign Countries

Developing a Comprehensive, Empirically Based Research Framework for Classroom-Based Assessment

Peer reviewed

Hill, Kathryn; McNamara, Tim – Language Testing, 2012

This paper presents a comprehensive framework for researching classroom-based assessment (CBA) processes, and is based on a detailed empirical study of two Australian school classrooms where students aged 11 to 13 were studying Indonesian as a foreign language. The framework can be considered innovative in several respects. It goes beyond the…

Descriptors: Student Evaluation, Second Language Learning, Classroom Environment, Literacy
