Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 5 |
Since 2006 (last 20 years) | 11 |
Source
Language Testing | 11 |
Author
Knoch, Ute | 3 |
Chapelle, Carol A. | 1 |
Deygers, Bart | 1 |
Ginther, April | 1 |
Hattori, Tamaki | 1 |
Jason Fan | 1 |
Khamboonruang, Apichat | 1 |
Kunnan, Antony John | 1 |
Lee, Yong-Won | 1 |
Lv, Jing | 1 |
Maeda, Yukiko | 1 |
Publication Type
Journal Articles | 11 |
Reports - Evaluative | 6 |
Information Analyses | 4 |
Reports - Research | 4 |
Opinion Papers | 1 |
Education Level
High Schools | 1 |
Location
Australia | 2 |
Japan | 1 |
Netherlands | 1 |
United Kingdom | 1 |
United States | 1 |
Assessments and Surveys
Test of English as a Foreign… | 2 |
International English… | 1 |
Test of English for… | 1 |
Knoch, Ute; Fan, Jason – Language Testing, 2024
While several test concordance tables have been published, the research underpinning such tables has rarely been examined in detail. This study aimed to survey the publicly available studies or documentation underpinning the test concordance tables of the providers of four major international language tests, all accepted by the Australian…
Descriptors: Language Tests, English, Test Validity, Item Analysis
Knoch, Ute; Deygers, Bart; Khamboonruang, Apichat – Language Testing, 2021
Rating scale development in the field of language assessment is often considered in dichotomous ways: It is assumed to be guided either by expert intuition or by drawing on performance data. Even though quite a few authors have argued that rating scale development is rarely so easily classifiable, this dyadic view has dominated language testing…
Descriptors: Rating Scales, Test Construction, Language Tests, Test Use
Knoch, Ute; Chapelle, Carol A. – Language Testing, 2018
Argument-based validation requires test developers and researchers to specify what is entailed in test interpretation and use. Doing so has been shown to yield advantages (Chapelle, Enright, & Jamieson, 2010), but it also requires an analysis of how the concerns of language testers can be conceptualized in the terms used to construct a…
Descriptors: Test Validity, Language Tests, Evaluation Research, Rating Scales
Wind, Stefanie A.; Peterson, Meghan E. – Language Testing, 2018
The use of assessments that require rater judgment (i.e., rater-mediated assessments) has become increasingly popular in high-stakes language assessments worldwide. Using a systematic literature review, the purpose of this study is to identify and explore the dominant methods for evaluating rating quality within the context of research on…
Descriptors: Language Tests, Evaluators, Evaluation Methods, Interrater Reliability
Lee, Yong-Won – Language Testing, 2015
Diagnostic language assessment (DLA) is gaining a lot of attention from language teachers, testers, and applied linguists. With a recent surge of interest in DLA, there seems to be an urgent need to assess where the field of DLA stands at the moment and develop a general sense of where it should be moving in the future. The current article, as the…
Descriptors: Diagnostic Tests, Language Tests, Evaluation Research, Feedback (Response)
Elicited Imitation as a Measure of Second Language Proficiency: A Narrative Review and Meta-Analysis
Yan, Xun; Maeda, Yukiko; Lv, Jing; Ginther, April – Language Testing, 2016
Elicited imitation (EI) has been widely used to examine second language (L2) proficiency and development and was an especially popular method in the 1970s and early 1980s. However, as the field embraced more communicative approaches to both instruction and assessment, the use of EI diminished, and the construct-related validity of EI scores as a…
Descriptors: Second Language Learning, Language Proficiency, Meta Analysis, Effect Size
McNamara, Tim; Knoch, Ute – Language Testing, 2012
This paper examines the uptake of Rasch measurement in language testing through a consideration of research published in language testing research journals in the period 1984 to 2009. Following the publication of the first papers on this topic, exploring the potential of the simple Rasch model for the analysis of dichotomous language test data, a…
Descriptors: Language Tests, Testing, English (Second Language), Item Response Theory
Xi, Xiaoming – Language Testing, 2010
Previous test fairness frameworks have greatly expanded the scope of fairness, but do not provide a means to fully integrate fairness investigations and set priorities. This article proposes an approach to guide practitioners on fairness research and practices. This approach treats fairness as an aspect of validity and conceptualizes it as…
Descriptors: Test Results, Language Tests, Test Validity, English (Second Language)
Kunnan, Antony John – Language Testing, 2010
This paper presents the author's response to Xiaoming Xi's article titled "How do we go about investigating test fairness?" In this response, the author focuses on test fairness and Toulmin's model of argument structure, Xi's proposal, and the challenges the proposal brings. Xi proposes an approach to investigating test fairness to guide…
Descriptors: Persuasive Discourse, Inferences, Test Bias, Models
Saida, Chisato; Hattori, Tamaki – Language Testing, 2008
Despite growing concerns about declining scholastic abilities of Japanese students throughout Japan prior to the implementation of the revised Courses of Study in 2002, little empirical evidence was available at that time to support this perceived decline in academic performance. This research describes post-hoc IRT equating of previously…
Descriptors: Language Tests, Measures (Individuals), Foreign Countries, Item Response Theory
Pae, Tae-Il; Park, Gi-Pyo – Language Testing, 2006
The present study utilized both the IRT-LR (item response theory likelihood ratio) and a series of CFA (confirmatory factor analysis) multi-sample analyses to systematically examine the relationships between DIF (differential item functioning) and DTF (differential test functioning) with a random sample of 15 000 Korean examinees. Specifically,…
Descriptors: Item Response Theory, Factor Analysis, Test Bias, Test Validity