Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 3 |
Since 2016 (last 10 years) | 7 |
Since 2006 (last 20 years) | 8 |
Descriptor
Source
Language Testing | 8 |
Author
Han, Chao | 2 |
Albert Weideman | 1 |
Barkaoui, Khaled | 1 |
Bart Deygers | 1 |
Jarvis, Scott | 1 |
Laura Schildt | 1 |
O'Hagan, Sally | 1 |
Pill, John | 1 |
Razi, Salim | 1 |
Sahan, Özgür | 1 |
Wind, Stefanie A. | 1 |
More ▼ |
Publication Type
Journal Articles | 8 |
Reports - Research | 7 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 5 |
Postsecondary Education | 3 |
High Schools | 1 |
Secondary Education | 1 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Laura Schildt; Bart Deygers; Albert Weideman – Language Testing, 2024
In the context of policy-driven language testing for citizenship, a growing body of research examines the political justifications and ethical implications of language requirements and test use. However, virtually no studies have looked at the role that language testers play in the evolution of language requirements. Critical gaps remain in our…
Descriptors: Language Tests, Citizenship, Educational Policy, Assessment Literacy
Wind, Stefanie A. – Language Testing, 2023
Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting…
Descriptors: Evaluators, Decision Making, Student Characteristics, Performance Based Assessment
Han, Chao; Xiao, Xiaoyan – Language Testing, 2022
The quality of sign language interpreting (SLI) is a gripping construct among practitioners, educators and researchers, calling for reliable and valid assessment. There has been a diverse array of methods in the extant literature to measure SLI quality, ranging from traditional error analysis to recent rubric scoring. In this study, we want to…
Descriptors: Comparative Analysis, Sign Language, Deaf Interpreting, Evaluators
Sahan, Özgür; Razi, Salim – Language Testing, 2020
This study examines the decision-making behaviors of raters with varying levels of experience while assessing EFL essays of distinct qualities. The data were collected from 28 raters with varying levels of rating experience and working at the English language departments of different universities in Turkey. Using a 10-point analytic rubric, each…
Descriptors: Decision Making, Essays, Writing Evaluation, Evaluators
Han, Chao – Language Testing, 2019
Summative assessment of interpretation is widely conducted in interpreting courses/programs to inform high-stakes decision making, such as the selection, certification, and conferral of academic degrees. Yet there has been very limited empirical research to investigate the score dependability of summative interpretation assessment. The present…
Descriptors: Generalization, Decision Making, Summative Evaluation, Evaluators
Jarvis, Scott – Language Testing, 2017
The present study discusses the relevance of measures of lexical diversity (LD) to the assessment of learner corpora. It also argues that existing measures of LD, many of which have become specialized for use with language corpora, are fundamentally measures of lexical repetition, are based on an etic perspective of language, and lack construct…
Descriptors: Computational Linguistics, English (Second Language), Second Language Learning, Native Speakers
O'Hagan, Sally; Pill, John; Zhang, Ying – Language Testing, 2016
Criticism of specific-purpose language (LSP) tests is often directed at their limited ability to represent fully the demands of the target language use situation. Such criticisms extend to the criteria used to assess test performance, which may fail to capture what matters to participants in the domain of interest. This paper reports on the…
Descriptors: Health Personnel, Language Tests, English for Special Purposes, Criticism
Barkaoui, Khaled – Language Testing, 2011
Think-aloud protocols (TAPs) are frequently used in research on essay rating processes. However, there are very few empirical studies of the completeness of TAP data and the effects of this technique on rater performance (i.e., rating processes and outcomes). This study aims to start to address this research gap. As part of a larger study on rater…
Descriptors: Protocol Analysis, Rating Scales, Essays, English (Second Language)