Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods
Ping-Lin Chuang – Language Testing, 2025
This experimental study explores how source use features impact raters' judgment of argumentation in a second language (L2) integrated writing test. One hundred four experienced and novice raters were recruited to complete a rating task that simulated the scoring assignment of a local English Placement Test (EPT). Sixty written responses were…
Descriptors: Interrater Reliability, Evaluators, Information Sources, Primary Sources
Huiying Cai; Xun Yan – Language Testing, 2024
Rater comments tend to be qualitatively analyzed to indicate raters' application of rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful, behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The…
Descriptors: Natural Language Processing, Item Response Theory, Rating Scales, Writing Evaluation
Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025
Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…
Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction