Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 10 |
| Since 2007 (last 20 years) | 30 |
Source
| Language Testing | 30 |
Author
| Lim, Gad S. | 2 |
| Xi, Xiaoming | 2 |
| Anderson, Carolyn | 1 |
| Barkaoui, Khaled | 1 |
| Briggs, Sarah L. | 1 |
| Cheng, Junyu | 1 |
| Dimova, Slobodanka | 1 |
| Do, Juhyun | 1 |
| Doe, Christine | 1 |
| Eckes, Thomas | 1 |
| Ginther, April | 1 |
Publication Type
| Journal Articles | 30 |
| Reports - Research | 25 |
| Reports - Evaluative | 5 |
| Tests/Questionnaires | 2 |
| Opinion Papers | 1 |
Education Level
| Higher Education | 8 |
| Postsecondary Education | 3 |
| Secondary Education | 2 |
| Elementary Education | 1 |
| Elementary Secondary Education | 1 |
| Grade 6 | 1 |
| High Schools | 1 |
| Intermediate Grades | 1 |
| Middle Schools | 1 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
| Test of English as a Foreign… | 4 |
| Clinical Evaluation of… | 1 |
| Gates MacGinitie Reading Tests | 1 |
| International English… | 1 |
| Modern Language Aptitude Test | 1 |
| Wechsler Individual… | 1 |
What Works Clearinghouse Rating
Su, Yunwen; Shin, Sun-Young – Language Testing, 2024
Rating scales that language testers design should be tailored to the specific test purpose and score use as well as reflect the target construct. Researchers have long argued for the value of data-driven scales for classroom performance assessment, because they are specific to pedagogical tasks and objectives, have rich descriptors to offer useful…
Descriptors: Rating Scales, Language Tests, Test Construction, Performance Based Assessment
Wind, Stefanie A. – Language Testing, 2023
Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting…
Descriptors: Evaluators, Decision Making, Student Characteristics, Performance Based Assessment
Sok, Sarah; Shin, Hye Won; Do, Juhyun – Language Testing, 2021
Test-taker characteristics (TTCs), or individual difference variables, are known to be a systematic source of variance in language test performance. Although previous research has documented the impact of a range of TTCs on second language (L2) learners' test performance, few of these studies have involved young learners. Given that young L2…
Descriptors: Listening Comprehension Tests, Reading Comprehension, Performance Factors, Elementary School Students
Huang, Heng-Tsung Danny; Hung, Shao-Ting Alan; Plakans, Lia – Language Testing, 2018
Integrated speaking test tasks (integrated tasks) provide reading and/or listening input to serve as the basis for test-takers to formulate their oral responses. This study examined the influence of topical knowledge on integrated speaking test performance and compared independent speaking test performance and integrated speaking test performance…
Descriptors: Language Tests, Speech Tests, Comparative Analysis, English (Second Language)
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
van Batenburg, Eline S. L.; Oostdam, Ron J.; van Gelderen, Amos J. S.; de Jong, Nivja H. – Language Testing, 2018
This article explores ways to assess interactional performance, and reports on the use of a test format that standardizes the interlocutor's linguistic and interactional contributions to the exchange. It describes the construction and administration of six scripted speech tasks (instruction, advice, and sales tasks) with pre-vocational learners (n…
Descriptors: Second Language Learning, Speech Tests, Interaction, Test Reliability
Trace, Jonathan; Janssen, Gerriet; Meier, Valerie – Language Testing, 2017
Previous research in second language writing has shown that when scoring performance assessments even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require…
Descriptors: Performance Based Assessment, Second Language Learning, Scoring, Evaluators
Khabbazbashi, Nahal – Language Testing, 2017
This study explores the extent to which topic and background knowledge of topic affect spoken performance in a high-stakes speaking test. It is argued that evidence of a substantial influence may introduce construct-irrelevant variance and undermine test fairness. Data were collected from 81 non-native speakers of English who performed on 10…
Descriptors: Speech Tests, High Stakes Tests, English (Second Language), Language Proficiency
Cheng, Junyu; Matthews, Joshua – Language Testing, 2018
This study explores the constructs that underpin three different measures of vocabulary knowledge and investigates the degree to which these three measures correlate with, and are able to predict, measures of second language (L2) listening and reading. Word frequency structured vocabulary tests tapping "receptive/orthographic (RecOrth)…
Descriptors: Listening Comprehension, Reading Comprehension, Reading Tests, Correlation
Hoekje, Barbara – Language Testing, 2016
This commentary argues that the OET research raises inescapable contradictions in trying to separate "language" from "communication" within a weak performance test and advocates for reconceptualizing the legitimate domain of "language" more widely, reclaiming the full potential of the communicative competence…
Descriptors: Language Tests, Languages for Special Purposes, Second Language Learning, Communicative Competence (Languages)
Morita-Mullaney, Trish – Language Testing, 2017
English language proficiency or English language development (ELP/D) standards guide how content-specific instruction and assessment is practiced by teachers and how English learners (ELs) at varying levels of English proficiency can perform grade-level-specific academic standards in K-12 US schools. With the transition from the state-developed…
Descriptors: Language Proficiency, English (Second Language), Second Language Learning, Feminism
Ling, Guangming; Mollaun, Pamela; Xi, Xiaoming – Language Testing, 2014
The scoring of constructed responses may introduce construct-irrelevant factors to a test score and affect its validity and fairness. Fatigue is one of the factors that could negatively affect human performance in general, yet little is known about its effects on a human rater's scoring quality on constructed responses. In this study, we compared…
Descriptors: Evaluators, Fatigue (Biology), Scoring, Performance
Lim, Gad S. – Language Testing, 2011
Raters are central to writing performance assessment, and rater development--training, experience, and expertise--involves a temporal dimension. However, few studies have examined new and experienced raters' rating performance longitudinally over multiple time points. This study uses operational data from the writing section of the MELAB (n =…
Descriptors: Expertise, Writing Evaluation, Performance Based Assessment, Writing Tests
Huang, Shu-Chen – Language Testing, 2011
This study examined two types of classroom assessment events, the more closed convergent assessments (CA) versus the more open-ended divergent assessments (DA), to see if they influence learners differently in terms of motivation and learning strategies. Participants were 105 college freshmen in Taiwan with the same instructor placed under one…
Descriptors: College Freshmen, Speech Communication, Self Efficacy, Performance Based Assessment
Jin, Tan; Mak, Barley; Zhou, Pei – Language Testing, 2012
The fuzziness of assessing second language speaking performance raises two difficulties in scoring speaking performance: "indistinction between adjacent levels" and "overlap between scales". To address these two problems, this article proposes a new approach, "confidence scoring", to deal with such fuzziness, leading to "confidence" scores between…
Descriptors: Speech Communication, Scoring, Test Interpretation, Second Language Learning
