Showing all 8 results
Peer reviewed
Wind, Stefanie A. – Language Testing, 2023
Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting…
Descriptors: Evaluators, Decision Making, Student Characteristics, Performance Based Assessment
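As a rough illustration of the DRF idea described in Wind (2023) above, the sketch below screens for differential rater functioning with an ordinary regression: after controlling for ability, the rater-by-group interaction terms indicate whether a rater's severity shifts with a construct-irrelevant student characteristic. All data, variable names, and the built-in bias are hypothetical, and this is only one simple screening approach, not the detection methods compared in the study.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Hypothetical synthetic data: 200 students, 4 raters, a binary background group.
n_students, n_raters = 200, 4
students = pd.DataFrame({
    "ability": rng.normal(0, 1, n_students),
    "group": rng.choice(["A", "B"], n_students),
})

rows = []
for r in range(n_raters):
    severity = rng.normal(0, 0.3)      # baseline severity for this rater
    drf = 0.5 if r == 0 else 0.0       # rater 0 is harsher on group B (built-in DRF)
    for _, s in students.iterrows():
        score = (s.ability - severity
                 - drf * (s.group == "B")
                 + rng.normal(0, 0.5))
        rows.append({"rater": f"R{r}", "group": s.group,
                     "ability": s.ability, "score": score})
scores = pd.DataFrame(rows)

# Screen for DRF: once ability is controlled, does rater severity depend on group?
# The C(rater):C(group) interaction coefficients carry that signal.
model = smf.ols("score ~ ability + C(rater) * C(group)", data=scores).fit()
print(model.summary().tables[1])
```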
Peer reviewed
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
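For readers unfamiliar with the generalizability-theory machinery referenced in Lin (2017) above, the sketch below estimates G-study variance components for a simple fully crossed person-by-rater design from synthetic data. It does not implement the rating or subdividing methods for sparse-rated data that the study compares; it only shows the standard variance decomposition and a D-study coefficient those methods build on. All numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical fully crossed p x r design: every rater scores every person once.
n_p, n_r = 100, 5
person = rng.normal(0, 1.0, (n_p, 1))                      # true person variance 1.00
rater = rng.normal(0, 0.4, (1, n_r))                       # true rater variance  0.16
scores = person + rater + rng.normal(0, 0.6, (n_p, n_r))   # residual variance    0.36

# ANOVA-style sums of squares and mean squares.
grand = scores.mean()
ss_p = n_r * ((scores.mean(axis=1) - grand) ** 2).sum()
ss_r = n_p * ((scores.mean(axis=0) - grand) ** 2).sum()
ss_pr = ((scores - grand) ** 2).sum() - ss_p - ss_r

ms_p = ss_p / (n_p - 1)
ms_r = ss_r / (n_r - 1)
ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))

# G-study variance components.
var_pr = ms_pr                        # person-by-rater interaction + error
var_p = (ms_p - var_pr) / n_r         # person (universe-score) variance
var_r = (ms_r - var_pr) / n_p         # rater variance

# D-study: relative generalizability coefficient for a two-rater average.
g_coef = var_p / (var_p + var_pr / 2)
print(f"var_p={var_p:.3f}  var_r={var_r:.3f}  var_pr,e={var_pr:.3f}  "
      f"G(2 raters)={g_coef:.3f}")
```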
Peer reviewed
Trace, Jonathan; Janssen, Gerriet; Meier, Valerie – Language Testing, 2017
Previous research in second language writing has shown that, when scoring performance assessments, even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require…
Descriptors: Performance Based Assessment, Second Language Learning, Scoring, Evaluators
Peer reviewed
Lim, Gad S. – Language Testing, 2011
Raters are central to writing performance assessment, and rater development--training, experience, and expertise--involves a temporal dimension. However, few studies have examined new and experienced raters' rating performance longitudinally over multiple time points. This study uses operational data from the writing section of the MELAB (n =…
Descriptors: Expertise, Writing Evaluation, Performance Based Assessment, Writing Tests
Peer reviewed
Barkaoui, Khaled – Language Testing, 2010
This study adopted a multilevel modeling (MLM) approach to examine the contribution of rater and essay factors to variability in ESL essay holistic scores. Previous research aiming to explain variability in essay holistic scores has focused on either rater or essay factors. The few studies that have examined the contribution of more than one…
Descriptors: Performance Based Assessment, English (Second Language), Second Language Learning, Holistic Approach
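The multilevel modeling approach mentioned in Barkaoui (2010) above treats essays as nested within raters, so rater-level and essay-level sources of score variability can be separated. The sketch below fits a minimal two-level model to hypothetical data with a random intercept for rater and one essay-level predictor; the variable names, the predictor, and the data are assumptions for illustration, not the study's actual model or factors.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Hypothetical data: 30 raters each score 40 ESL essays on a holistic scale.
n_raters, n_per_rater = 30, 40
rows = []
for r in range(n_raters):
    rater_sev = rng.normal(0, 0.5)           # between-rater severity differences
    for _ in range(n_per_rater):
        length = rng.normal(300, 80)          # essay-level predictor (word count)
        score = 3 + 0.004 * (length - 300) - rater_sev + rng.normal(0, 0.7)
        rows.append({"rater": f"R{r}", "length": length, "score": score})
df = pd.DataFrame(rows)

# Two-level model: essays (level 1) nested within raters (level 2).
# The group (rater) variance component captures between-rater variability
# in severity after accounting for the essay-level predictor.
mlm = smf.mixedlm("score ~ length", data=df, groups=df["rater"]).fit()
print(mlm.summary())
```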
Peer reviewed
Johnson, Jeff S.; Lim, Gad S. – Language Testing, 2009
Language performance assessments typically require human raters, introducing possible error. In international examinations of English proficiency, rater language background is an especially salient factor that needs to be considered. The existence of rater language background-related bias in writing performance assessment is the object of this…
Descriptors: Performance Based Assessment, Performance Tests, Native Speakers, English (Second Language)
Peer reviewed
Eckes, Thomas – Language Testing, 2008
Research on rater effects in language performance assessments has provided ample evidence for a considerable degree of variability among raters. Building on this research, I advance the hypothesis that experienced raters fall into types or classes that are clearly distinguishable from one another with respect to the importance they attach to…
Descriptors: Performance Based Assessment, Language Tests, Measures (Individuals), Scoring
Peer reviewed
Xi, Xiaoming – Language Testing, 2007
This study explores the utility of analytic scoring for TAST in providing useful and reliable diagnostic information for operational use in three aspects of candidates' performance: delivery, language use and topic development. One hundred and forty examinees' responses to six TAST tasks were scored analytically on these three aspects of speech. G…
Descriptors: Scoring, Profiles, Performance Based Assessment, Academic Discourse