ERIC - Search Results


Showing all 8 results

A Sequential Approach to Detecting Differential Rater Functioning in Sparse Rater-Mediated Assessment Networks

Peer reviewed

Wind, Stefanie A. – Language Testing, 2023

Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting…

Descriptors: Evaluators, Decision Making, Student Characteristics, Performance Based Assessment

Working with Sparse Data in Rated Language Tests: Generalizability Theory Applications

Peer reviewed

Lin, Chih-Kai – Language Testing, 2017

Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…

Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy

Measuring the Impact of Rater Negotiation in Writing Performance Assessment

Peer reviewed

Trace, Jonathan; Janssen, Gerriet; Meier, Valerie – Language Testing, 2017

Previous research in second language writing has shown that when scoring performance assessments even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require…

Descriptors: Performance Based Assessment, Second Language Learning, Scoring, Evaluators

The Development and Maintenance of Rating Quality in Performance Writing Assessment: A Longitudinal Study of New and Experienced Raters

Peer reviewed

Lim, Gad S. – Language Testing, 2011

Raters are central to writing performance assessment, and rater development--training, experience, and expertise--involves a temporal dimension. However, few studies have examined new and experienced raters' rating performance longitudinally over multiple time points. This study uses operational data from the writing section of the MELAB (n =…

Descriptors: Expertise, Writing Evaluation, Performance Based Assessment, Writing Tests

Explaining ESL Essay Holistic Scores: A Multilevel Modeling Approach

Peer reviewed

Barkaoui, Khaled – Language Testing, 2010

This study adopted a multilevel modeling (MLM) approach to examine the contribution of rater and essay factors to variability in ESL essay holistic scores. Previous research aiming to explain variability in essay holistic scores has focused on either rater or essay factors. The few studies that have examined the contribution of more than one…

Descriptors: Performance Based Assessment, English (Second Language), Second Language Learning, Holistic Approach

The Influence of Rater Language Background on Writing Performance Assessment

Peer reviewed

Johnson, Jeff S.; Lim, Gad S. – Language Testing, 2009

Language performance assessments typically require human raters, introducing possible error. In international examinations of English proficiency, rater language background is an especially salient factor that needs to be considered. The existence of rater language background-related bias in writing performance assessment is the object of this…

Descriptors: Performance Based Assessment, Performance Tests, Native Speakers, English (Second Language)

Rater Types in Writing Performance Assessments: A Classification Approach to Rater Variability

Peer reviewed

Eckes, Thomas – Language Testing, 2008

Research on rater effects in language performance assessments has provided ample evidence for a considerable degree of variability among raters. Building on this research, I advance the hypothesis that experienced raters fall into types or classes that are clearly distinguishable from one another with respect to the importance they attach to…

Descriptors: Performance Based Assessment, Language Tests, Measures (Individuals), Scoring

Evaluating Analytic Scoring for the TOEFL® Academic Speaking Test (TAST) for Operational Use

Peer reviewed

Xi, Xiaoming – Language Testing, 2007

This study explores the utility of analytic scoring for TAST in providing useful and reliable diagnostic information for operational use in three aspects of candidates' performance: delivery, language use and topic development. One hundred and forty examinees' responses to six TAST tasks were scored analytically on these three aspects of speech. G…

Descriptors: Scoring, Profiles, Performance Based Assessment, Academic Discourse