ERIC - Search Results

Showing all 5 results

Working with Sparse Data in Rated Language Tests: Generalizability Theory Applications

Peer reviewed

Lin, Chih-Kai – Language Testing, 2017

Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…

Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy

Measuring the Impact of Rater Negotiation in Writing Performance Assessment

Peer reviewed

Trace, Jonathan; Janssen, Gerriet; Meier, Valerie – Language Testing, 2017

Previous research in second language writing has shown that when scoring performance assessments even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require…

Descriptors: Performance Based Assessment, Second Language Learning, Scoring, Evaluators

Explaining ESL Essay Holistic Scores: A Multilevel Modeling Approach

Peer reviewed

Barkaoui, Khaled – Language Testing, 2010

This study adopted a multilevel modeling (MLM) approach to examine the contribution of rater and essay factors to variability in ESL essay holistic scores. Previous research aiming to explain variability in essay holistic scores has focused on either rater or essay factors. The few studies that have examined the contribution of more than one…

Descriptors: Performance Based Assessment, English (Second Language), Second Language Learning, Holistic Approach

Rater Types in Writing Performance Assessments: A Classification Approach to Rater Variability

Peer reviewed

Eckes, Thomas – Language Testing, 2008

Research on rater effects in language performance assessments has provided ample evidence for a considerable degree of variability among raters. Building on this research, I advance the hypothesis that experienced raters fall into types or classes that are clearly distinguishable from one another with respect to the importance they attach to…

Descriptors: Performance Based Assessment, Language Tests, Measures (Individuals), Scoring

Evaluating Analytic Scoring for the TOEFL® Academic Speaking Test (TAST) for Operational Use

Peer reviewed

Xi, Xiaoming – Language Testing, 2007

This study explores the utility of analytic scoring for TAST in providing useful and reliable diagnostic information for operational use in three aspects of candidates' performance: delivery, language use and topic development. One hundred and forty examinees' responses to six TAST tasks were scored analytically on these three aspects of speech. G…

Descriptors: Scoring, Profiles, Performance Based Assessment, Academic Discourse