ERIC - Search Results

Showing 1 to 15 of 41 results

Communal Factors in Rater Severity and Consistency over Time in High-Stakes Oral Assessment

Peer reviewed

Neittaanmäki, Reeta; Lamprianou, Iasonas – Language Testing, 2024

This article focuses on rater severity and consistency and their relation to major changes in the rating system in a high-stakes testing context. The study is based on longitudinal data collected from 2009 to 2019 from the second language (L2) Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. We investigated…

Descriptors: Foreign Countries, Interrater Reliability, Evaluators, Item Response Theory

Investigating the Effect of Classroom-Based Feedback on Speaking Assessment: A Multifaceted Rasch Analysis

Peer reviewed

Bijani, Houman; Hashempour, Bahareh; Ibrahim, Khaled Ahmed Abdel-Al; Orabah, Salim Said Bani; Heydarnejad, Tahereh – Language Testing in Asia, 2022

Due to subjectivity in oral assessment, much concentration has been put on obtaining a satisfactory measure of consistency among raters. However, the process for obtaining more consistency might not result in valid decisions. One matter that is at the core of both reliability and validity in oral assessment is rater training. Recently,…

Descriptors: Oral Language, Language Tests, Feedback (Response), Bias

Measurement Properties of a Standardized Elicited Imitation Test: An Integrative Data Analysis

Peer reviewed

Isbell, Daniel R.; Son, Young-A – Studies in Second Language Acquisition, 2022

Elicited Imitation Tests (EITs) are commonly used in second language acquisition (SLA)/bilingualism research contexts to assess the general oral proficiency of study participants. While previous studies have provided valuable EIT construct-related validity evidence, some key gaps remain. This study uses an integrative data analysis to further…

Descriptors: Bilingualism, Imitation, Language Tests, Second Language Learning

Fairness in Oral Language Assessment: Training Raters and Considering Examinees' Expectations

Peer reviewed
PDF on ERIC

Doosti, Mehdi; Ahmadi Safa, Mohammad – International Journal of Language Testing, 2021

This study examined the effect of rater training on promoting inter-rater reliability in oral language assessment. It also investigated whether rater training and the consideration of the examinees' expectations by the examiners have any effect on test-takers' perceptions of being fairly evaluated. To this end, four raters scored 31 Iranian…

Descriptors: Oral Language, Language Tests, Interrater Reliability, Training

A Systematic Review of Methods for Evaluating Rating Quality in Language Assessment

Peer reviewed

Wind, Stefanie A.; Peterson, Meghan E. – Language Testing, 2018

The use of assessments that require rater judgment (i.e., rater-mediated assessments) has become increasingly popular in high-stakes language assessments worldwide. Using a systematic literature review, the purpose of this study is to identify and explore the dominant methods for evaluating rating quality within the context of research on…

Descriptors: Language Tests, Evaluators, Evaluation Methods, Interrater Reliability

The Processes of Rating L2 Speaking Performance Using an Analytic Rating Scale -- A Qualitative Exploration

Peer reviewed
PDF on ERIC

Thai, Thuy; Sheehan, Susan – Language Education & Assessment, 2022

In language performance tests, raters are important as their scoring decisions determine which aspects of performance the scores represent; however, raters are considered as one of the potential sources contributing to unwanted variability in scores (Davis, 2012). Although a great number of studies have been conducted to unpack how rater…

Descriptors: Rating Scales, Speech Communication, Second Language Learning, Second Language Instruction

Identifying Language Disorder in Bilingual Children Using Automatic Speech Recognition

Peer reviewed

Albudoor, Nahar; Peña, Elizabeth D. – Journal of Speech, Language, and Hearing Research, 2022

Purpose: The differential diagnosis of developmental language disorder (DLD) in bilingual children represents a unique challenge due to their distributed language exposure and knowledge. The current evidence indicates that dual-language testing yields the most accurate classification of DLD among bilinguals, but there are limited personnel and…

Descriptors: Language Impairments, Bilingualism, Clinical Diagnosis, Language Tests

Rater Dominance in Discussion as a Resolution Method

Peer reviewed
PDF on ERIC

Ahmadi, Alireza – Taiwan Journal of TESOL, 2020

Rater subjectivity has long been an intriguing topic. The use of discussion as a resolution method is a practical way to reduce this subjectivity. However, the efficacy of discussion depends on whether different raters get equally engaged in it or one rater tends to dominate others. This study investigated whether and how rater dominance occurs in…

Descriptors: Evaluators, Interrater Reliability, Discussion, Discourse Analysis

Professional and Non-Professional Raters' Responsiveness to Fluency and Accuracy in L2 Speech: An Experimental Approach

Peer reviewed

Duijm, Klaartje; Schoonen, Rob; Hulstijn, Jan H. – Language Testing, 2018

It is general practice to use rater judgments in speaking proficiency testing. However, it has been shown that raters' knowledge and experience may influence their ratings, both in terms of leniency and varied focus on different aspects of speech. The purpose of this study is to identify raters' relative responsiveness to fluency and linguistic…

Descriptors: Language Fluency, Accuracy, Second Languages, Language Tests

Automated Essay Scoring at Scale: A Case Study in Switzerland and Germany. TOEFL® Research Report. RR-86. ETS RR-19-12

Peer reviewed
PDF on ERIC

Rupp, André A.; Casabianca, Jodi M.; Krüger, Maleika; Keller, Stefan; Köller, Olaf – ETS Research Report Series, 2019

In this research report, we describe the design and empirical findings for a large-scale study of essay writing ability with approximately 2,500 high school students in Germany and Switzerland on the basis of 2 tasks with 2 associated prompts, each from a standardized writing assessment whose scoring involved both human and automated components.…

Descriptors: Automation, Foreign Countries, English (Second Language), Language Tests

For a Greater Good: Bias Analysis in Writing Assessment

Peer reviewed

Ahmadi Shirazi, Masoumeh – SAGE Open, 2019

Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…

Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests

Comparison of Automatic and Expert Teachers' Rating of Computerized English Listening-Speaking Test

Peer reviewed
PDF on ERIC

Linlin, Cao – English Language Teaching, 2020

Through Many-Facet Rasch analysis, this study explores the rating differences between 1 computer automatic rater and 5 expert teacher raters on scoring 119 students in a computerized English listening-speaking test. Results indicate that both automatic and the teacher raters demonstrate good inter-rater reliability, though the automatic rater…

Descriptors: Language Tests, Computer Assisted Testing, English (Second Language), Second Language Learning

Working with Sparse Data in Rated Language Tests: Generalizability Theory Applications

Peer reviewed

Lin, Chih-Kai – Language Testing, 2017

Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…

Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy

The Effect of Training and Rater Differences on Oral Proficiency Assessment

Peer reviewed

Kang, Okim; Rubin, Don; Kermad, Alyssa – Language Testing, 2019

As a result of the fact that judgments of non-native speech are closely tied to social biases, oral proficiency ratings are susceptible to error because of rater background and social attitudes. In the present study we seek first to estimate the variance attributable to rater background and attitudinal variables on novice raters' assessments of L2…

Descriptors: Evaluators, Second Language Learning, Language Tests, English (Second Language)

Assessing Individual and Group Oral Exams: Scoring Criteria and Rater Interaction

Peer reviewed
PDF on ERIC

Yalçin-Çolakoglu, Özlem; Selçuk, Merve – Advances in Language and Literary Studies, 2019

Criterion referenced tests of second language speaking performance are administered in different institutions using different procedures. The present study reports raters' practices of second language speaking tests, in particular the correspondence between test-takers' grades when assessed individually and in groups. Data derived from…

Descriptors: Oral Language, Language Tests, Test Validity, Inferences

