ERIC - Search Results

Publication Date

In 2025	0
Since 2024	2
Since 2021 (last 5 years)	3
Since 2016 (last 10 years)	6
Since 2006 (last 20 years)	13

Descriptor

Foreign Countries	14
Interrater Reliability	14
Language Tests	8
Second Language Learning	7
English (Second Language)	6
Item Response Theory	5
Correlation	4
Models	4
Rating Scales	4
Evaluators	3
Expertise	3
High Stakes Tests	3
Indo European Languages	3
Language Proficiency	3
Scoring	3
Secondary School Students	3
Writing Evaluation	3
Accuracy	2
Achievement Tests	2
Error of Measurement	2
Finno Ugric Languages	2
Language Fluency	2
Language Teachers	2
Language Usage	2
Novices	2

Source

Language Testing

Publication Type

Journal Articles	14
Reports - Research	11
Reports - Evaluative	3
Tests/Questionnaires	1

Education Level

Higher Education	5
Postsecondary Education	3
Secondary Education	3
Elementary Education	2
Adult Education	1
Early Childhood Education	1
Elementary Secondary Education	1
Grade 6	1
Intermediate Grades	1
Kindergarten	1
Primary Education	1

Location

Netherlands	5
Finland	3
Japan	2
Europe	1
Hong Kong	1
Sweden	1
Taiwan	1

Assessments and Surveys

Peabody Picture Vocabulary…

Showing all 14 results

Communal Factors in Rater Severity and Consistency over Time in High-Stakes Oral Assessment

Peer reviewed

Reeta Neittaanmäki; Iasonas Lamprianou – Language Testing, 2024

This article focuses on rater severity and consistency and their relation to major changes in the rating system in a high-stakes testing context. The study is based on longitudinal data collected from 2009 to 2019 from the second language (L2) Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. We investigated…

Descriptors: Foreign Countries, Interrater Reliability, Evaluators, Item Response Theory

All Types of Experience Are Equal, but Some Are More Equal: The Effect of Different Types of Experience on Rater Severity and Rater Consistency

Peer reviewed

Reeta Neittaanmäki; Iasonas Lamprianou – Language Testing, 2024

This article focuses on rater severity and consistency and their relation to different types of rater experience over a long period of time. The article is based on longitudinal data collected from 2009 to 2019 from the second language Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. The study investigated…

Descriptors: Foreign Countries, Interrater Reliability, Error of Measurement, Experience

The Longitudinal Stability of Rating Characteristics in an EFL Examination: Methodological and Substantive Considerations

Peer reviewed

Lamprianou, Iasonas; Tsagari, Dina; Kyriakou, Nansia – Language Testing, 2021

This longitudinal study (2002-2014) investigates the stability of rating characteristics of a large group of raters over time in the context of the writing paper of a national high-stakes examination. The study uses one measure of rater severity and two measures of rater consistency. The results suggest that the rating characteristics of…

Descriptors: Longitudinal Studies, Evaluators, High Stakes Tests, Writing Evaluation

Professional and Non-Professional Raters' Responsiveness to Fluency and Accuracy in L2 Speech: An Experimental Approach

Peer reviewed

Duijm, Klaartje; Schoonen, Rob; Hulstijn, Jan H. – Language Testing, 2018

It is general practice to use rater judgments in speaking proficiency testing. However, it has been shown that raters' knowledge and experience may influence their ratings, both in terms of leniency and varied focus on different aspects of speech. The purpose of this study is to identify raters' relative responsiveness to fluency and linguistic…

Descriptors: Language Fluency, Accuracy, Second Languages, Language Tests

Development and Validation of a Chinese Character Acquisition Assessment for Second-Language Kindergarteners

Peer reviewed

Chan, Stephanie W. Y.; Cheung, Wai Ming; Huang, Yanli; Lam, Wai-Ip; Lin, Chin-Hsi – Language Testing, 2020

Demand for second-language (L2) Chinese education for kindergarteners has grown rapidly, but little is known about these kindergarteners' L2 skills, with existing studies focusing on school-age populations and alphabetic languages. Accordingly, we developed a six-subtest Chinese character acquisition assessment to measure L2 kindergarteners'…

Descriptors: Chinese, Second Language Learning, Second Language Instruction, Written Language

Measuring L2 Speakers' Interactional Ability Using Interactive Speech Tasks

Peer reviewed

van Batenburg, Eline S. L.; Oostdam, Ron J.; van Gelderen, Amos J. S.; de Jong, Nivja H. – Language Testing, 2018

This article explores ways to assess interactional performance, and reports on the use of a test format that standardizes the interlocutor's linguistic and interactional contributions to the exchange. It describes the construction and administration of six scripted speech tasks (instruction, advice, and sales tasks) with pre-vocational learners (n…

Descriptors: Second Language Learning, Speech Tests, Interaction, Test Reliability

Determining the Scoring Validity of a Co-Constructed CEFR-Based Rating Scale

Peer reviewed

Deygers, Bart; Van Gorp, Koen – Language Testing, 2015

Considering scoring validity as encompassing both reliable rating scale use and valid descriptor interpretation, this study reports on the validation of a CEFR-based scale that was co-constructed and used by novice raters. The research questions this paper wishes to answer are (a) whether it is possible to construct a CEFR-based rating scale with…

Descriptors: Rating Scales, Scoring, Validity, Interrater Reliability

An Application of Multifaceted Rasch Measurement in the Yes/No Angoff Standard Setting Procedure

Peer reviewed

Hsieh, Mingchuan – Language Testing, 2013

When implementing standard setting procedures, there are two major concerns: variance between panelists and efficiency in conducting multiple rounds of judgments. With regard to the former, there is concern over the consistency of the cutoff scores made by different panelists. If the cut scores show an inordinately wide range then further rounds…

Descriptors: Item Response Theory, Standard Setting (Scoring), Language Tests, English (Second Language)

SLA Developmental Stages and Teachers' Assessment of Written French: Exploring Direkt Profil as a Diagnostic Assessment Tool

Peer reviewed

Granfeldt, Jonas; Ågren, Malin – Language Testing, 2014

One core area of research in Second Language Acquisition is the identification and definition of developmental stages in different L2s. For L2 French, Bartning and Schlyter (2004) presented a model of six morphosyntactic stages of development in the shape of grammatical profiles. The model formed the basis for the computer program Direkt Profil…

Descriptors: Second Language Learning, Language Tests, French, Language Teachers

Assessing Learners' Writing Skills in a SLA Study: Validating the Rating Process across Tasks, Scales and Languages

Peer reviewed

Huhta, Ari; Alanen, Riikka; Tarnanen, Mirja; Martin, Maisa; Hirvelä, Tuija – Language Testing, 2014

There is still relatively little research on how well the CEFR and similar holistic scales work when they are used to rate L2 texts. Using both multifaceted Rasch analyses and qualitative data from rater comments and interviews, the ratings obtained by using a CEFR-based writing scale and the Finnish National Core Curriculum scale for L2 writing…

Descriptors: Foreign Countries, Writing Skills, Second Language Learning, Finno Ugric Languages

Native Speakers' Perceptions of Fluency and Accent in L2 Speech

Peer reviewed

Pinget, Anne-France; Bosker, Hans Rutger; Quené, Hugo; de Jong, Nivja H. – Language Testing, 2014

Oral fluency and foreign accent distinguish L2 from L1 speech production. In language testing practices, both fluency and accent are usually assessed by raters. This study investigates what exactly native raters of fluency and accent take into account when judging L2. Our aim is to explore the relationship between objectively measured temporal,…

Descriptors: Native Speakers, Language Fluency, Suprasegmentals, Second Language Learning

Rater Bias Patterns in an EFL Writing Assessment

Peer reviewed

Schaefer, Edward – Language Testing, 2008

The present study employed multi-faceted Rasch measurement (MFRM) to explore the rater bias patterns of native English-speaker (NES) raters when they rate EFL essays. Forty NES raters rated 40 essays written by female Japanese university students on a single topic adapted from the TOEFL Test of Written English (TWE). The essays were assessed using…

Descriptors: Writing Evaluation, Writing Tests, Program Effectiveness, Essays

The Assessment of Writing Ability: Expert Readers versus Lay Readers

Peer reviewed

Schoonen, Rob; And Others – Language Testing, 1997

Reports on three studies conducted in the Netherlands about the reading reliability of lay and expert readers in rating content and language usage of students' writing performances in three kinds of writing assignments. Findings reveal that expert readers are more reliable in rating usage, whereas both lay and expert readers are reliable raters of…

Descriptors: Foreign Countries, Interrater Reliability, Language Usage, Models

Validity Evidence in a University Group Oral Test

Peer reviewed

Van Moere, Alistair – Language Testing, 2006

This article investigates a group oral test as administered at a university in Japan to find if it is appropriate to use scores for higher stakes decision making. It is one component of an in-house English proficiency test used for placing students, evaluating their progress, and making informed decisions for the development of the English…

Descriptors: Foreign Countries, Generalizability Theory, Achievement Tests, English (Second Language)

Author

Iasonas Lamprianou	2
Reeta Neittaanmäki	2
Schoonen, Rob	2
de Jong, Nivja H.	2
Alanen, Riikka	1
Bosker, Hans Rutger	1
Chan, Stephanie W. Y.	1
Cheung, Wai Ming	1
Deygers, Bart	1
Duijm, Klaartje	1
Granfeldt, Jonas	1
Hirvelä, Tuija	1
Hsieh, Mingchuan	1
Huang, Yanli	1
Huhta, Ari	1
Hulstijn, Jan H.	1
Kyriakou, Nansia	1
Lam, Wai-Ip	1
Lamprianou, Iasonas	1
Lin, Chin-Hsi	1
Martin, Maisa	1
Oostdam, Ron J.	1
Pinget, Anne-France	1
Quené, Hugo	1