ERIC - Search Results

Showing all 12 results

Examining the Effect of Item Difficulty and Rater Leniency on Iranian Test Takers' Performance on WDCT and DSAT: A Comparative Study

Peer reviewed

Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025

The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…

Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction

Detecting Rater Centrality Effects in Performance Assessments: A Model-Based Comparison of Centrality Indices

Peer reviewed

Jin, Kuan-Yu; Eckes, Thomas – Measurement: Interdisciplinary Research and Perspectives, 2022

Recent research on rater effects in performance assessments has increasingly focused on rater centrality, the tendency to assign scores clustering around the rating scale's middle categories. In the present paper, we adopted Jin and Wang's (2018) extended facets modeling approach and constructed a centrality continuum, ranging from raters…

Descriptors: Performance Based Assessment, Evaluators, Scoring, Sample Size

Operationalizing the Reading-into-Writing Construct in Analytic Rating Scales: Effects of Different Approaches on Rating

Peer reviewed

Lestari, Santi B.; Brunfaut, Tineke – Language Testing, 2023

Assessing integrated reading-into-writing task performances is known to be challenging, and analytic rating scales have been found to better facilitate the scoring of these performances than other common types of rating scales. However, little is known about how specific operationalizations of the reading-into-writing construct in analytic rating…

Descriptors: Reading Writing Relationship, Writing Tests, Rating Scales, Writing Processes

Measurement Properties of a Standardized Elicited Imitation Test: An Integrative Data Analysis

Peer reviewed

Isbell, Daniel R.; Son, Young-A – Studies in Second Language Acquisition, 2022

Elicited Imitation Tests (EITs) are commonly used in second language acquisition (SLA)/bilingualism research contexts to assess the general oral proficiency of study participants. While previous studies have provided valuable EIT construct-related validity evidence, some key gaps remain. This study uses an integrative data analysis to further…

Descriptors: Bilingualism, Imitation, Language Tests, Second Language Learning

Linking the International English Language Competency Assessment Suite of Examinations to the Common European Framework of Reference

Peer reviewed

Hidri, Sahbi – Language Testing in Asia, 2021

The study investigated the alignment process of the International English Language Competency Assessment (IELCA) suite examinations' four levels, B1, B2, C1 and C2, onto the Common European Framework of Reference (CEFR) by explaining and discussing the five linking stages (Council of Europe [CoE], 2009). Unlike previous studies, this study used the…

Descriptors: Literacy, Second Language Learning, Second Language Instruction, English (Second Language)

The Effects of Task Complexity on Comprehensibility in Second Language Speech

Peer reviewed

Choi, Jin Soo – Applied Language Learning, 2021

This study examined the impact of manipulated task complexity (Robinson 2001a, 2001b, 2007, 2011; Robinson & Gilabert, 2007) on second language (L2) speech comprehensibility. I examined whether manipulated task complexity (a) impacts L2 speech comprehensibility, (b) aligns with L2 speakers' perception of task difficulty (cognitive…

Descriptors: Task Analysis, Second Language Learning, Second Language Instruction, Pronunciation

The Effect of Task Complexity on Rater Severity in an Adaptive Performance-Based Second Language Oral Communication Test

Won, Yongkook – ProQuest LLC, 2019

Despite the benefits of performance-based oral communication tests, a plethora of variables, as illustrated in Ockey and Li's (2015) model of oral communication assessment, can create construct-irrelevant variance in test scores. In relation to human participants in the oral communication tests, previous studies mostly focused on the direct effect…

Descriptors: Oral Language, Language Tests, English (Second Language), Second Language Learning

Effects of Strength of Accent on an L2 Interactive Lecture Listening Comprehension Test

Peer reviewed

Ockey, Gary J.; Papageorgiou, Spiros; French, Robert – International Journal of Listening, 2016

This article reports on a study which aimed to determine the effect of strength of accent on listening comprehension of interactive lectures. Test takers (N = 21,726) listened to an interactive lecture given by one of nine speakers and responded to six comprehension items. The test taker responses were analyzed with the Rasch computer program…

Descriptors: Pronunciation, Listening Comprehension, Lecture Method, Computer Software

Investigating Prompt Difficulty in an Automatically Scored Speaking Performance Assessment

Cox, Troy L. – ProQuest LLC, 2013

Speaking assessments for second language learners have traditionally been expensive to administer because of the cost of rating the speech samples. To reduce the cost, many researchers are investigating the potential of using automatic speech recognition (ASR) as a means to score examinee responses to open-ended prompts. This study examined the…

Descriptors: Cues, Second Language Learning, English Language Learners, Language Tests

The Role of Lexical Properties and Cohesive Devices in Text Integration and Their Effect on Human Ratings of Speaking Proficiency

Peer reviewed

Crossley, Scott; Clevinger, Amanda; Kim, YouJin – Language Assessment Quarterly, 2014

There has been a growing interest in the use of integrated tasks in the field of second language testing to enhance the authenticity of language tests. However, the role of text integration in test takers' performance has not been widely investigated. The purpose of the current study is to examine the effects of text-based relational (i.e.,…

Descriptors: Language Proficiency, Connected Discourse, Language Tests, English (Second Language)

High Stakes Tests with Self-Selected Essay Questions: Addressing Issues of Fairness

Peer reviewed

Lamprianou, Iasonas – International Journal of Testing, 2008

This study investigates the effect of reporting the unadjusted raw scores in a high-stakes language exam when raters differ significantly in severity and self-selected questions differ significantly in difficulty. More sophisticated models, introducing meaningful facets and parameters, are successively used to investigate the characteristics of…

Descriptors: High Stakes Tests, Raw Scores, Item Response Theory, Language Tests

The Relationship between Modified Angoff Knowledge Estimation Judgments and Item Difficulty Values for Seven NTE Specialty Area Tests.

Wheeler, Patricia – 1991

The appropriateness of the Angoff method (W. H. Angoff, 1971) for setting standards on tests was studied. Evaluators (judges) from California school districts and teacher training institutions reviewed 15 NTE (National Teacher Examinations) Program Specialty Area Tests published by the Educational Testing Service for their appropriateness in…

Descriptors: Art Education, Biology, Difficulty Level, Elementary Secondary Education
