ERIC - Search Results

Publication Date

In 2026	0
Since 2025	0
Since 2022 (last 5 years)	0
Since 2017 (last 10 years)	15
Since 2007 (last 20 years)	47

Descriptor

Statistical Analysis	54
Language Tests	44
Second Language Learning	38
English (Second Language)	32
Foreign Countries	19
Comparative Analysis	18
Language Proficiency	16
Correlation	15
Scores	15
Testing	12
Oral Language	11
Test Items	11
College Students	8
Evaluators	8
Item Response Theory	8
Reading Comprehension	8
Second Language Instruction	8
Test Validity	8
Computer Assisted Testing	7
Native Speakers	7
Scoring	7
Secondary School Students	7
Difficulty Level	6
Elementary School Students	6
Interrater Reliability	6

Source

Language Testing

Publication Type

Journal Articles	54
Reports - Research	39
Reports - Evaluative	11
Tests/Questionnaires	5
Information Analyses	3
Reports - Descriptive	2
Guides - Non-Classroom	1
Opinion Papers	1
Speeches/Meeting Papers	1

Education Level

Higher Education	16
Postsecondary Education	10
Secondary Education	6
Elementary Education	5
Elementary Secondary Education	1
Grade 5	1
High Schools	1
Intermediate Grades	1

Location

Japan	5
Australia	3
China	2
Sweden	2
United Kingdom	2
Canada	1
Chile	1
Colombia	1
Denmark	1
Ecuador	1
Georgia	1
Germany	1
Iowa	1
Israel	1
Malaysia	1
Netherlands	1
Norway	1
Ohio	1
Poland	1
Russia	1
Texas	1

Assessments and Surveys

Test of English as a Foreign…	6
Early Childhood Longitudinal…	1
International English…	1
Michigan Test of English…	1

Showing 1 to 15 of 54 results

Investigating the Construct Measured by Banked Gap-Fill Items: Evidence from Eye-Tracking

Peer reviewed

McCray, Gareth; Brunfaut, Tineke – Language Testing, 2018

This study investigates test-takers' processing while completing banked gap-fill tasks, designed to test reading proficiency, in order to test theoretically based expectations about the variation in cognitive processes of test-takers across levels of performance. Twenty-eight test-takers' eye traces on 24 banked gap-fill items (on six tasks) were…

Descriptors: Language Tests, Test Items, Item Analysis, Eye Movements

National Reading Tests in Denmark, Norway, and Sweden: A Comparison of Construct Definitions, Cognitive Targets, and Response Formats

Peer reviewed

Tengberg, Michael – Language Testing, 2017

Reading comprehension tests are often assumed to measure the same, or at least similar, constructs. Yet, reading is not a single but a multidimensional form of processing, which means that variations in terms of reading material and item design may emphasize one aspect of the construct at the cost of another. The educational systems in Denmark,…

Descriptors: Foreign Countries, National Competency Tests, Reading Tests, Comparative Analysis

A Comparison of Reliability and Precision of Subscore Reporting Methods for a State English Language Proficiency Assessment

Peer reviewed

Longabach, Tanya; Peyton, Vicki – Language Testing, 2018

K-12 English language proficiency tests that assess multiple content domains (e.g., listening, speaking, reading, writing) often have subsections based on these content domains; scores assigned to these subsections are commonly known as subscores. Testing programs face increasing customer demands for the reporting of subscores in addition to the…

Descriptors: Comparative Analysis, Test Reliability, Second Language Learning, Language Proficiency

A Nonparametric Procedure for Exploring Differences in Rating Quality across Test-Taker Subgroups in Rater-Mediated Writing Assessments

Peer reviewed

Wind, Stefanie A. – Language Testing, 2019

Differences in rater judgments that are systematically related to construct-irrelevant characteristics threaten the fairness of rater-mediated writing assessments. Accordingly, it is essential that researchers and practitioners examine the degree to which the psychometric quality of rater judgments is comparable across test-taker subgroups.…

Descriptors: Nonparametric Statistics, Interrater Reliability, Differences, Writing Tests

The Selection of Cognitive Diagnostic Models for a Reading Comprehension Test

Peer reviewed

Li, Hongli; Hunter, C. Vincent; Lei, Pui-Wa – Language Testing, 2016

Cognitive diagnostic models (CDMs) have great promise for providing diagnostic information to aid learning and instruction, and a large number of CDMs have been proposed. However, the assumptions and performances of different CDMs and their applications in regard to reading comprehension tests are not fully understood. In the present study, we…

Descriptors: Reading Comprehension, Reading Tests, Models, Comparative Analysis

A Comparison of Video- and Audio-Mediated Listening Tests with Many-Facet Rasch Modeling and Differential Distractor Functioning

Peer reviewed

Batty, Aaron Olaf – Language Testing, 2015

The rise in the affordability of quality video production equipment has resulted in increased interest in video-mediated tests of foreign language listening comprehension. Although research on such tests has continued fairly steadily since the early 1980s, studies have relied on analyses of raw scores, despite the growing prevalence of item…

Descriptors: Listening Comprehension Tests, Comparative Analysis, Video Technology, Audio Equipment

Using Corpus Linguistics to Examine the Extrapolation Inference in the Validity Argument for a High-Stakes Speaking Assessment

Peer reviewed

LaFlair, Geoffrey T.; Staples, Shelley – Language Testing, 2017

Investigations of the validity of a number of high-stakes language assessments are conducted using an argument-based approach, which requires evidence for inferences that are critical to score interpretation (Chapelle, Enright, & Jamieson, 2008b; Kane, 2013). The current study investigates the extrapolation inference for a high-stakes test of…

Descriptors: Computational Linguistics, Language Tests, Test Validity, Inferences

Determining Cloze Item Difficulty from Item and Passage Characteristics across Different Learner Backgrounds

Peer reviewed

Trace, Jonathan; Brown, James Dean; Janssen, Gerriet; Kozhevnikova, Liudmila – Language Testing, 2017

Cloze tests have been the subject of numerous studies regarding their function and use in both first language and second language contexts (e.g., Jonz & Oller, 1994; Watanabe & Koyama, 2008). From a validity standpoint, one area of investigation has been the extent to which cloze tests measure reading ability beyond the sentence level.…

Descriptors: Cloze Procedure, Language Tests, Test Items, Item Analysis

Measuring L2 Speakers' Interactional Ability Using Interactive Speech Tasks

Peer reviewed

van Batenburg, Eline S. L.; Oostdam, Ron J.; van Gelderen, Amos J. S.; de Jong, Nivja H. – Language Testing, 2018

This article explores ways to assess interactional performance, and reports on the use of a test format that standardizes the interlocutor's linguistic and interactional contributions to the exchange. It describes the construction and administration of six scripted speech tasks (instruction, advice, and sales tasks) with pre-vocational learners (n…

Descriptors: Second Language Learning, Speech Tests, Interaction, Test Reliability

Young Learners' Response Processes When Taking Computerized Tasks for Speaking Assessment

Peer reviewed

Lee, Shinhye; Winke, Paula – Language Testing, 2018

We investigated how young language learners process their responses on and perceive a computer-mediated, timed speaking test. Twenty 8-, 9-, and 10-year-old non-native English-speaking children (NNSs) and eight same-aged, native English-speaking children (NSs) completed seven computerized sample TOEFL® Primary™ speaking test tasks. We investigated…

Descriptors: Elementary School Students, Second Language Learning, Responses, Computer Assisted Testing

A Systematic Review of Methods for Evaluating Rating Quality in Language Assessment

Peer reviewed

Wind, Stefanie A.; Peterson, Meghan E. – Language Testing, 2018

The use of assessments that require rater judgment (i.e., rater-mediated assessments) has become increasingly popular in high-stakes language assessments worldwide. Using a systematic literature review, the purpose of this study is to identify and explore the dominant methods for evaluating rating quality within the context of research on…

Descriptors: Language Tests, Evaluators, Evaluation Methods, Interrater Reliability

Adaptation of a Vocabulary Test from British Sign Language to American Sign Language

Peer reviewed

Mann, Wolfgang; Roy, Penny; Morgan, Gary – Language Testing, 2016

This study describes the adaptation process of a vocabulary knowledge test for British Sign Language (BSL) into American Sign Language (ASL) and presents results from the first round of pilot testing with 20 deaf native ASL signers. The web-based test assesses the strength of deaf children's vocabulary knowledge by means of different mappings of…

Descriptors: Deafness, Language Skills, Vocabulary Development, American Sign Language

Elicited Imitation as a Measure of Second Language Proficiency: A Narrative Review and Meta-Analysis

Peer reviewed

Yan, Xun; Maeda, Yukiko; Lv, Jing; Ginther, April – Language Testing, 2016

Elicited imitation (EI) has been widely used to examine second language (L2) proficiency and development and was an especially popular method in the 1970s and early 1980s. However, as the field embraced more communicative approaches to both instruction and assessment, the use of EI diminished, and the construct-related validity of EI scores as a…

Descriptors: Second Language Learning, Language Proficiency, Meta Analysis, Effect Size

Lexical Difficulty--Using Elicited Imitation to Study Child L2

Peer reviewed

Campfield, Dorota E. – Language Testing, 2017

This paper reports a post-hoc analysis of the influence of lexical difficulty of cue sentences on performance in an elicited imitation (EI) task to assess oral production skills for 645 child L2 English learners in instructional settings. This formed part of a large-scale investigation into effectiveness of foreign language teaching in Polish…

Descriptors: Difficulty Level, Second Language Learning, Second Language Instruction, Elementary School Students

Validity Arguments for Diagnostic Assessment Using Automated Writing Evaluation

Peer reviewed

Chapelle, Carol A.; Cotos, Elena; Lee, Jooyoung – Language Testing, 2015

Two examples demonstrate an argument-based approach to validation of diagnostic assessment using automated writing evaluation (AWE). "Criterion"® was developed by Educational Testing Service to analyze students' papers grammatically, providing sentence-level error feedback. An interpretive argument was developed for its use as part of…

Descriptors: Diagnostic Tests, Writing Evaluation, Automation, Test Validity

Author

Bachman, Lyle F.	2
Bae, Jungok	2
Crossley, Scott A.	2
Kyle, Kristopher	2
McNamara, Danielle S.	2
Wind, Stefanie A.	2
Allalouf, Avi	1
Alvarez, Marta E.	1
Batty, Aaron Olaf	1
Bax, Stephen	1
Boo, Jaeyool	1
Brown, James Dean	1
Brunfaut, Tineke	1
Campfield, Dorota E.	1
Chalhoub-Deville, Micheline	1
Chapelle, Carol A.	1
Chen, Fang	1
Cheng, Junyu	1
Choi, Inn-Chull	1
Cotos, Elena	1
Crossley, Scott	1
Davies, Alan	1
Davis, Larry	1
Eckes, Thomas	1