ERIC - Search Results

Publication Date

In 2025	2
Since 2024	4
Since 2021 (last 5 years)	7
Since 2016 (last 10 years)	13
Since 2006 (last 20 years)	20

Descriptor

Test Reliability	22
Language Tests	21
Scores	20
Second Language Learning	14
English (Second Language)	13
Language Proficiency	10
Test Validity	10
Foreign Countries	8
Comparative Analysis	5
Factor Analysis	5
Rating Scales	5
Test Format	5
Listening Comprehension Tests	4
Reading Tests	4
Test Construction	4
Computer Assisted Testing	3
High Stakes Tests	3
Item Response Theory	3
Oral Language	3
Second Language Instruction	3
Test Length	3
Test Reviews	3
Testing	3
Undergraduate Students	3
Achievement Tests	2

Source

Language Testing

Publication Type

Journal Articles	22
Reports - Evaluative	10
Reports - Research	10
Information Analyses	1
Opinion Papers	1
Reports - Descriptive	1
Tests/Questionnaires	1

Education Level

Higher Education	3
Postsecondary Education	3
Secondary Education	3
Elementary Education	1
Junior High Schools	1
Middle Schools	1

Location

China	2
Australia	1
Germany	1
Hawaii	1
Illinois	1
Iran	1
Kenya	1
Pennsylvania (Philadelphia)	1
Taiwan	1

Assessments and Surveys

Test of English as a Foreign…	4
ACTFL Oral Proficiency…	1
English Proficiency Test	1

Showing 1 to 15 of 22 results

Test Review: Computer-Based English Listening and Speaking Test (CELST) of National Matriculation English Test (NMET) Guangdong Version in China

Peer reviewed

Ying Xu; Xiaodong Li; Jin Chen – Language Testing, 2025

This article provides a detailed review of the Computer-Based English Listening and Speaking Test (CELST) used in Guangdong, China, as part of the National Matriculation English Test (NMET) to assess students' English proficiency. The CELST measures listening and speaking skills as outlined in the "English Curriculum for Senior Middle…

Descriptors: Computer Assisted Testing, English (Second Language), Language Tests, Listening Comprehension Tests

Revisiting Rating Scale Development for Rater-Mediated Language Performance Assessments: Modelling Construct and Contextual Choices Made by Scale Developers

Peer reviewed

Knoch, Ute; Deygers, Bart; Khamboonruang, Apichat – Language Testing, 2021

Rating scale development in the field of language assessment is often considered in dichotomous ways: It is assumed to be guided either by expert intuition or by drawing on performance data. Even though quite a few authors have argued that rating scale development is rarely so easily classifiable, this dyadic view has dominated language testing…

Descriptors: Rating Scales, Test Construction, Language Tests, Test Use

Making Each Point Count: Revising a Local Adaptation of the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE Rubric

Peer reviewed

Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024

In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…

Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)

A New Scoring Method for Item Response Theory Analysis of C-Tests

Peer reviewed

Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025

This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…

Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

Setting Standards for a Diagnostic Test of Aviation English for Student Pilots

Peer reviewed

Maria Treadaway; John Read – Language Testing, 2024

Standard-setting is an essential component of test development, supporting the meaningfulness and appropriate interpretation of test scores. However, in the high-stakes testing environment of aviation, standard-setting studies are underexplored. To address this gap, we document two stages in the standard-setting procedures for the Overseas Flight…

Descriptors: Standard Setting, Diagnostic Tests, High Stakes Tests, English for Special Purposes

The Use of Generalizability Theory in Investigating the Score Dependability of Classroom-Based L2 Reading Assessment

Peer reviewed

Liao, Ray J. T. – Language Testing, 2023

Among the variety of selected response formats used in L2 reading assessment, multiple-choice (MC) is the most commonly adopted, primarily due to its efficiency and objectiveness. Given the impact of assessment results on teaching and learning, it is necessary to investigate the degree to which the MC format reliably measures learners' L2 reading…

Descriptors: Reading Tests, Language Tests, Second Language Learning, Second Language Instruction

Measuring the Development of General Language Skills in English as a Foreign Language--Longitudinal Invariance of the C-Test

Peer reviewed

Schnoor, Birger; Hartig, Johannes; Klinger, Thorsten; Naumann, Alexander; Usanova, Irina – Language Testing, 2023

Research on assessing English as a foreign language (EFL) development has been growing recently. However, empirical evidence from longitudinal analyses based on substantial samples is still needed. In such settings, tests for measuring language development must meet high standards of test quality such as validity, reliability, and objectivity, as…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Longitudinal Studies

Test Review: TestDaF

Peer reviewed

Norris, John; Drackert, Anastasia – Language Testing, 2018

The Test of German as a Foreign Language (TestDaF) plays a critical role as a standardized test of German language proficiency. Developed and administered by the Society for Academic Study Preparation and Test Development (g.a.s.t.), TestDaF was launched in 2001 and has experienced persistent annual growth, with more than 44,000 test takers in…

Descriptors: German, Second Language Learning, Language Tests, Language Proficiency

ACTFL Oral Proficiency Interview -- Computer (OPIc)

Peer reviewed

Isbell, Dan; Winke, Paula – Language Testing, 2019

The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…

Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning

Evaluating Subscore Uses across Multiple Levels: A Case of Reading and Listening Subscores for Young EFL Learners

Peer reviewed

Choi, Ikkyu; Papageorgiou, Spiros – Language Testing, 2020

Stakeholders of language tests are often interested in subscores. However, reporting a subscore is not always justified; a subscore should provide reliable and distinct information to be worth reporting. When a subscore is used for decisions across multiple levels (e.g., individual test takers and schools), it needs to be justified for its…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Scores

China Accreditation Test for Translators and Interpreters (CATTI): Test Review Based on the Language Pairing of English and Chinese

Peer reviewed

Zhao, Hulin; Gu, Xiangdong – Language Testing, 2016

Test Purpose: The CATTI aims to measure competence in translation and interpreting (including simultaneous and consecutive interpreting) between Chinese and seven foreign languages: English, Japanese, French, Arabic, Russian, German, or Spanish. The test is intended to cover a wide range of domains including business, government, academia, and…

Descriptors: Accreditation (Institutions), Foreign Countries, Translation, Chinese

Do the TOEFL iBT® Section Scores Provide Value-Added Information to Stakeholders?

Peer reviewed

Sawaki, Yasuyo; Sinharay, Sandip – Language Testing, 2018

The present study examined the reliability of the reading, listening, speaking, and writing section scores for the TOEFL iBT® test and their interrelationship in order to collect empirical evidence to support, respectively, the "generalization" inference and the "explanation" inference in the TOEFL iBT validity argument…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Computer Assisted Testing

Predicting Grades from an English Language Assessment: The Importance of Peeling the Onion

Peer reviewed

Bridgeman, Brent; Cho, Yeonsuk; DiPietro, Stephen – Language Testing, 2016

Data from 787 international undergraduate students at an urban university in the United States were used to demonstrate the importance of separating a sample into meaningful subgroups in order to demonstrate the ability of an English language assessment to predict the first-year grade point average (GPA). For example, when all students were pooled…

Descriptors: Grade Prediction, English Curriculum, Language Tests, Undergraduate Students

Kane, Validity and Soundness

Peer reviewed

Davies, Alan – Language Testing, 2012

In this article, the author begins by discussing four challenges on the concept of validity. These challenges are: (1) the appeal to logic and syllogistic reasoning; (2) the claim of reliability; (3) the local and the universal; and (4) the unitary and the divisible. In language testing validity cannot be achieved directly but only through a…

Descriptors: Language Tests, Test Validity, Test Reliability, Testing

The Deceptive Mean: Conceptual Scoring of Cloze Entries Differentially Advantages More Able Readers

Peer reviewed

O'Toole, J. M.; King, R. A. R. – Language Testing, 2011

The "cloze" test is one possible investigative instrument for predicting text comprehensibility. Conceptual coding of student replacement of deleted words has been considered to be more valid than exact coding, partly because conceptual coding seemed fairer to poorer readers. This paper reports a quantitative study of 447 Australian…

Descriptors: Cloze Procedure, Test Results, Language Tests, Reading Comprehension

Author

Ann Tai Choe	1
Bridgeman, Brent	1
Cho, Yeonsuk	1
Choi, Ikkyu	1
Daniel Holden	1
Daniel R. Isbell	1
Davidson, Fred	1
Davies, Alan	1
Deygers, Bart	1
DiPietro, Stephen	1
Drackert, Anastasia	1
Esmat Babaii	1
Fairbairn, Shelley	1
Farshad Effatpanah	1
Fox, Janna	1
Gu, Xiangdong	1
Harding, Luke	1
Hartig, Johannes	1
Isbell, Dan	1
Jin Chen	1
John Read	1
Khamboonruang, Apichat	1
King, R. A. R.	1
Klinger, Thorsten	1
Knoch, Ute	1