Publication Date
In 2025 | 2 |
Since 2024 | 4 |
Since 2021 (last 5 years) | 7 |
Since 2016 (last 10 years) | 13 |
Since 2006 (last 20 years) | 20 |
Descriptor
Source
Language Testing | 22 |
Author
Ann Tai Choe | 1 |
Bridgeman, Brent | 1 |
Cho, Yeonsuk | 1 |
Choi, Ikkyu | 1 |
Daniel Holden | 1 |
Daniel R. Isbell | 1 |
Davidson, Fred | 1 |
Davies, Alan | 1 |
Deygers, Bart | 1 |
DiPietro, Stephen | 1 |
Drackert, Anastasia | 1 |
More ▼ |
Publication Type
Journal Articles | 22 |
Reports - Evaluative | 10 |
Reports - Research | 10 |
Information Analyses | 1 |
Opinion Papers | 1 |
Reports - Descriptive | 1 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 3 |
Postsecondary Education | 3 |
Secondary Education | 3 |
Elementary Education | 1 |
Junior High Schools | 1 |
Middle Schools | 1 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
Test of English as a Foreign… | 4 |
ACTFL Oral Proficiency… | 1 |
English Proficiency Test | 1 |
What Works Clearinghouse Rating
Ying Xu; Xiaodong Li; Jin Chen – Language Testing, 2025
This article provides a detailed review of the Computer-based English Listening Speaking Test (CELST) used in Guangdong, China, as part of the National Matriculation English Test (NMET) to assess students' English proficiency. The CELST measures listening and speaking skills as outlined in the "English Curriculum for Senior Middle…
Descriptors: Computer Assisted Testing, English (Second Language), Language Tests, Listening Comprehension Tests
Knoch, Ute; Deygers, Bart; Khamboonruang, Apichat – Language Testing, 2021
Rating scale development in the field of language assessment is often considered in dichotomous ways: It is assumed to be guided either by expert intuition or by drawing on performance data. Even though quite a few authors have argued that rating scale development is rarely so easily classifiable, this dyadic view has dominated language testing…
Descriptors: Rating Scales, Test Construction, Language Tests, Test Use
Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024
In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…
Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)
Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025
This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…
Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction
Maria Treadaway; John Read – Language Testing, 2024
Standard-setting is an essential component of test development, supporting the meaningfulness and appropriate interpretation of test scores. However, in the high-stakes testing environment of aviation, standard-setting studies are underexplored. To address this gap, we document two stages in the standard-setting procedures for the Overseas Flight…
Descriptors: Standard Setting, Diagnostic Tests, High Stakes Tests, English for Special Purposes
Liao, Ray J. T. – Language Testing, 2023
Among the variety of selected response formats used in L2 reading assessment, multiple-choice (MC) is the most commonly adopted, primarily due to its efficiency and objectiveness. Given the impact of assessment results on teaching and learning, it is necessary to investigate the degree to which the MC format reliably measures learners' L2 reading…
Descriptors: Reading Tests, Language Tests, Second Language Learning, Second Language Instruction
Schnoor, Birger; Hartig, Johannes; Klinger, Thorsten; Naumann, Alexander; Usanova, Irina – Language Testing, 2023
Research on assessing English as a foreign language (EFL) development has been growing recently. However, empirical evidence from longitudinal analyses based on substantial samples is still needed. In such settings, tests for measuring language development must meet high standards of test quality such as validity, reliability, and objectivity, as…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Longitudinal Studies
Norris, John; Drackert, Anastasia – Language Testing, 2018
The Test of German as a Foreign Language (TestDaF) plays a critical role as a standardized test of German language proficiency. Developed and administered by the Society for Academic Study Preparation and Test Development (g.a.s.t.), TestDaF was launched in 2001 and has experienced persistent annual growth, with more than 44,000 test takers in…
Descriptors: German, Second Language Learning, Language Tests, Language Proficiency
Isbell, Dan; Winke, Paula – Language Testing, 2019
The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…
Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning
Choi, Ikkyu; Papageorgiou, Spiros – Language Testing, 2020
Stakeholders of language tests are often interested in subscores. However, reporting a subscore is not always justified; a subscore should provide reliable and distinct information to be worth reporting. When a subscore is used for decisions across multiple levels (e.g., individual test takers and schools), it needs to be justified for its…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Scores
Zhao, Hulin; Gu, Xiangdong – Language Testing, 2016
Test Purpose: The CATTI aims to measure competence in translation and interpreting (including simultaneous and consecutive interpreting) between Chinese and seven foreign languages: English, Japanese, French, Arabic, Russian, German, or Spanish. The test is intended to cover a wide range of domains including business, government, academia, and…
Descriptors: Accreditation (Institutions), Foreign Countries, Translation, Chinese
Sawaki, Yasuyo; Sinharay, Sandip – Language Testing, 2018
The present study examined the reliability of the reading, listening, speaking, and writing section scores for the TOEFL iBT® test and their interrelationship in order to collect empirical evidence to support, respectively, the "generalization" inference and the "explanation" inference in the TOEFL iBT validity argument…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Computer Assisted Testing
Bridgeman, Brent; Cho, Yeonsuk; DiPietro, Stephen – Language Testing, 2016
Data from 787 international undergraduate students at an urban university in the United States were used to demonstrate the importance of separating a sample into meaningful subgroups in order to demonstrate the ability of an English language assessment to predict the first-year grade point average (GPA). For example, when all students were pooled…
Descriptors: Grade Prediction, English Curriculum, Language Tests, Undergraduate Students
Davies, Alan – Language Testing, 2012
In this article, the author begins by discussing four challenges on the concept of validity. These challenges are: (1) the appeal to logic and syllogistic reasoning; (2) the claim of reliability; (3) the local and the universal; and (4) the unitary and the divisible. In language testing validity cannot be achieved directly but only through a…
Descriptors: Language Tests, Test Validity, Test Reliability, Testing
O'Toole, J. M.; King, R. A. R. – Language Testing, 2011
The "cloze" test is one possible investigative instrument for predicting text comprehensibility. Conceptual coding of student replacement of deleted words has been considered to be more valid than exact coding, partly because conceptual coding seemed fairer to poorer readers. This paper reports a quantitative study of 447 Australian…
Descriptors: Cloze Procedure, Test Results, Language Tests, Reading Comprehension
Previous Page | Next Page »
Pages: 1 | 2