ERIC - Search Results

Showing 1 to 15 of 71 results

A Shortened Test Is Feasible: Evaluating a Large-Scale Multistage Adaptive English Language Assessment

Peer reviewed

Shangchao Min; Kyoungwon Bishop – Language Testing, 2024

This paper evaluates the multistage adaptive test (MST) design of a large-scale academic language assessment (ACCESS) for Grades 1-12, with an aim to simplify the current MST design, using both operational and simulated test data. Study 1 explored the operational population data (1,456,287 test-takers) of the listening and reading tests of MST…

Descriptors: Adaptive Testing, Test Construction, Language Tests, English Language Learners

Test Review: Computer-Based English Listening and Speaking Test (CELST) of National Matriculation English Test (NMET) Guangdong Version in China

Peer reviewed

Ying Xu; Xiaodong Li; Jin Chen – Language Testing, 2025

This article provides a detailed review of the Computer-based English Listening Speaking Test (CELST) used in Guangdong, China, as part of the National Matriculation English Test (NMET) to assess students' English proficiency. The CELST measures listening and speaking skills as outlined in the "English Curriculum for Senior Middle…

Descriptors: Computer Assisted Testing, English (Second Language), Language Tests, Listening Comprehension Tests

Developing Internet-Based "Tests of Aptitude for Language Learning (TALL)": An Open Research Endeavour

Peer reviewed

Junlan Pan; Emma Marsden – Language Testing, 2024

"Tests of Aptitude for Language Learning" (TALL) is an openly accessible internet-based battery to measure the multifaceted construct of foreign language aptitude, using language domain-specific instruments and L1-sensitive instructions and stimuli. This brief report introduces the components of this theory-informed battery and…

Descriptors: Language Tests, Aptitude Tests, Second Language Learning, Test Construction

Revisiting Rating Scale Development for Rater-Mediated Language Performance Assessments: Modelling Construct and Contextual Choices Made by Scale Developers

Peer reviewed

Knoch, Ute; Deygers, Bart; Khamboonruang, Apichat – Language Testing, 2021

Rating scale development in the field of language assessment is often considered in dichotomous ways: It is assumed to be guided either by expert intuition or by drawing on performance data. Even though quite a few authors have argued that rating scale development is rarely so easily classifiable, this dyadic view has dominated language testing…

Descriptors: Rating Scales, Test Construction, Language Tests, Test Use

Making Each Point Count: Revising a Local Adaptation of the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE Rubric

Peer reviewed

Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024

In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…

Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)

A New Scoring Method for Item Response Theory Analysis of C-Tests

Peer reviewed

Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025

This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…

Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

The Typology of Second Language Listening Constructs: A Systematic Review

Peer reviewed

Aryadoust, Vahid; Luo, Lan – Language Testing, 2023

This study reviewed conceptualizations and operationalizations of second language (L2) listening constructs. A total of 157 peer-reviewed papers published in 19 journals in applied linguistics were coded for (1) publication year, author, source title, location, language, and reliability and (2) listening subskills, cognitive processes, attributes,…

Descriptors: Test Format, Listening Comprehension Tests, Second Language Learning, Second Language Instruction

Operationalizing the Reading-into-Writing Construct in Analytic Rating Scales: Effects of Different Approaches on Rating

Peer reviewed

Lestari, Santi B.; Brunfaut, Tineke – Language Testing, 2023

Assessing integrated reading-into-writing task performances is known to be challenging, and analytic rating scales have been found to better facilitate the scoring of these performances than other common types of rating scales. However, little is known about how specific operationalizations of the reading-into-writing construct in analytic rating…

Descriptors: Reading Writing Relationship, Writing Tests, Rating Scales, Writing Processes

Developing a Local Academic English Listening Test Using Authentic Unscripted Audio-Visual Texts

Peer reviewed

Park, Yena; Lee, Senyung; Shin, Sun-Young – Language Testing, 2022

Despite consistent calls for authentic stimuli in listening tests for better construct representation, unscripted texts have been rarely adopted in high-stakes listening tests due to perceived inefficiency. This study details how a local academic listening test was developed using authentic unscripted audio-visual texts from the local target…

Descriptors: Listening Comprehension Tests, English for Academic Purposes, Test Construction, Foreign Students

Korean Syntactic Complexity Analyzer (KOSCA): An NLP Application for the Analysis of Syntactic Complexity in Second Language Production

Peer reviewed

Haerim Hwang; Hyunwoo Kim – Language Testing, 2024

Given the lack of computational tools available for assessing second language (L2) production in Korean, this study introduces a novel automated tool called the Korean Syntactic Complexity Analyzer (KOSCA) for measuring syntactic complexity in L2 Korean production. As an open-source graphic user interface (GUI) developed in Python, KOSCA provides…

Descriptors: Korean, Natural Language Processing, Syntax, Computer Graphics

Automated Scoring of Junior and Senior High Essays Using Coh-Metrix Features: Implications for Large-Scale Language Testing

Peer reviewed

Latifi, Syed; Gierl, Mark – Language Testing, 2021

An automated essay scoring (AES) program is a software system that uses techniques from corpus and computational linguistics and machine learning to grade essays. In this study, we aimed to describe and evaluate particular language features of Coh-Metrix for a novel AES program that would score junior and senior high school students' essays from…

Descriptors: Writing Evaluation, Computer Assisted Testing, Scoring, Essays

Setting Standards for a Diagnostic Test of Aviation English for Student Pilots

Peer reviewed

Maria Treadaway; John Read – Language Testing, 2024

Standard-setting is an essential component of test development, supporting the meaningfulness and appropriate interpretation of test scores. However, in the high-stakes testing environment of aviation, standard-setting studies are underexplored. To address this gap, we document two stages in the standard-setting procedures for the Overseas Flight…

Descriptors: Standard Setting, Diagnostic Tests, High Stakes Tests, English for Special Purposes

A Meta-Analysis of Self-Assessment and Language Performance in Language Testing and Assessment

Peer reviewed

Li, Minzi; Zhang, Xian – Language Testing, 2021

This meta-analysis explores the correlation between self-assessment (SA) and language performance. Sixty-seven studies with 97 independent samples involving more than 68,500 participants were included in our analysis. It was found that the overall correlation between SA and language performance was 0.466 (p < 0.01). Moderator analysis was…

Descriptors: Meta Analysis, Self Evaluation (Individuals), Likert Scales, Research Reports

The Use of Generalizability Theory in Investigating the Score Dependability of Classroom-Based L2 Reading Assessment

Peer reviewed

Liao, Ray J. T. – Language Testing, 2023

Among the variety of selected response formats used in L2 reading assessment, multiple-choice (MC) is the most commonly adopted, primarily due to its efficiency and objectiveness. Given the impact of assessment results on teaching and learning, it is necessary to investigate the degree to which the MC format reliably measures learners' L2 reading…

Descriptors: Reading Tests, Language Tests, Second Language Learning, Second Language Instruction

A Comprehensive Review of Rasch Measurement in Language Assessment: Recommendations and Guidelines for Research

Peer reviewed

Aryadoust, Vahid; Ng, Li Ying; Sayama, Hiroki – Language Testing, 2021

Over the past decades, the application of Rasch measurement in language assessment has gradually increased. In the present study, we coded 215 papers using Rasch measurement published in 21 applied linguistics journals for multiple features. We found that seven Rasch models and 23 software packages were adopted in these papers, with many-facet…

Descriptors: Language Tests, Testing, Test Items, Network Analysis
