ERIC - Search Results

Publication Date

In 2025	14
Since 2024	44

Descriptor

Language Tests	44
Test Items	44
Foreign Countries	32
Second Language Learning	30
English (Second Language)	29
Second Language Instruction	19
Language Proficiency	17
Item Analysis	13
Test Construction	12
Test Validity	12
College Students	11
Comparative Analysis	11
Difficulty Level	11
Undergraduate Students	10
Item Response Theory	9
Computer Assisted Testing	8
Scores	8
Vocabulary Development	8
Reading Comprehension	7
Test Reliability	7
Correlation	6
Language Processing	6
Reading Tests	6
Accuracy	5
Psychometrics	5

Publication Type

Journal Articles	42
Reports - Research	40
Tests/Questionnaires	5
Information Analyses	3
Dissertations/Theses -…	1
Reports - Evaluative	1

Education Level

Higher Education	26
Postsecondary Education	26
Elementary Education	5
Secondary Education	4
Early Childhood Education	2
Grade 1	2
Grade 2	2
Grade 3	2
High Schools	2
Primary Education	2
Adult Education	1
Grade 8	1
Junior High Schools	1
Middle Schools	1

Location

Iran	6
China	4
Japan	3
United Kingdom	3
Iran (Tehran)	2
Spain	2
Thailand	2
Vietnam	2
Europe	1
Japan (Tokyo)	1
New Zealand	1
Norway	1
Saudi Arabia	1
South Africa	1
South Korea	1
Sweden	1
Taiwan	1
Thailand (Bangkok)	1
Turkey	1
United Kingdom (England)	1
Uzbekistan	1

Assessments and Surveys

International English…	5
Pearson Test of English…	1
Test of English for…	1

Showing 1 to 15 of 44 results

Analysis of Mixed-Format Assessments Using Measurement Models and Topic Modeling

Peer reviewed

Jiawei Xiong; George Engelhard; Allan S. Cohen – Measurement: Interdisciplinary Research and Perspectives, 2025

It is common to find mixed-format data results from the use of both multiple-choice (MC) and constructed-response (CR) questions on assessments. Dealing with these mixed response types involves understanding what the assessment is measuring, and the use of suitable measurement models to estimate latent abilities. Past research in educational…

Descriptors: Responses, Test Items, Test Format, Grade 8

Developing an MLA-Test for Young Learners -- Insights from Measurement Theory and Language Testing

Peer reviewed

Kaja Haugen; Cecilie Hamnes Carlsen; Christine Möller-Omrani – Language Awareness, 2025

This article presents the process of constructing and validating a test of metalinguistic awareness (MLA) for young school children (age 8-10). The test was developed between 2021 and 2023 as part of the MetaLearn research project, financed by The Research Council of Norway. The research team defines MLA as using metalinguistic knowledge at a…

Descriptors: Language Tests, Test Construction, Elementary School Students, Metalinguistics

Evaluating Methodological Enhancements to the Yes/No Angoff Standard-Setting Method in Language Proficiency Assessment

Peer reviewed

Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024

This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…

Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods

A Comparison of Yen's Q3 Coefficient and Rasch Testlet Modeling for Identifying Local Item Dependence: Evidence from Two Vocabulary Matching Tests

Peer reviewed

Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025

This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…

Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis

Comparative Evaluation of C-Test Reliability Using Classical and Modern Psychometric Methods

Peer reviewed
PDF on ERIC

Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025

This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…

Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests

A Systematic Review of Differential Item Functioning in Second Language Assessment

Peer reviewed

Xueliang Chen; Vahid Aryadoust; Wenxin Zhang – Language Testing, 2025

The growing diversity among test takers in second or foreign language (L2) assessments makes the importance of fairness front and center. This systematic review aimed to examine how fairness in L2 assessments was evaluated through differential item functioning (DIF) analysis. A total of 83 articles from 27 journals were included in a systematic…

Descriptors: Second Language Learning, Language Tests, Test Items, Item Analysis

A Three-Step DIF Analysis of a Reading Comprehension Test across Regional Dialects to Improve Test Score Validity

Peer reviewed

Paula Elosua – Language Assessment Quarterly, 2024

In sociolinguistic contexts where standardized languages coexist with regional dialects, the study of differential item functioning is a valuable tool for examining certain linguistic uses or varieties as threats to score validity. From an ecological perspective, this paper describes three stages in the study of differential item functioning…

Descriptors: Reading Tests, Reading Comprehension, Scores, Test Validity

AI-Powered Automated Item Generation for Language Testing

Peer reviewed

Dongkwang Shin; Jang Ho Lee – ELT Journal, 2024

Although automated item generation has gained a considerable amount of attention in a variety of fields, it is still a relatively new technology in ELT contexts. Therefore, the present article aims to provide an accessible introduction to this powerful resource for language teachers based on a review of the available research. Particularly, it…

Descriptors: Language Tests, Artificial Intelligence, Test Items, Automation

Argument-Based Validation of Chulalongkorn University Language Institute (CULI) Test: A Rasch-Based Evidence Investigation

Peer reviewed

Apichat Khamboonruang – Language Testing in Asia, 2025

The Chulalongkorn University Language Institute (CULI) test was developed as a local standardised test of English for professional and international communication. To ensure that the CULI test fulfils its intended purposes, this study employed Kane's argument-based validation and Rasch measurement approaches to construct the validity argument for the…

Descriptors: Universities, Second Language Learning, Second Language Instruction, Language Tests

Development of a Sign Repetition Task for Novice L2 Signers

Peer reviewed

Ingela Holmström; Krister Schönström; Magnus Ryttervik – Language Assessment Quarterly, 2024

There is a lack of tests available for assessing sign language proficiency among L2 learners. We have therefore developed a sign repetition test, SignRepL2, with a specific focus on the phonological features of signs. This paper describes the two phases of developing this test. In the first phase, content was developed in the form of 50 items with…

Descriptors: Sign Language, Novices, Task Analysis, Second Language Learning

Examining the Effect of Item Difficulty and Rater Leniency on Iranian Test Takers' Performance on WDCT and DSAT: A Comparative Study

Peer reviewed
PDF on ERIC

Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025

The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…

Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction

Modeling Local Item Dependence in Cloze Tests with the Rasch Model: Applying a New Strategy

Peer reviewed
PDF on ERIC

Barno S. Abdullaeva; Diyorjon Abdullaev; Nurislom I. Khursanov; Khurshida B. Kadirova; Laylo Djuraeva – International Journal of Language Testing, 2024

Cloze tests are commonly used in language testing as a quick measure of overall language ability or reading comprehension. A problem for the analysis of cloze tests with item response theory models is that cloze test items are locally dependent. This leads to the violation of the conditional or local independence assumption of IRT models. In this…

Descriptors: Cloze Procedure, Language Tests, Test Items, Correlation

Developing Internet-Based "Tests of Aptitude for Language Learning (TALL)": An Open Research Endeavour

Peer reviewed

Junlan Pan; Emma Marsden – Language Testing, 2024

"Tests of Aptitude for Language Learning" (TALL) is an openly accessible internet-based battery to measure the multifaceted construct of foreign language aptitude, using language domain-specific instruments and L1-sensitive instructions and stimuli. This brief report introduces the components of this theory-informed battery and…

Descriptors: Language Tests, Aptitude Tests, Second Language Learning, Test Construction

A Rasch-Based Validation of the University of Tehran English Proficiency Test (UTEPT)

Peer reviewed

Shadi Noroozi; Hossein Karami – Language Testing in Asia, 2024

Recently, psychometricians and researchers have voiced their concern over the exploration of language test items in light of Messick's validation framework. Validity has been central to test development and use; however, it has not received due attention in language tests having grave consequences for test takers. The present study sought to…

Descriptors: Foreign Countries, Doctoral Students, Graduate Students, Language Proficiency

A New Scoring Method for Item Response Theory Analysis of C-Tests

Peer reviewed

Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025

This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…

Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

Source

International Journal of…	7
Language Testing	5
Vocabulary Learning and…	4
Language Assessment Quarterly	3
Applied Measurement in…	2
Language Testing in Asia	2
rEFLections	2
Annenberg Institute for…	1
Computer Assisted Language…	1
ELT Journal	1
Education and Information…	1
English Teaching	1
International Journal of…	1
International Journal of…	1
Journal of Second Language…	1
Language Awareness	1
Language Education &…	1
Language Teaching Research	1
Measurement:…	1
Online Submission	1
ProQuest LLC	1
SAGE Open	1
Second Language Research	1
South African Journal of…	1
TESL-EJ	1

Author

Tim Stoeckel	3
Emma Marsden	2
Hung Tan Ha	2
James S. Kim	2
Joshua B. Gilbert	2
Luke W. Miratrix	2
Tomoko Ishii	2
Afsar Rouhi	1
Ali Zahabi	1
Allan S. Cohen	1
Amber Dudley	1
Anastasia Pattemore	1
Anne Dahl	1
Apichat Khamboonruang	1
Ayaka Sugawara	1
Ayako Aizawa	1
Barno S. Abdullaeva	1
Bayram Çetin	1
Budi Waluyo	1
Carmen Muñoz	1
Cecilie Hamnes Carlsen	1
Christine Möller-Omrani	1
Chunmei Huang	1
Daniela Avello	1
Dave Kush	1