Publication Date: In 2025 (14)
Publication Type: Journal Articles (14); Reports - Research (13); Information Analyses (1); Tests/Questionnaires (1)
Education Level: Higher Education (11); Postsecondary Education (11); Elementary Education (2); Adult Education (1); Grade 8 (1); Junior High Schools (1); Middle Schools (1); Secondary Education (1)
Assessments and Surveys: International English… (3); Pearson Test of English… (1)
Jiawei Xiong; George Engelhard; Allan S. Cohen – Measurement: Interdisciplinary Research and Perspectives, 2025
Mixed-format data commonly result from the use of both multiple-choice (MC) and constructed-response (CR) questions on assessments. Dealing with these mixed response types involves understanding what the assessment is measuring and using suitable measurement models to estimate latent abilities. Past research in educational…
Descriptors: Responses, Test Items, Test Format, Grade 8
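As background for the kind of measurement model this abstract alludes to (not the authors' actual model), here is a minimal Python sketch of how mixed MC and CR responses can enter one likelihood: dichotomous MC items under a Rasch model, a polytomous CR item under a partial credit model, and ability estimated by a simple grid search. All item parameters and responses below are invented for illustration.

    import numpy as np

    def rasch_p(theta, b):
        # Probability of a correct response to a dichotomous MC item (Rasch model)
        return 1.0 / (1.0 + np.exp(-(theta - b)))

    def pcm_probs(theta, deltas):
        # Category probabilities for a polytomous CR item (partial credit model);
        # deltas are step difficulties for categories 1..m
        steps = np.concatenate(([0.0], np.cumsum(theta - np.asarray(deltas))))
        expd = np.exp(steps - steps.max())
        return expd / expd.sum()

    def log_likelihood(theta, mc_resp, mc_b, cr_resp, cr_deltas):
        # Joint log-likelihood of one examinee's mixed-format response pattern
        ll = 0.0
        for x, b in zip(mc_resp, mc_b):
            p = rasch_p(theta, b)
            ll += np.log(p if x == 1 else 1.0 - p)
        for x, deltas in zip(cr_resp, cr_deltas):
            ll += np.log(pcm_probs(theta, deltas)[x])
        return ll

    # Invented item parameters and one response pattern
    mc_b = [-1.0, 0.0, 1.2]                # three MC item difficulties
    cr_deltas = [[-0.5, 0.4, 1.1]]         # one CR item scored 0-3
    mc_resp, cr_resp = [1, 1, 0], [2]

    grid = np.linspace(-4, 4, 801)
    theta_hat = grid[np.argmax([log_likelihood(t, mc_resp, mc_b, cr_resp, cr_deltas)
                                for t in grid])]
    print(round(theta_hat, 2))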
Kaja Haugen; Cecilie Hamnes Carlsen; Christine Möller-Omrani – Language Awareness, 2025
This article presents the process of constructing and validating a test of metalinguistic awareness (MLA) for young school children (age 8-10). The test was developed between 2021 and 2023 as part of the MetaLearn research project, financed by The Research Council of Norway. The research team defines MLA as using metalinguistic knowledge at a…
Descriptors: Language Tests, Test Construction, Elementary School Students, Metalinguistics
Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025
This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…
Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis
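For orientation only (this is not the article's analysis), a short numpy sketch of the residual-correlation approach to local item dependence in the spirit of Yen's Q3: residuals from a Rasch model are correlated across item pairs, and an elevated pairwise correlation flags possible dependence. The simulated persons, item difficulties, and induced dependence are all hypothetical.

    import numpy as np

    def rasch_expected(theta, b):
        # Persons-by-items matrix of Rasch expected scores
        return 1.0 / (1.0 + np.exp(-np.subtract.outer(theta, b)))

    def q3_matrix(responses, theta, b):
        # Correlate Rasch residuals across item pairs (Yen's Q3-style check);
        # elevated pairwise values suggest local item dependence
        resid = responses - rasch_expected(theta, b)
        return np.corrcoef(resid, rowvar=False)

    # Simulated data: 200 persons, 6 items; dependence induced between items 5 and 6
    rng = np.random.default_rng(0)
    theta = rng.normal(size=200)
    b = np.array([-1.5, -0.5, 0.0, 0.5, 1.0, 1.0])
    responses = (rng.random((200, 6)) < rasch_expected(theta, b)).astype(float)
    copy_mask = rng.random(200) < 0.8
    responses[copy_mask, 5] = responses[copy_mask, 4]   # items 5 and 6 now share variance

    q3 = q3_matrix(responses, theta, b)
    print(round(q3[4, 5], 2), round(q3[0, 1], 2))       # dependent pair vs. an ordinary pair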
Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
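One of the classical indices the study compares, Cronbach's alpha, can be computed directly from a persons-by-items score matrix. The sketch below is a generic illustration with invented scores, not the study's data.

    import numpy as np

    def cronbach_alpha(items):
        # Cronbach's alpha for a persons-by-items score matrix
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1)
        total_var = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    # Invented scores for five examinees on four items
    scores = [[1, 1, 0, 1],
              [1, 0, 0, 0],
              [1, 1, 1, 1],
              [0, 0, 0, 1],
              [1, 1, 1, 0]]
    print(round(cronbach_alpha(scores), 2))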
Xueliang Chen; Vahid Aryadoust; Wenxin Zhang – Language Testing, 2025
The growing diversity among test takers in second or foreign language (L2) assessments places fairness front and center. This systematic review aimed to examine how fairness in L2 assessments was evaluated through differential item functioning (DIF) analysis. A total of 83 articles from 27 journals were included in a systematic…
Descriptors: Second Language Learning, Language Tests, Test Items, Item Analysis
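DIF analysis comes in many variants; as a generic illustration (not a method attributed to any of the reviewed articles), the sketch below computes the Mantel-Haenszel common odds ratio for one item, stratifying examinees by total score and reporting the result on the ETS delta scale. The group labels, scores, and responses are simulated.

    import numpy as np

    def mantel_haenszel_dif(correct, group, total_score):
        # Common odds ratio for one item across total-score strata;
        # group: 0 = reference, 1 = focal; values far from 1 suggest DIF
        num = den = 0.0
        for s in np.unique(total_score):
            m = total_score == s
            a = np.sum((group[m] == 0) & (correct[m] == 1))   # reference, correct
            b = np.sum((group[m] == 0) & (correct[m] == 0))   # reference, incorrect
            c = np.sum((group[m] == 1) & (correct[m] == 1))   # focal, correct
            d = np.sum((group[m] == 1) & (correct[m] == 0))   # focal, incorrect
            n = a + b + c + d
            if n > 0:
                num += a * d / n
                den += b * c / n
        odds_ratio = num / den
        return odds_ratio, -2.35 * np.log(odds_ratio)          # ETS delta (MH D-DIF)

    # Simulated data: 400 test takers, one studied item, a matching total score
    rng = np.random.default_rng(1)
    group = rng.integers(0, 2, 400)
    total_score = rng.integers(10, 31, 400)
    p_correct = 0.4 + 0.01 * (total_score - 10) - 0.15 * group   # focal group disadvantaged
    correct = (rng.random(400) < p_correct).astype(int)
    print(mantel_haenszel_dif(correct, group, total_score))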
Apichat Khamboonruang – Language Testing in Asia, 2025
The Chulalongkorn University Language Institute (CULI) test was developed as a local standardised test of English for professional and international communication. To ensure that the CULI test fulfils its intended purposes, this study employed Kane's argument-based validation and Rasch measurement approaches to construct the validity argument for the…
Descriptors: Universities, Second Language Learning, Second Language Instruction, Language Tests
Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025
The current paper employs the Many-Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…
Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction
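For readers unfamiliar with the model, a common formulation of the many-facet Rasch model (following Linacre) expresses the log-odds of adjacent rating categories as an additive function of person ability, situation (item) difficulty, rater severity, and a category threshold; the notation below is generic rather than taken from the paper.

    \[
    \ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = \theta_n - \delta_i - \alpha_j - \tau_k
    \]

Here \(\theta_n\) is the ability of person \(n\), \(\delta_i\) the difficulty of situation (item) \(i\), \(\alpha_j\) the severity of rater \(j\), and \(\tau_k\) the threshold of category \(k\) relative to \(k-1\).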
Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025
This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated into a sentence score, and then each sentence was entered into the analysis as a polytomous…
Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction
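The scoring idea described in the abstract, aggregating correctly reformulated gaps within a sentence into a sentence-level polytomous score, can be illustrated with a short sketch; the gap-to-sentence mapping and responses below are hypothetical, not the study's materials.

    from collections import defaultdict

    def sentence_scores(gap_results, gap_to_sentence):
        # Sum dichotomous gap scores within each sentence to obtain a
        # polytomous sentence-level score
        totals = defaultdict(int)
        for gap_id, score in gap_results.items():
            totals[gap_to_sentence[gap_id]] += score
        return dict(totals)

    # One hypothetical examinee: six gaps spread over three sentences
    gap_results = {"g1": 1, "g2": 0, "g3": 1, "g4": 1, "g5": 0, "g6": 1}
    gap_to_sentence = {"g1": "s1", "g2": "s1", "g3": "s2",
                       "g4": "s2", "g5": "s3", "g6": "s3"}
    print(sentence_scores(gap_results, gap_to_sentence))   # {'s1': 1, 's2': 2, 's3': 1}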
Seyedeh Azadeh Ghiasian; Fatemeh Hemmati; Seyyed Mohammad Alavi; Afsar Rouhi – International Journal of Language Testing, 2025
A critical component of cognitive diagnostic models (CDMs) is a Q-matrix that stipulates associations between items of a test and their required attributes. The present study aims to develop and empirically validate a Q-matrix for the listening comprehension section of the International English Language Testing System (IELTS). To this end, a…
Descriptors: Test Items, Listening Comprehension Tests, English (Second Language), Language Tests
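To show what a Q-matrix is (this is not the Q-matrix developed in the study), the sketch below sets up a small binary items-by-attributes matrix for listening items and runs two simple sanity checks; the attribute labels are invented placeholders.

    import numpy as np

    # Invented Q-matrix: rows are listening items, columns are attributes
    # (placeholder labels, e.g. A1 = locating specific information,
    #  A2 = paraphrase recognition, A3 = inferencing); 1 = item requires the attribute
    attributes = ["A1", "A2", "A3"]
    q_matrix = np.array([
        [1, 0, 0],   # item 1 requires only A1
        [1, 1, 0],   # item 2 requires A1 and A2
        [0, 1, 1],   # item 3 requires A2 and A3
        [0, 0, 1],   # item 4 requires only A3
    ])

    # Simple sanity checks: every attribute is measured by at least one item,
    # and no item requires every attribute at once
    print(q_matrix.sum(axis=0))                              # items per attribute
    print(bool((q_matrix.sum(axis=1) == len(attributes)).any()))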
Stefan O'Grady – International Journal of Listening, 2025
Language assessment is increasingly computer-mediated. This development presents opportunities for new task formats and, equally, a need for renewed scrutiny of established conventions. Recent recommendations to increase integrated skills assessment in lecture comprehension tests are premised on empirical research that demonstrates enhanced construct…
Descriptors: Language Tests, Lecture Method, Listening Comprehension Tests, Multiple Choice Tests
Rümeysa Kaya; Bayram Çetin – International Journal of Assessment Tools in Education, 2025
In this study, the cut-off scores obtained from the Angoff, Angoff Yes/No, Nedelsky, and Ebel standard-setting methods were compared with a T score of 50 and the current cut-off score in various respects. Data were collected from 448 students who took the Module B1+ English Exit Exam IV and from 14 experts. It was seen that while the Nedelsky method gave the lowest…
Descriptors: Standard Setting, Cutting Scores, Exit Examinations, Academic Achievement
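As a reference point for one of the methods compared, the classic Angoff procedure derives a cut score from expert judgments; the sketch below uses invented ratings from three hypothetical experts and does not reproduce the study's data or exact procedure.

    import numpy as np

    def angoff_cut_score(ratings):
        # Each expert estimates, for every item, the probability that a minimally
        # competent examinee answers correctly; item estimates are summed per expert
        # and the cut score is the mean of those sums
        ratings = np.asarray(ratings, dtype=float)   # experts x items
        return ratings.sum(axis=1).mean()

    # Invented ratings from three experts on a five-item test
    ratings = [[0.6, 0.7, 0.5, 0.8, 0.6],
               [0.5, 0.6, 0.4, 0.7, 0.5],
               [0.7, 0.8, 0.6, 0.9, 0.7]]
    print(round(angoff_cut_score(ratings), 1))       # cut score on the raw-score scale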
Yi Zou; Ying Zheng; Jingwen Wang – International Journal of Language Testing, 2025
The Pearson Test of English Academic (PTE-A), a widely used high-stakes language proficiency test for university admissions and migration purposes, underwent a notable change from a three-hour to a two-hour version in November 2021. The implementation of the new version has prompted inquiries into the washback effects on various stakeholders.…
Descriptors: Testing Problems, Test Preparation, High Stakes Tests, English (Second Language)
Yu Hui; Thora Tenbrink – SAGE Open, 2025
This study addresses how Chinese learners of English as a second language (L2) perceive conversations in English materials as compared to speakers of English as a first language (L1). Data were collected through questionnaires completed by 48 participants (28 L2 English learners in China and 20 L1 English speakers in the UK), eliciting evaluations…
Descriptors: Foreign Countries, Cultural Context, English (Second Language), Second Language Learning
Qiao Wang; Ralph L. Rose; Ayaka Sugawara; Naho Orita – Vocabulary Learning and Instruction, 2025
VocQGen is an automated tool designed to generate multiple-choice cloze (MCC) questions for vocabulary assessment in second language learning contexts. It leverages several natural language processing (NLP) tools and OpenAI's GPT-4 model to produce MCC items quickly from user-specified word lists. To evaluate its effectiveness, we used the first…
Descriptors: Vocabulary Skills, Artificial Intelligence, Computer Software, Multiple Choice Tests
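VocQGen's internals are not described in this snippet, so the sketch below only illustrates the general shape of automated multiple-choice cloze generation: blank a target word in a carrier sentence and mix the key with distractors drawn from a word list. The function name, sentence, and word list are invented for illustration and do not reflect VocQGen's actual pipeline or its GPT-4 step.

    import random

    def make_mcc_item(sentence, target, distractor_pool, n_distractors=3, seed=0):
        # Blank the target word in the carrier sentence and mix the key with
        # distractors sampled from the word list
        rng = random.Random(seed)
        stem = sentence.replace(target, "_____", 1)
        options = rng.sample([w for w in distractor_pool if w != target], n_distractors)
        options.append(target)
        rng.shuffle(options)
        return {"stem": stem, "options": options, "answer": target}

    # Invented carrier sentence and word list
    item = make_mcc_item(
        sentence="The committee will evaluate the proposal next week.",
        target="evaluate",
        distractor_pool=["assemble", "negotiate", "distribute", "anticipate"],
    )
    print(item["stem"])
    print(item["options"], "->", item["answer"])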