ERIC - Search Results

Showing all 8 results

IRT Linking Methods for the Bifactor Model with Mixed Format Tests

Peer reviewed

Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025

This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…

Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis

A Comparison of Yen's Q3 Coefficient and Rasch Testlet Modeling for Identifying Local Item Dependence: Evidence from Two Vocabulary Matching Tests

Peer reviewed

Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025

This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…

Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis

A Systematic Review of Differential Item Functioning in Second Language Assessment

Peer reviewed

Xueliang Chen; Vahid Aryadoust; Wenxin Zhang – Language Testing, 2025

The growing diversity among test takers in second or foreign language (L2) assessments makes the importance of fairness front and center. This systematic review aimed to examine how fairness in L2 assessments was evaluated through differential item functioning (DIF) analysis. A total of 83 articles from 27 journals were included in a systematic…

Descriptors: Second Language Learning, Language Tests, Test Items, Item Analysis

Detecting Compromised Items with Response Times Using a Bayesian Change-Point Approach

Peer reviewed

Yang Du; Susu Zhang – Journal of Educational and Behavioral Statistics, 2025

Item compromise has long posed challenges in educational measurement, jeopardizing both test validity and test security of continuous tests. Detecting compromised items is therefore crucial to address this concern. The present literature on compromised item detection reveals two notable gaps: First, the majority of existing methods are based upon…

Descriptors: Item Response Theory, Item Analysis, Bayesian Statistics, Educational Assessment

IRT Characteristic Curve Linking Methods Weighted by Information for Mixed-Format Tests

Peer reviewed

Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024

To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…

Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement

Exploring Speededness in Pre-Reform GCSEs (2009 to 2016)

Emma Walland – Research Matters, 2024

GCSE examinations (taken by students aged 16 years in England) are not intended to be speeded (i.e. to be partly a test of how quickly students can answer questions). However, there has been little research exploring this. The aim of this research was to explore the speededness of past GCSE written examinations, using only the data from scored…

Descriptors: Educational Change, Test Items, Item Analysis, Scoring

Evaluating the Effectiveness of a Computerized Achievement Test Using Learn Smart for Psychometric Assessment under Item Response Theory

Peer reviewed

Mimi Ismail; Ahmed Al-Badri; Said Al-Senaidi – Journal of Education and e-Learning Research, 2025

This study aimed to reveal the differences in individuals' abilities, their standard errors, and the psychometric properties of the test according to the two methods of applying the test (electronic and paper). The descriptive approach was used to achieve the study's objectives. The study sample consisted of 74 male and female students at the…

Descriptors: Achievement Tests, Computer Assisted Testing, Psychometrics, Item Response Theory

An Exploratory Criterion Validation of Three Meaning-Recall Vocabulary Test Item Formats

Peer reviewed

Tim Stoeckel; Tomoko Ishii – Vocabulary Learning and Instruction, 2024

In an upcoming coverage-comprehension study, we plan to assess learners' meaning-recall knowledge of words as they occur in the study's reading passage. As several meaning-recall test formats exist, the purpose of this small-scale study (N = 10) was to determine which of three formats was most similar to a criterion interview regarding mean score…

Descriptors: Vocabulary Development, Language Tests, Second Language Learning, Classification