ERIC - Search Results

Showing all 9 results

Detecting Differential Item Functioning with Multiple Causes: A Comparison of Three Methods

Peer reviewed

Xiaowen Liu – International Journal of Testing, 2024

Differential item functioning (DIF) often arises from multiple sources. Within the context of multidimensional item response theory, this study examined DIF items with varying secondary dimensions using three DIF methods: SIBTEST, Mantel-Haenszel, and logistic regression. The effect of the number of secondary dimensions on DIF detection rates…

Descriptors: Item Analysis, Test Items, Item Response Theory, Correlation

Beyond Group Comparisons: Accounting for Intersectional Sources of Bias in International Survey Measures

Peer reviewed

Rujun Xu; James Soland – International Journal of Testing, 2024

International surveys are increasingly being used to understand nonacademic outcomes like math and science motivation, and to inform education policy changes within countries. Such instruments assume that the measure works consistently across countries, ethnicities, and languages--that is, they assume measurement invariance. While studies have…

Descriptors: Surveys, Statistical Bias, Achievement Tests, Foreign Countries

FIPC Linking across Multidimensional Test Forms: Effects of Confounding Difficulty within Dimensions

Peer reviewed

Kim, Sohee; Cole, Ki Lynn; Mwavita, Mwarumba – International Journal of Testing, 2018

This study investigated the effects of linking potentially multidimensional test forms using the fixed item parameter calibration. Forms had equal or unequal total test difficulty with and without confounding difficulty. The mean square errors and bias of estimated item and ability parameters were compared across the various confounding tests. The…

Descriptors: Test Items, Item Response Theory, Test Format, Difficulty Level

Item Parameter Drift in Computer Adaptive Testing Due to Lack of Content Knowledge

Peer reviewed

Aksu Dunya, Beyza – International Journal of Testing, 2018

This study was conducted to analyze potential item parameter drift (IPD) impact on person ability estimates and classification accuracy when drift affects an examinee subgroup. Using a series of simulations, three factors were manipulated: (a) percentage of IPD items in the CAT exam, (b) percentage of examinees affected by IPD, and (c) item pool…

Descriptors: Adaptive Testing, Classification, Accuracy, Computer Assisted Testing

Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test

Peer reviewed

Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017

Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…

Descriptors: Test Bias, Test Reliability, Performance, Scores

Differential Item Functioning Detection with the Mantel-Haenszel Procedure: The Effects of Matching Types and Other Factors

Peer reviewed

Socha, Alan; DeMars, Christine E.; Zilberberg, Anna; Phan, Ha – International Journal of Testing, 2015

The Mantel-Haenszel (MH) procedure is commonly used to detect items that function differentially for groups of examinees from various demographic and linguistic backgrounds--for example, in international assessments. As in some other DIF methods, the total score is used to match examinees on ability. In thin matching, each of the total score…

Descriptors: Test Items, Educational Testing, Evaluation Methods, Ability Grouping

Test Length and Decision Quality in Personnel Selection: When Is Short Too Short?

Peer reviewed

Kruyen, Peter M.; Emons, Wilco H. M.; Sijtsma, Klaas – International Journal of Testing, 2012

Personnel selection shows an enduring need for short stand-alone tests consisting of, say, 5 to 15 items. Despite their efficiency, short tests are more vulnerable to measurement error than longer test versions. Consequently, the question arises to what extent reducing test length deteriorates decision quality due to increased impact of…

Descriptors: Measurement, Personnel Selection, Decision Making, Error of Measurement

Applying Rasch Model and Generalizability Theory to Study Modified-Angoff Cut Scores

Peer reviewed

Arce, Alvaro J.; Wang, Ze – International Journal of Testing, 2012

The traditional approach to scale modified-Angoff cut scores transfers the raw cuts to an existing raw-to-scale score conversion table. Under the traditional approach, cut scores and conversion table raw scores are not only seen as interchangeable but also as originating from a common scaling process. In this article, we propose an alternative…

Descriptors: Generalizability Theory, Item Response Theory, Cutting Scores, Scaling

Theory of Test Translation Error

Peer reviewed

Solano-Flores, Guillermo; Backhoff, Eduardo; Contreras-Nino, Luis Angel – International Journal of Testing, 2009

In this article, we present a theory of test translation whose intent is to provide the conceptual foundation for effective, systematic work in the process of test translation and test translation review. According to the theory, translation error is multidimensional; it is not simply the consequence of defective translation but an inevitable fact…

Descriptors: Test Items, Investigations, Semantics, Translation
