Publication Date
In 2025 | 0 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 5 |
Since 2006 (last 20 years) | 9 |
Descriptor
Error of Measurement | 9 |
Test Items | 9 |
Item Response Theory | 5 |
Scores | 4 |
Accuracy | 2 |
Comparative Analysis | 2 |
Correlation | 2 |
Difficulty Level | 2 |
Item Analysis | 2 |
Sample Size | 2 |
Simulation | 2 |
More ▼ |
Source
International Journal of… | 9 |
Author
Aksu Dunya, Beyza | 1 |
Arce, Alvaro J. | 1 |
Backhoff, Eduardo | 1 |
Cole, Ki Lynn | 1 |
Contreras-Nino, Luis Angel | 1 |
DeMars, Christine E. | 1 |
Emons, Wilco H. M. | 1 |
James Soland | 1 |
Kim, Sohee | 1 |
Kruyen, Peter M. | 1 |
Lee, Yi-Hsuan | 1 |
More ▼ |
Publication Type
Journal Articles | 9 |
Reports - Research | 7 |
Reports - Descriptive | 1 |
Reports - Evaluative | 1 |
Education Level
Elementary Secondary Education | 1 |
Grade 3 | 1 |
Grade 5 | 1 |
Grade 7 | 1 |
Secondary Education | 1 |
Audience
Location
Laws, Policies, & Programs
Assessments and Surveys
Program for International… | 1 |
What Works Clearinghouse Rating
Xiaowen Liu – International Journal of Testing, 2024
Differential item functioning (DIF) often arises from multiple sources. Within the context of multidimensional item response theory, this study examined DIF items with varying secondary dimensions using the three DIF methods: SIBTEST, Mantel-Haenszel, and logistic regression. The effect of the number of secondary dimensions on DIF detection rates…
Descriptors: Item Analysis, Test Items, Item Response Theory, Correlation
Rujun Xu; James Soland – International Journal of Testing, 2024
International surveys are increasingly being used to understand nonacademic outcomes like math and science motivation, and to inform education policy changes within countries. Such instruments assume that the measure works consistently across countries, ethnicities, and languages--that is, they assume measurement invariance. While studies have…
Descriptors: Surveys, Statistical Bias, Achievement Tests, Foreign Countries
FIPC Linking across Multidimensional Test Forms: Effects of Confounding Difficulty within Dimensions
Kim, Sohee; Cole, Ki Lynn; Mwavita, Mwarumba – International Journal of Testing, 2018
This study investigated the effects of linking potentially multidimensional test forms using the fixed item parameter calibration. Forms had equal or unequal total test difficulty with and without confounding difficulty. The mean square errors and bias of estimated item and ability parameters were compared across the various confounding tests. The…
Descriptors: Test Items, Item Response Theory, Test Format, Difficulty Level
Aksu Dunya, Beyza – International Journal of Testing, 2018
This study was conducted to analyze potential item parameter drift (IPD) impact on person ability estimates and classification accuracy when drift affects an examinee subgroup. Using a series of simulations, three factors were manipulated: (a) percentage of IPD items in the CAT exam, (b) percentage of examinees affected by IPD, and (c) item pool…
Descriptors: Adaptive Testing, Classification, Accuracy, Computer Assisted Testing
Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
Descriptors: Test Bias, Test Reliability, Performance, Scores
Socha, Alan; DeMars, Christine E.; Zilberberg, Anna; Phan, Ha – International Journal of Testing, 2015
The Mantel-Haenszel (MH) procedure is commonly used to detect items that function differentially for groups of examinees from various demographic and linguistic backgrounds--for example, in international assessments. As in some other DIF methods, the total score is used to match examinees on ability. In thin matching, each of the total score…
Descriptors: Test Items, Educational Testing, Evaluation Methods, Ability Grouping
Kruyen, Peter M.; Emons, Wilco H. M.; Sijtsma, Klaas – International Journal of Testing, 2012
Personnel selection shows an enduring need for short stand-alone tests consisting of, say, 5 to 15 items. Despite their efficiency, short tests are more vulnerable to measurement error than longer test versions. Consequently, the question arises to what extent reducing test length deteriorates decision quality due to increased impact of…
Descriptors: Measurement, Personnel Selection, Decision Making, Error of Measurement
Arce, Alvaro J.; Wang, Ze – International Journal of Testing, 2012
The traditional approach to scale modified-Angoff cut scores transfers the raw cuts to an existing raw-to-scale score conversion table. Under the traditional approach, cut scores and conversion table raw scores are not only seen as interchangeable but also as originating from a common scaling process. In this article, we propose an alternative…
Descriptors: Generalizability Theory, Item Response Theory, Cutting Scores, Scaling
Solano-Flores, Guillermo; Backhoff, Eduardo; Contreras-Nino, Luis Angel – International Journal of Testing, 2009
In this article, we present a theory of test translation whose intent is to provide the conceptual foundation for effective, systematic work in the process of test translation and test translation review. According to the theory, translation error is multidimensional; it is not simply the consequence of defective translation but an inevitable fact…
Descriptors: Test Items, Investigations, Semantics, Translation