Publication Date
  In 2025: 1
  Since 2024: 1
  Since 2021 (last 5 years): 7
  Since 2016 (last 10 years): 52
  Since 2006 (last 20 years): 138
Descriptor
  Comparative Analysis: 229
  Test Bias: 229
  Test Items: 92
  Statistical Analysis: 56
  Foreign Countries: 53
  Item Response Theory: 50
  Scores: 45
  Test Validity: 34
  Item Analysis: 29
  Mathematics Tests: 26
  Achievement Tests: 25
Author
  Ercikan, Kadriye: 8
  Oliveri, Maria Elena: 5
  Zumbo, Bruno D.: 5
  Finch, W. Holmes: 4
  Jin, Ying: 4
  Kim, Sooyeon: 3
  Laitusis, Cara Cahalan: 3
  Lyons-Thomas, Juliette: 3
  Robin, Frederic: 3
  Sireci, Stephen G.: 3
  Strobl, Carolin: 3
Audience
  Researchers: 3
  Policymakers: 1
Location
  Canada: 9
  Turkey: 8
  United States: 6
  Australia: 5
  Germany: 4
  Norway: 3
  Singapore: 3
  Taiwan: 3
  United Kingdom: 3
  United Kingdom (England): 3
  Austria: 2
Laws, Policies, & Programs
  No Child Left Behind Act 2001: 2
  Rehabilitation Act 1973…: 1
  Social Security: 1
Leala Holcomb; Wyatte C. Hall; Stephanie J. Gardiner-Walsh; Jessica Scott – Journal of Deaf Studies and Deaf Education, 2025
This study critically examines the biases and methodological shortcomings in studies comparing deaf and hearing populations, demonstrating their implications for both the reliability and ethics of research in deaf education. Upon reviewing the 20 most-cited deaf-hearing comparison studies, we identified recurring fallacies such as the presumption…
Descriptors: Literature Reviews, Deafness, Social Bias, Test Bias
Robitzsch, Alexander; Lüdtke, Oliver – Journal of Educational and Behavioral Statistics, 2022
One of the primary goals of international large-scale assessments in education is the comparison of country means in student achievement. This article introduces a framework for discussing differential item functioning (DIF) for such mean comparisons. We compare three different linking methods: concurrent scaling based on full invariance,…
Descriptors: Test Bias, International Assessment, Scaling, Comparative Analysis
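For intuition about what linking contributes to such mean comparisons: under a Rasch model, placing two separately calibrated countries on a common scale reduces to estimating a shift constant from common items, and the country-mean comparison inherits any error in that constant. A toy illustration of mean-mean shift linking follows (a deliberately simplified stand-in with made-up numbers, not one of the three methods the article compares):

```python
import numpy as np

# Toy difficulty estimates for the same anchor items from two separate
# country calibrations; under a Rasch model the two scales differ only
# by a constant shift.
b_country_a = np.array([-1.2, -0.3, 0.4, 1.1])
b_country_b = np.array([-0.9, 0.1, 0.7, 1.5])  # same items, country B's scale

# Mean-mean linking constant: moves country B's scale onto country A's.
shift = b_country_a.mean() - b_country_b.mean()

theta_b_mean = 0.35                    # country B mean on its own scale
theta_b_linked = theta_b_mean + shift  # country B mean on country A's scale
print(f"shift = {shift:.2f}, linked country B mean = {theta_b_linked:.2f}")
# shift = -0.35, linked country B mean = 0.00: the apparent advantage
# for country B was an artifact of the scale origin, not of achievement.
```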
Diaz, Emily; Brooks, Gordon; Johanson, George – International Journal of Assessment Tools in Education, 2021
This Monte Carlo study assessed Type I error in differential item functioning analyses using Lord's chi-square (LC), the likelihood ratio test (LRT), and the Mantel-Haenszel (MH) procedure. Two research interests were investigated: item response theory (IRT) model specification in LC and the LRT, and the continuity correction in the MH procedure. This study…
Descriptors: Test Bias, Item Response Theory, Statistical Analysis, Comparative Analysis
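The continuity correction the abstract mentions is a concrete, checkable detail: it shrinks the numerator of the MH chi-square by 0.5 before squaring. A minimal sketch of the statistic for one item, with the correction as a toggle (the data layout and function name are illustrative, not from the study):

```python
import numpy as np
from scipy.stats import chi2

def mh_chi_square(tables, continuity_correction=True):
    """Mantel-Haenszel chi-square for one item across ability strata.

    tables: list of 2x2 arrays, one per stratum, laid out as
            [[ref_correct, ref_incorrect],
             [foc_correct, foc_incorrect]].
    """
    A = np.array([t[0, 0] for t in tables], dtype=float)  # reference correct
    n_ref = np.array([t[0].sum() for t in tables], dtype=float)
    n_foc = np.array([t[1].sum() for t in tables], dtype=float)
    m_right = np.array([t[:, 0].sum() for t in tables], dtype=float)
    m_wrong = np.array([t[:, 1].sum() for t in tables], dtype=float)
    n = n_ref + n_foc

    expected = n_ref * m_right / n
    variance = n_ref * n_foc * m_right * m_wrong / (n**2 * (n - 1))

    diff = abs(A.sum() - expected.sum())
    if continuity_correction:
        diff = max(diff - 0.5, 0.0)  # the correction whose effect is studied
    stat = diff**2 / variance.sum()
    return stat, chi2.sf(stat, df=1)

strata = [np.array([[40, 10], [30, 20]]), np.array([[25, 5], [20, 10]])]
print(mh_chi_square(strata, continuity_correction=True))
print(mh_chi_square(strata, continuity_correction=False))
```

Because the correction always moves the statistic toward zero, it trades Type I error for power, which is exactly why a simulation comparing corrected and uncorrected variants is informative.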
Soysal, Sumeyra; Yilmaz Kogar, Esin – International Journal of Assessment Tools in Education, 2021
This study investigated whether item position effects lead to DIF when different test booklets are used. To do this, the Lord's chi-square and Raju's unsigned area methods were applied with the 3PL model, both with and without item purification. When the performance of the methods was compared, it was revealed that…
Descriptors: Item Response Theory, Test Bias, Test Items, Comparative Analysis
Zwick, Rebecca; Ye, Lei; Isham, Steven – Journal of Educational Measurement, 2018
In typical differential item functioning (DIF) assessments, an item's DIF status is not influenced by its status in previous test administrations. An item that has shown DIF at multiple administrations may be treated the same way as an item that has shown DIF in only the most recent administration. Therefore, much useful information about the…
Descriptors: Test Bias, Testing, Test Items, Bayesian Statistics
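The premise above is that an item's DIF history carries information that one-shot screening discards. One generic way to make that concrete (a sketch of the idea, not necessarily the model the authors develop) is to treat each administration's DIF estimate as a noisy observation of a stable item-level parameter and update a normal prior after every administration:

```python
import numpy as np

def update_dif_belief(prior_mean, prior_var, estimate, se):
    """Normal-normal conjugate update of an item's DIF parameter.

    prior_mean, prior_var: current belief about the item's true DIF
                           (e.g., on the MH log-odds-ratio scale).
    estimate, se: this administration's DIF estimate and standard error.
    """
    obs_var = se**2
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + estimate / obs_var)
    return post_mean, post_var

# Three administrations of the same item (toy numbers): the belief
# tightens each time, so an item flagged repeatedly ends up treated
# differently from an item flagged only once.
mean, var = 0.0, 1.0  # diffuse prior: no known DIF
for est, se in [(0.6, 0.3), (0.5, 0.25), (0.55, 0.3)]:
    mean, var = update_dif_belief(mean, var, est, se)
print(f"posterior DIF estimate: {mean:.2f} (sd {np.sqrt(var):.2f})")
```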
Kuang, Huan; Sahin, Fusun – Large-scale Assessments in Education, 2023
Background: Examinees may not make enough effort when responding to test items if the assessment has no consequence for them. These disengaged responses can be problematic in low-stakes, large-scale assessments because they can bias item parameter estimates. However, the amount of bias, and whether this bias is similar across administrations, is…
Descriptors: Test Items, Comparative Analysis, Mathematics Tests, Reaction Time
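The descriptors pair test items with reaction time, and a common way the literature operationalizes disengagement is response-time flagging: responses faster than an item-specific threshold are treated as rapid guesses and set aside before item calibration. A minimal sketch under that assumption (the 10%-of-median-time threshold is one common heuristic, not necessarily the paper's rule):

```python
import numpy as np

def flag_rapid_guesses(rt, threshold_frac=0.10):
    """Flag disengaged responses by response time.

    rt: (examinees x items) matrix of response times in seconds.
    A response is flagged when it is faster than threshold_frac of
    that item's median response time.
    """
    thresholds = threshold_frac * np.median(rt, axis=0)  # one per item
    return rt < thresholds  # boolean mask of flagged responses

rt = np.array([[45.0, 60.0, 2.0],
               [50.0, 3.0, 55.0],
               [48.0, 58.0, 52.0]])
print(flag_rapid_guesses(rt))  # the 2.0 s and 3.0 s responses are flagged
```

Whether excluding such responses changes item parameter estimates, and whether it does so consistently across administrations, is the bias question the study takes up.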
Ames, Allison J. – Educational and Psychological Measurement, 2022
Individual response style behaviors, unrelated to the latent trait of interest, may influence responses to ordinal survey items. Response style can introduce bias in the total score with respect to the trait of interest, threatening valid interpretation of scores. Despite claims of response style stability across scales, there has been little…
Descriptors: Response Style (Tests), Individual Differences, Scores, Test Items
Russell, Michael; Szendey, Olivia; Li, Zhushan – Educational Assessment, 2022
Recent research provides evidence that an intersectional approach to defining reference and focal groups results in a higher percentage of comparisons flagged for potential DIF. The study presented here examined the generalizability of this pattern across methods for examining DIF. While the level of DIF detection differed among the four methods…
Descriptors: Comparative Analysis, Item Analysis, Test Items, Test Construction
Bundsgaard, Jeppe – Large-scale Assessments in Education, 2019
International large-scale assessments like the International Computer and Information Literacy Study (ICILS; Fraillon et al., International Association for the Evaluation of Educational Achievement (IEA), 2015) provide important empirically based knowledge, through their proficiency scales, of what characterizes tasks at different difficulty levels,…
Descriptors: Test Bias, International Assessment, Test Items, Difficulty Level
Inal, Hatice; Anil, Duygu – Eurasian Journal of Educational Research, 2018
Purpose: This study aimed to examine the impact of differential item functioning in anchor items on group invariance in test equating for different sample sizes. Within this scope, the factors chosen for investigating group invariance in test equating were sample size, the subgroups' relative sample sizes, and the differential form of differential…
Descriptors: Equated Scores, Test Bias, Test Items, Sample Size
Uyar, Seyma – Eurasian Journal of Educational Research, 2020
Purpose: This study aimed to compare the performance of the latent class differential item functioning (DIF) approach with that of IRT-based DIF methods using manifest grouping, and thereby to draw attention to latent class DIF studies in Turkey. To that end, DIF was examined in the PISA 2015 science data set. Research…
Descriptors: Item Response Theory, Foreign Countries, Cross Cultural Studies, Item Analysis
Komboz, Basil; Strobl, Carolin; Zeileis, Achim – Educational and Psychological Measurement, 2018
Psychometric measurement models are only valid if measurement invariance holds between test takers of different groups. Global model tests, such as the well-established likelihood ratio (LR) test, are sensitive to violations of measurement invariance, such as differential item functioning and differential step functioning. However, these…
Descriptors: Item Response Theory, Models, Tests, Measurement
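The global LR test the abstract invokes compares a model that constrains item parameters to be equal across groups with one that estimates them separately; twice the log-likelihood difference is referred to a chi-square distribution. A schematic sketch (the log-likelihood values are placeholders standing in for the output of whatever IRT fitting routine is used):

```python
from scipy.stats import chi2

def lr_invariance_test(loglik_constrained, loglik_free, df_diff):
    """Global likelihood ratio test of measurement invariance.

    loglik_constrained: log-likelihood with item parameters equal across groups.
    loglik_free: log-likelihood with group-specific item parameters.
    df_diff: number of extra free parameters in the unconstrained model.
    """
    stat = 2.0 * (loglik_free - loglik_constrained)
    return stat, chi2.sf(stat, df=df_diff)

# Placeholder values for two fitted models:
stat, p = lr_invariance_test(-10450.3, -10421.8, df_diff=20)
print(f"LR = {stat:.1f}, p = {p:.4f}")
```

The limitation motivating the article is visible in the signature: the test says whether invariance fails somewhere, but not for which items or along which covariate the failure occurs.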
Trundt, Katherine M.; Keith, Timothy Z.; Caemmerer, Jacqueline M.; Smith, Leann V. – Journal of Psychoeducational Assessment, 2018
Individually administered intelligence measures are commonly used in diagnostic work, but there is a continuing need for research investigating possible test bias among these measures. One current intelligence measure, the Differential Ability Scales, Second Edition (DAS-II), is a test with growing popularity. The issue of test bias, however, has…
Descriptors: Test Bias, Intelligence Tests, Children, African American Children
Sachse, Karoline A.; Haag, Nicole – Applied Measurement in Education, 2017
Standard errors computed according to the operational practices of international large-scale assessment studies such as the Programme for International Student Assessment's (PISA) or the Trends in International Mathematics and Science Study (TIMSS) may be biased when cross-national differential item functioning (DIF) and item parameter drift are…
Descriptors: Error of Measurement, Test Bias, International Assessment, Computation
Ercikan, Kadriye; Guo, Hongwen; He, Qiwei – Educational Assessment, 2020
Comparing groups is one of the key uses of large-scale assessment results, which are used to gain insights that inform policy and practice and to examine the comparability of scores and score meaning. Such comparisons typically focus on examinees' final answers and responses to test questions, ignoring response process differences groups may engage…
Descriptors: Data Use, Responses, Comparative Analysis, Test Bias