Publication Date
In 2025: 1
Since 2024: 1
Since 2021 (last 5 years): 2
Since 2016 (last 10 years): 13
Since 2006 (last 20 years): 53
Descriptor
Test Bias: 56
Foreign Countries: 29
Test Items: 29
Item Response Theory: 23
Comparative Analysis: 16
Scores: 14
Regression (Statistics): 13
Mathematics Tests: 12
Statistical Analysis: 10
Gender Differences: 9
Psychometrics: 9
Source
International Journal of…: 56
Publication Type
Journal Articles: 56
Reports - Research: 41
Reports - Evaluative: 10
Reports - Descriptive: 4
Guides - General: 1
Information Analyses: 1
Tests/Questionnaires: 1
Education Level
Elementary Education: 10
Secondary Education: 8
Higher Education: 6
Grade 4: 5
Grade 8: 4
Intermediate Grades: 4
High Schools: 3
Grade 7: 2
Junior High Schools: 2
Middle Schools: 2
Adult Education: 1
Audience
Practitioners: 1
Researchers: 1
Location
United States: 7
Canada: 4
Hong Kong: 4
Qatar: 4
Australia: 3
Singapore: 3
Taiwan: 3
United Kingdom (England): 3
Iran: 2
Kuwait: 2
Turkey: 2
Farshad Effatpanah; Purya Baghaei; Hamdollah Ravand; Olga Kunina-Habenicht – International Journal of Testing, 2025
This study applied the Mixed Rasch Model (MRM) to the listening comprehension section of the International English Language Testing System (IELTS) to detect latent class differential item functioning (DIF) by exploring multiple profiles of second/foreign language listeners. Item responses of 462 examinees to an IELTS listening test were subjected…
Descriptors: Item Response Theory, Second Language Learning, Listening Comprehension, English (Second Language)
Liou, Gloria; Bonner, Cavan V.; Tay, Louis – International Journal of Testing, 2022
With the advent of big data and advances in technology, psychological assessments have become increasingly sophisticated and complex. Nevertheless, traditional psychometric issues concerning the validity, reliability, and measurement bias of such assessments remain fundamental in determining whether score inferences of human attributes are…
Descriptors: Psychometrics, Computer Assisted Testing, Adaptive Testing, Data
Huggins-Manley, Anne Corinne; Qiu, Yuxi; Penfield, Randall D. – International Journal of Testing, 2018
Score equity assessment (SEA) refers to an examination of population invariance of equating across two or more subpopulations of test examinees. Previous SEA studies have shown that score equity may be present for examinees scoring at particular test score ranges but absent for examinees scoring at other score ranges. No studies to date have…
Descriptors: Equated Scores, Test Bias, Test Items, Difficulty Level
Roberson, Nathan D.; Zumbo, Bruno D. – International Journal of Testing, 2019
This paper investigates measurement invariance as it relates to migration background using the Program for International Student Assessment measure of social belonging. We explore how the use of two measurement invariance techniques provides insights into differential item functioning, using the alignment method in conjunction with logistic…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
Tsaousis, Ioannis; Sideridis, Georgios; Al-Saawi, Fahad – International Journal of Testing, 2018
The aim of the present study was to examine Differential Distractor Functioning (DDF) as a means of improving the quality of a measure through understanding biased responses across groups. A DDF analysis could shed light on the potential sources of construct-irrelevant variance by examining whether the differential selection of incorrect choices…
Descriptors: Foreign Countries, College Entrance Examinations, Test Bias, Chemistry
Pishghadam, Reza; Baghaei, Purya; Seyednozadi, Zahra – International Journal of Testing, 2017
This article attempts to present emotioncy as a potential source of test bias to inform the analysis of test item performance. Emotioncy is defined as a hierarchy, ranging from "exvolvement" (auditory, visual, and kinesthetic) to "involvement" (inner and arch), to emphasize the emotions evoked by the senses. This study…
Descriptors: Test Bias, Item Response Theory, Test Items, Psychological Patterns
Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
Descriptors: Test Bias, Test Reliability, Performance, Scores
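The Lee and Zhang abstract above defines reliability as the ratio of true-score variance to observed-score variance. A minimal sketch of that definition under classical test theory (X = T + E); all scores and errors here are invented for illustration, not taken from the study:

```python
# Classical test theory sketch: reliability = Var(T) / Var(X),
# with observed score X = T + E. Numbers are purely illustrative.

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

true_scores = [10, 12, 14, 16, 18]   # hypothetical true scores T
errors      = [1, -1, -1, 1, 0]      # hypothetical errors E (mean 0, uncorrelated with T)
observed    = [t + e for t, e in zip(true_scores, errors)]

reliability = variance(true_scores) / variance(observed)
# Var(T) = 8.0, Var(X) = 8.8, so reliability ≈ 0.91
```

Because the errors here are constructed to be uncorrelated with the true scores, Var(X) decomposes exactly into Var(T) + Var(E), which is the decomposition the abstract's reliability ratio relies on.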
Tay, Louis; Vermunt, Jeroen K.; Wang, Chun – International Journal of Testing, 2013
We evaluate the item response theory with covariates (IRT-C) procedure for assessing differential item functioning (DIF) without preknowledge of anchor items (Tay, Newman, & Vermunt, 2011). This procedure begins with a fully constrained baseline model, and candidate items are tested for uniform and/or nonuniform DIF using the Wald statistic.…
Descriptors: Item Response Theory, Test Bias, Models, Statistical Analysis
Oliveri, María Elena; von Davier, Alina A. – International Journal of Testing, 2016
In this study, we propose that the unique needs and characteristics of linguistic minorities should be considered throughout the test development process. Unlike most measurement invariance investigations in the assessment of linguistic minorities, which typically are conducted after test administration, we propose strategies that focus on the…
Descriptors: Psychometrics, Linguistics, Test Construction, Testing
Rutkowski, Leslie; Rutkowski, David; Zhou, Yan – International Journal of Testing, 2016
Using an empirically-based simulation study, we show that typically used methods of choosing an item calibration sample have significant impacts on achievement bias and system rankings. We examine whether recent PISA accommodations, especially for lower performing participants, can mitigate some of this bias. Our findings indicate that standard…
Descriptors: Simulation, International Programs, Adolescents, Student Evaluation
Ercikan, Kadriye; Chen, Michelle Y.; Lyons-Thomas, Juliette; Goodrich, Shawna; Sandilands, Debra; Roth, Wolff-Michael; Simon, Marielle – International Journal of Testing, 2015
The purpose of this research is to examine the comparability of mathematics and science scores for students from English language backgrounds (ELB) and non-English language backgrounds (NELB). We examine the relationship between English reading proficiency and performance on mathematics and science assessments in Australia, Canada, the United…
Descriptors: Scores, Mathematics Tests, Science Tests, Native Speakers
Oshima, T. C.; Wright, Keith; White, Nick – International Journal of Testing, 2015
Raju, van der Linden, and Fleer (1995) introduced a framework for differential functioning of items and tests (DFIT) for unidimensional dichotomous models. Since then, DFIT has been shown to be quite a versatile framework, as it can handle polytomous as well as multidimensional models at both the item and test levels. However, DFIT is still limited…
Descriptors: Test Bias, Item Response Theory, Test Items, Simulation
Oliveri, María Elena; Ercikan, Kadriye; Zumbo, Bruno D.; Lawless, René – International Journal of Testing, 2014
In this study, we contrast results from two differential item functioning (DIF) approaches (manifest and latent class) by the number of items and sources of items identified as DIF using data from an international reading assessment. The latter approach yielded three latent classes, presenting evidence of heterogeneity in examinee response…
Descriptors: Test Bias, Comparative Analysis, Reading Tests, Effect Size
Socha, Alan; DeMars, Christine E.; Zilberberg, Anna; Phan, Ha – International Journal of Testing, 2015
The Mantel-Haenszel (MH) procedure is commonly used to detect items that function differentially for groups of examinees from various demographic and linguistic backgrounds--for example, in international assessments. As in some other DIF methods, the total score is used to match examinees on ability. In thin matching, each of the total score…
Descriptors: Test Items, Educational Testing, Evaluation Methods, Ability Grouping
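The Socha et al. abstract above describes the Mantel-Haenszel (MH) procedure, which matches examinees on total score and compares the odds of a correct response across groups within each score stratum. A minimal pure-Python sketch of the MH common odds ratio and its ETS delta transform; the strata counts below are invented for illustration, not drawn from any study listed here:

```python
import math

def mh_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio for one studied item.

    strata: list of (A, B, C, D) counts per total-score level, where
    A/B = reference group correct/incorrect and
    C/D = focal group correct/incorrect.
    """
    num = den = 0.0
    for A, B, C, D in strata:
        N = A + B + C + D
        if N == 0:
            continue  # skip empty score levels
        num += A * D / N
        den += B * C / N
    return num / den  # alpha_MH; 1.0 indicates no uniform DIF

def mh_delta(alpha):
    # ETS delta scale; |delta| beyond roughly 1.5 flags sizable DIF
    return -2.35 * math.log(alpha)

# Toy strata: the reference group is favored at every matched score level
strata = [(40, 10, 30, 20), (30, 20, 20, 30), (20, 30, 10, 40)]
alpha = mh_odds_ratio(strata)  # 2.5 for these counts
```

With these counts the reference group's odds of success are consistently higher than the focal group's at every score level, so alpha exceeds 1 and the delta value is negative, the pattern the MH procedure reads as uniform DIF against the focal group.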
Lee, HyeSun; Geisinger, Kurt F. – International Journal of Testing, 2014
Differential item functioning (DIF) analysis is important in terms of test fairness. While DIF analyses have mainly been conducted with manifest grouping variables, such as gender or race/ethnicity, it has been recently claimed that not only the grouping variables but also contextual variables pertaining to examinees should be considered in DIF…
Descriptors: Test Bias, Gender Differences, Regression (Statistics), Statistical Analysis