Weese, James D.; Turner, Ronna C.; Liang, Xinya; Ames, Allison; Crawford, Brandon – Educational and Psychological Measurement, 2023
A study was conducted to implement the use of a standardized effect size and corresponding classification guidelines for polytomous data with the POLYSIBTEST procedure and compare those guidelines with prior recommendations. Two simulation studies were included. The first identifies new unstandardized test heuristics for classifying moderate and…
Descriptors: Effect Size, Classification, Guidelines, Statistical Analysis
Paul J. Walter; Edward Nuhfer; Crisel Suarez – Numeracy, 2021
We introduce an approach for making a quantitative comparison of the item response curves (IRCs) of any two populations on a multiple-choice test instrument. In this study, we employ simulated and actual data. We apply our approach to a dataset of 12,187 participants on the 25-item Science Literacy Concept Inventory (SLCI), which includes ample…
Descriptors: Item Analysis, Multiple Choice Tests, Simulation, Data Analysis
Dimitrov, Dimiter M. – Measurement and Evaluation in Counseling and Development, 2017
This article offers an approach to examining differential item functioning (DIF) under its item response theory (IRT) treatment in the framework of confirmatory factor analysis (CFA). The approach is based on integrating IRT- and CFA-based testing of DIF and using bias-corrected bootstrap confidence intervals with a syntax code in Mplus.
Descriptors: Test Bias, Item Response Theory, Factor Analysis, Evaluation Methods
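The bias-corrected bootstrap confidence interval mentioned in this abstract can be sketched as follows (a minimal Python illustration of the generic BC bootstrap, not the authors' Mplus syntax; the statistic and data here are arbitrary examples):

```python
import numpy as np
from scipy.stats import norm

def bc_bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, seed=0):
    """Bias-corrected (BC) bootstrap confidence interval for a statistic."""
    rng = np.random.default_rng(seed)
    theta_hat = stat(data)
    boot = np.array([stat(rng.choice(data, size=len(data), replace=True))
                     for _ in range(n_boot)])
    # Bias-correction factor z0: measures median bias of the bootstrap distribution.
    z0 = norm.ppf(np.mean(boot < theta_hat))
    z = norm.ppf(1 - alpha / 2)
    # Adjusted percentile levels replace the usual alpha/2 and 1 - alpha/2.
    lo = norm.cdf(2 * z0 - z)
    hi = norm.cdf(2 * z0 + z)
    return np.quantile(boot, lo), np.quantile(boot, hi)

# Example: BC interval for the mean of a skewed (exponential) sample.
rng = np.random.default_rng(1)
sample = rng.exponential(scale=2.0, size=200)
lo, hi = bc_bootstrap_ci(sample, np.mean)
```

The bias correction matters precisely in skewed settings like DIF effect estimates, where the plain percentile interval can be systematically off-center.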
Magis, David; De Boeck, Paul – Educational and Psychological Measurement, 2014
It is known that sum score-based methods for the identification of differential item functioning (DIF), such as the Mantel-Haenszel (MH) approach, can be affected by Type I error inflation in the absence of any DIF effect. This may happen when the items differ in discrimination and when there is item impact. On the other hand, outlier DIF methods…
Descriptors: Test Bias, Statistical Analysis, Test Items, Simulation
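The sum score-based Mantel-Haenszel approach this abstract refers to can be sketched in a few lines (an illustrative implementation under a no-DIF simulation, not the authors' code; the data-generating choices are mine):

```python
import numpy as np

def mantel_haenszel_dif(correct, group, total_score):
    """Mantel-Haenszel common odds ratio for one studied item.

    correct:     0/1 responses to the studied item
    group:       0 = reference group, 1 = focal group
    total_score: matching criterion (e.g., rest score on the other items)
    """
    num, den = 0.0, 0.0
    for s in np.unique(total_score):
        m = total_score == s
        a = np.sum((group[m] == 0) & (correct[m] == 1))  # reference, right
        b = np.sum((group[m] == 0) & (correct[m] == 0))  # reference, wrong
        c = np.sum((group[m] == 1) & (correct[m] == 1))  # focal, right
        d = np.sum((group[m] == 1) & (correct[m] == 0))  # focal, wrong
        t = a + b + c + d
        if t > 0:
            num += a * d / t
            den += b * c / t
    alpha = num / den              # common odds ratio (1 = no DIF)
    delta = -2.35 * np.log(alpha)  # ETS delta scale
    return alpha, delta

# Null simulation: both groups share the same ability distribution, no DIF.
rng = np.random.default_rng(0)
n = 4000
group = rng.integers(0, 2, n)
ability = rng.normal(size=n)
items = (rng.random((n, 20)) < 1 / (1 + np.exp(-ability[:, None]))).astype(int)
studied = items[:, 0]
score = items[:, 1:].sum(axis=1)   # rest score as the matching criterion
alpha, delta = mantel_haenszel_dif(studied, group, score)
```

Under no DIF the odds ratio should sit near 1 (delta near 0); the Type I error inflation the abstract discusses arises when items differ in discrimination, which this equal-discrimination sketch deliberately avoids.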
Hidalgo, Ma Dolores; Benítez, Isabel; Padilla, Jose-Luis; Gómez-Benito, Juana – Sociological Methods & Research, 2017
The growing use of scales in survey questionnaires warrants addressing how polytomous differential item functioning (DIF) affects observed scale score comparisons. The aim of this study is to investigate the impact of DIF on the Type I error and effect size of the independent samples t-test on the observed total scale scores. A…
Descriptors: Test Items, Test Bias, Item Response Theory, Surveys
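The kind of Type I error study this abstract describes can be sketched as a small Monte Carlo (my own illustrative simulation, not the authors' design; dichotomous rather than polytomous items, and all parameter values are arbitrary):

```python
import numpy as np
from scipy.stats import ttest_ind

def t_test_rejection_rate(n_per_group=200, n_items=20, n_dif=4,
                          dif_shift=0.8, n_reps=500, seed=0):
    """Rejection rate of the two-sample t-test on total scores when the
    groups have *equal* latent means but some items carry DIF."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_reps):
        theta_ref = rng.normal(size=n_per_group)
        theta_foc = rng.normal(size=n_per_group)   # same latent distribution
        b = rng.normal(size=n_items)               # item difficulties
        b_foc = b.copy()
        b_foc[:n_dif] += dif_shift                 # DIF: harder for focal group
        p_ref = 1 / (1 + np.exp(-(theta_ref[:, None] - b)))
        p_foc = 1 / (1 + np.exp(-(theta_foc[:, None] - b_foc)))
        x_ref = (rng.random(p_ref.shape) < p_ref).sum(axis=1)
        x_foc = (rng.random(p_foc.shape) < p_foc).sum(axis=1)
        if ttest_ind(x_ref, x_foc).pvalue < 0.05:
            rejections += 1
    return rejections / n_reps

rate = t_test_rejection_rate()
```

Because the DIF items depress the focal group's observed scores despite equal latent means, the empirical rejection rate climbs well above the nominal 5% level, which is the inflation the study quantifies.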
Tay, Louis; Huang, Qiming; Vermunt, Jeroen K. – Educational and Psychological Measurement, 2016
In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…
Descriptors: Item Response Theory, Test Bias, Simulation, College Entrance Examinations
Jin, Ying; Eason, Hershel – Journal of Educational Issues, 2016
The effects of mean ability difference (MAD) and short tests on the performance of various DIF methods have been studied extensively in previous simulation studies. Their effects, however, have not been studied under multilevel data structure. MAD was frequently observed in large-scale cross-country comparison studies where the primary sampling…
Descriptors: Test Bias, Simulation, Hierarchical Linear Modeling, Comparative Analysis
Li, Zhushan – Journal of Educational Measurement, 2014
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
Descriptors: Test Bias, Sample Size, Statistical Analysis, Regression (Statistics)
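The asymptotic power and sample size calculations for the Wald test can be illustrated with the standard normal-approximation formulas (a generic sketch, not the paper's derived formulas; `sigma_beta` stands in for the per-observation standard deviation of the group coefficient, which in practice comes from the logistic model's information matrix):

```python
import numpy as np
from scipy.stats import norm

def wald_sample_size(beta, sigma_beta, alpha=0.05, power=0.80):
    """Sample size for a two-sided Wald test of H0: beta = 0,
    assuming se(beta_hat) = sigma_beta / sqrt(n)."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    n = ((z_a + z_b) * sigma_beta / abs(beta)) ** 2
    return int(np.ceil(n))

def wald_power(beta, sigma_beta, n, alpha=0.05):
    """Asymptotic power of the two-sided Wald test at sample size n
    (ignoring the negligible lower-tail term)."""
    z_a = norm.ppf(1 - alpha / 2)
    ncp = abs(beta) * np.sqrt(n) / sigma_beta
    return norm.cdf(ncp - z_a)

# Hypothetical DIF effect beta = 0.4 with sigma_beta = 2.0:
n_needed = wald_sample_size(beta=0.4, sigma_beta=2.0)
achieved = wald_power(beta=0.4, sigma_beta=2.0, n=n_needed)
```

Rounding n up means the achieved power lands at or just above the 0.80 target.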
Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2017
This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…
Descriptors: Psychometrics, Test Items, Item Response Theory, Hypothesis Testing
Zwick, Rebecca; Ye, Lei; Isham, Steven – ETS Research Report Series, 2013
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. Although it is often assumed that refinement of the matching criterion always provides more accurate DIF results, the actual situation proves to be more complex. To explore the effectiveness of refinement, we…
Descriptors: Test Bias, Statistical Analysis, Simulation, Educational Testing
Wang, Wei; Tay, Louis; Drasgow, Fritz – Applied Psychological Measurement, 2013
There has been growing use of ideal point models to develop scales measuring important psychological constructs. For meaningful comparisons across groups, it is important to identify items on such scales that exhibit differential item functioning (DIF). In this study, the authors examined several methods for assessing DIF on polytomous items…
Descriptors: Test Bias, Effect Size, Item Response Theory, Statistical Analysis
Kim, Jihye; Oshima, T. C. – Educational and Psychological Measurement, 2013
In a typical differential item functioning (DIF) analysis, a significance test is conducted for each item. As a test consists of multiple items, such multiple testing may increase the possibility of making a Type I error at least once. The goal of this study was to investigate how to control a Type I error rate and power using adjustment…
Descriptors: Test Bias, Test Items, Statistical Analysis, Error of Measurement
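Two of the standard adjustment procedures such a study compares, Bonferroni (family-wise error control) and Benjamini-Hochberg (false discovery rate control), can be sketched as follows (the p-values are a made-up example; the abstract does not specify which adjustments were studied):

```python
import numpy as np

def bonferroni(pvals, alpha=0.05):
    """Reject H0_i iff p_i <= alpha / m (controls the family-wise error rate)."""
    p = np.asarray(pvals)
    return p <= alpha / len(p)

def benjamini_hochberg(pvals, alpha=0.05):
    """Reject the hypotheses with the k smallest p-values, where k is the
    largest index with p_(k) <= (k/m) * alpha (controls the FDR)."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    thresh = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresh
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[:k + 1]] = True
    return reject

# Ten items' DIF p-values: a few small effects among mostly null items.
pvals = [0.001, 0.004, 0.012, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]
n_bonf = int(bonferroni(pvals).sum())
n_bh = int(benjamini_hochberg(pvals).sum())
```

On these p-values Bonferroni flags two items while Benjamini-Hochberg flags three, illustrating the familiar trade-off: stricter Type I error control buys lower power, which is exactly the balance the study investigates.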
Atalay Kabasakal, Kübra; Arsan, Nihan; Gök, Bilge; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2014
This simulation study compared the performances (Type I error and power) of Mantel-Haenszel (MH), SIBTEST, and item response theory-likelihood ratio (IRT-LR) methods under certain conditions. Manipulated factors were sample size, ability differences between groups, test length, the percentage of differential item functioning (DIF), and underlying…
Descriptors: Comparative Analysis, Item Response Theory, Statistical Analysis, Test Bias
Woods, Carol M.; Cai, Li; Wang, Mian – Educational and Psychological Measurement, 2013
Differential item functioning (DIF) occurs when the probability of responding in a particular category to an item differs for members of different groups who are matched on the construct being measured. The identification of DIF is important for valid measurement. This research evaluates an improved version of Lord's χ² Wald test for…
Descriptors: Test Bias, Item Response Theory, Computation, Comparative Analysis
Beretvas, S. Natasha; Walker, Cindy M. – Educational and Psychological Measurement, 2012
This study extends the multilevel measurement model to handle testlet-based dependencies. A flexible two-level testlet response model (the MMMT-2 model) for dichotomous items is introduced that permits assessment of differential testlet functioning (DTLF). A distinction is made between this study's conceptualization of DTLF and that of…
Descriptors: Test Bias, Simulation, Test Items, Item Response Theory