Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 7 |
Since 2006 (last 20 years) | 19 |
Descriptor
Comparative Analysis | 19 |
Computation | 19 |
Test Bias | 19 |
Test Items | 9 |
Item Response Theory | 8 |
Models | 8 |
Statistical Analysis | 8 |
Accuracy | 6 |
Sample Size | 5 |
Simulation | 5 |
Difficulty Level | 4 |
More ▼ |
Source
Author
Dorans, Neil J. | 2 |
Miao, Jing | 2 |
Moses, Tim | 2 |
Strobl, Carolin | 2 |
Zeileis, Achim | 2 |
Beland, Sebastien | 1 |
Bovaird, James A. | 1 |
Brooks, Gordon | 1 |
Cai, Li | 1 |
Cho, Sun-Joo | 1 |
Diaz, Emily | 1 |
More ▼ |
Publication Type
Journal Articles | 18 |
Reports - Research | 14 |
Reports - Evaluative | 4 |
Reports - Descriptive | 1 |
Education Level
Elementary Secondary Education | 1 |
Higher Education | 1 |
Secondary Education | 1 |
Audience
Location
Canada | 1 |
Laws, Policies, & Programs
Assessments and Surveys
Program for International… | 2 |
Progress in International… | 1 |
Trends in International… | 1 |
What Works Clearinghouse Rating
Diaz, Emily; Brooks, Gordon; Johanson, George – International Journal of Assessment Tools in Education, 2021
This Monte Carlo study assessed Type I error in differential item functioning analyses using Lord's chi-square (LC), Likelihood Ratio Test (LRT), and Mantel-Haenszel (MH) procedure. Two research interests were investigated: item response theory (IRT) model specification in LC and the LRT and continuity correction in the MH procedure. This study…
Descriptors: Test Bias, Item Response Theory, Statistical Analysis, Comparative Analysis
Komboz, Basil; Strobl, Carolin; Zeileis, Achim – Educational and Psychological Measurement, 2018
Psychometric measurement models are only valid if measurement invariance holds between test takers of different groups. Global model tests, such as the well-established likelihood ratio (LR) test, are sensitive to violations of measurement invariance, such as differential item functioning and differential step functioning. However, these…
Descriptors: Item Response Theory, Models, Tests, Measurement
Sachse, Karoline A.; Haag, Nicole – Applied Measurement in Education, 2017
Standard errors computed according to the operational practices of international large-scale assessment studies such as the Programme for International Student Assessment's (PISA) or the Trends in International Mathematics and Science Study (TIMSS) may be biased when cross-national differential item functioning (DIF) and item parameter drift are…
Descriptors: Error of Measurement, Test Bias, International Assessment, Computation
Cho, Sun-Joo; Suh, Youngsuk; Lee, Woo-yeol – Educational Measurement: Issues and Practice, 2016
The purpose of this ITEMS module is to provide an introduction to differential item functioning (DIF) analysis using mixture item response models. The mixture item response models for DIF analysis involve comparing item profiles across latent groups, instead of manifest groups. First, an overview of DIF analysis based on latent groups, called…
Descriptors: Test Bias, Research Methodology, Evaluation Methods, Models
Matlock, Ki Lynn; Turner, Ronna – Educational and Psychological Measurement, 2016
When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…
Descriptors: Item Response Theory, Computation, Test Items, Difficulty Level
Kim, Sooyeon; Robin, Frederic – ETS Research Report Series, 2017
In this study, we examined the potential impact of item misfit on the reported scores of an admission test from the subpopulation invariance perspective. The target population of the test consisted of 3 major subgroups with different geographic regions. We used the logistic regression function to estimate item parameters of the operational items…
Descriptors: Scores, Test Items, Test Bias, International Assessment
Koziol, Natalie A.; Bovaird, James A. – Educational and Psychological Measurement, 2018
Evaluations of measurement invariance provide essential construct validity evidence--a prerequisite for seeking meaning in psychological and educational research and ensuring fair testing procedures in high-stakes settings. However, the quality of such evidence is partly dependent on the validity of the resulting statistical conclusions. Type I or…
Descriptors: Computation, Tests, Error of Measurement, Comparative Analysis
Frick, Hannah; Strobl, Carolin; Zeileis, Achim – Educational and Psychological Measurement, 2015
Rasch mixture models can be a useful tool when checking the assumption of measurement invariance for a single Rasch model. They provide advantages compared to manifest differential item functioning (DIF) tests when the DIF groups are only weakly correlated with the manifest covariates available. Unlike in single Rasch models, estimation of Rasch…
Descriptors: Item Response Theory, Test Bias, Comparative Analysis, Scores
Pohl, Steffi – Journal of Educational Measurement, 2013
This article introduces longitudinal multistage testing (lMST), a special form of multistage testing (MST), as a method for adaptive testing in longitudinal large-scale studies. In lMST designs, test forms of different difficulty levels are used, whereas the values on a pretest determine the routing to these test forms. Since lMST allows for…
Descriptors: Adaptive Testing, Longitudinal Studies, Difficulty Level, Comparative Analysis
Woods, Carol M.; Cai, Li; Wang, Mian – Educational and Psychological Measurement, 2013
Differential item functioning (DIF) occurs when the probability of responding in a particular category to an item differs for members of different groups who are matched on the construct being measured. The identification of DIF is important for valid measurement. This research evaluates an improved version of Lord's X[superscript 2] Wald test for…
Descriptors: Test Bias, Item Response Theory, Computation, Comparative Analysis
Oliveri, Maria Elena; von Davier, Matthias – International Journal of Testing, 2014
In this article, we investigate the creation of comparable score scales across countries in international assessments. We examine potential improvements to current score scale calibration procedures used in international large-scale assessments. Our approach seeks to improve fairness in scoring international large-scale assessments, which often…
Descriptors: Test Bias, Scores, International Programs, Educational Assessment
Moses, Tim; Miao, Jing; Dorans, Neil J. – Journal of Educational and Behavioral Statistics, 2010
In this study, the accuracies of four strategies were compared for estimating conditional differential item functioning (DIF), including raw data, logistic regression, log-linear models, and kernel smoothing. Real data simulations were used to evaluate the estimation strategies across six items, DIF and No DIF situations, and four sample size…
Descriptors: Test Bias, Statistical Analysis, Computation, Comparative Analysis
Wang, Zhen; Yao, Lihua – ETS Research Report Series, 2013
The current study used simulated data to investigate the properties of a newly proposed method (Yao's rater model) for modeling rater severity and its distribution under different conditions. Our study examined the effects of rater severity, distributions of rater severity, the difference between item response theory (IRT) models with rater effect…
Descriptors: Test Format, Test Items, Responses, Computation
Jiao, Hong; Wang, Shudong; He, Wei – Journal of Educational Measurement, 2013
This study demonstrated the equivalence between the Rasch testlet model and the three-level one-parameter testlet model and explored the Markov Chain Monte Carlo (MCMC) method for model parameter estimation in WINBUGS. The estimation accuracy from the MCMC method was compared with those from the marginalized maximum likelihood estimation (MMLE)…
Descriptors: Computation, Item Response Theory, Models, Monte Carlo Methods
Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul – International Journal of Testing, 2011
We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…
Descriptors: Language Skills, Identification, Foreign Countries, Evaluation Methods
Previous Page | Next Page ยป
Pages: 1 | 2