Showing 1 to 15 of 27 results
Peer reviewed
Liu, Ivy; Suesse, Thomas; Harvey, Samuel; Gu, Peter Yongqi; Fernández, Daniel; Randal, John – Educational and Psychological Measurement, 2023
The Mantel-Haenszel estimator is one of the most popular techniques for measuring differential item functioning (DIF). A generalization of this estimator is applied to the context of DIF to compare items by taking the covariance of odds ratio estimators between dependent items into account. Unlike the Item Response Theory, the method does not rely…
Descriptors: Test Bias, Computation, Statistical Analysis, Achievement Tests
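As a rough illustration of the Mantel-Haenszel machinery discussed in the entry above, the sketch below computes the MH common odds ratio and the ETS delta-scale DIF index from 2 x 2 tables stratified by matched total score. It is a generic textbook computation, not the authors' generalized estimator, and all counts are invented.

import math

def mh_odds_ratio(tables):
    # tables: one (ref_right, ref_wrong, foc_right, foc_wrong) tuple per score level
    num = den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n          # reference-right * focal-wrong
        den += b * c / n          # reference-wrong * focal-right
    return num / den

tables = [(30, 10, 25, 15), (45, 5, 40, 10), (20, 20, 15, 25)]   # hypothetical counts
alpha_mh = mh_odds_ratio(tables)
mh_d_dif = -2.35 * math.log(alpha_mh)     # ETS delta-scale DIF statistic
print(alpha_mh, mh_d_dif)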
Peer reviewed
Diaz, Emily; Brooks, Gordon; Johanson, George – International Journal of Assessment Tools in Education, 2021
This Monte Carlo study assessed Type I error in differential item functioning analyses using Lord's chi-square (LC), the likelihood ratio test (LRT), and the Mantel-Haenszel (MH) procedure. Two research interests were investigated: item response theory (IRT) model specification in LC and the LRT, and the continuity correction in the MH procedure. This study…
Descriptors: Test Bias, Item Response Theory, Statistical Analysis, Comparative Analysis
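For readers who want to see the continuity correction examined in the entry above, here is a minimal sketch of the MH chi-square statistic with and without the correction, using the same hypothetical table layout as before; it is not the authors' simulation code.

from scipy.stats import chi2

def mh_chi_square(tables, correction=True):
    # tables: one (ref_right, ref_wrong, foc_right, foc_wrong) tuple per matched score level
    a_sum = e_sum = v_sum = 0.0
    for a, b, c, d in tables:
        n_r, n_f = a + b, c + d        # group sizes at this level
        m1, m0 = a + c, b + d          # numbers right and wrong at this level
        n = n_r + n_f
        if n < 2:
            continue
        a_sum += a
        e_sum += n_r * m1 / n                             # expected reference-right count
        v_sum += n_r * n_f * m1 * m0 / (n * n * (n - 1))  # hypergeometric variance
    diff = abs(a_sum - e_sum) - (0.5 if correction else 0.0)
    stat = diff * diff / v_sum
    return stat, chi2.sf(stat, df=1)

tables = [(30, 10, 25, 15), (45, 5, 40, 10), (20, 20, 15, 25)]   # hypothetical counts
print(mh_chi_square(tables, correction=True))
print(mh_chi_square(tables, correction=False))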
Peer reviewed
Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2019
We derive formulas for the differential item functioning (DIF) measures that two routinely used DIF statistics are designed to estimate. The DIF measures that match on observed scores are compared to DIF measures based on an unobserved ability (theta or true score) for items that are described by either the one-parameter logistic (1PL) or…
Descriptors: Scores, Test Bias, Statistical Analysis, Item Response Theory
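As a hedged illustration of the two kinds of DIF measure contrasted in the entry above (the notation here is generic, not the authors'): under the 1PL, matching on theta makes the log-odds difference between reference and focal groups a constant difference in item difficulty, whereas an observed-score-matched measure conditions on the total score X.

\[
P_{ig}(\theta) = \frac{1}{1 + \exp\{-(\theta - b_{ig})\}}, \qquad
\operatorname{logit} P_{iR}(\theta) - \operatorname{logit} P_{iF}(\theta) = b_{iF} - b_{iR},
\]
\[
D_i(x) = E\left[u_i \mid X = x,\, G = R\right] - E\left[u_i \mid X = x,\, G = F\right].
\]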
Peer reviewed
Lee, Hyung Rock; Lee, Sunbok; Sung, Jaeyun – International Journal of Assessment Tools in Education, 2019
Applying single-level statistical models to multilevel data typically produces underestimated standard errors, which may result in misleading conclusions. This study examined the impact of ignoring the multilevel data structure on the estimation of item parameters and their standard errors under the Rasch, two-, and three-parameter logistic models in…
Descriptors: Item Response Theory, Computation, Error of Measurement, Test Bias
Peer reviewed
Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2019
M-fluctuation tests are a recently proposed method for detecting differential item functioning in Rasch models. This article discusses a generalization of this method to two additional item response theory models: the two-parameter logistic model and the three-parameter logistic model with a common guessing parameter. The Type I error rate and…
Descriptors: Test Bias, Item Response Theory, Statistical Analysis, Maximum Likelihood Statistics
Peer reviewed
Komboz, Basil; Strobl, Carolin; Zeileis, Achim – Educational and Psychological Measurement, 2018
Psychometric measurement models are only valid if measurement invariance holds between test takers of different groups. Global model tests, such as the well-established likelihood ratio (LR) test, are sensitive to violations of measurement invariance, such as differential item functioning and differential step functioning. However, these…
Descriptors: Item Response Theory, Models, Tests, Measurement
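Since the entry above turns on the global likelihood ratio (LR) test, a minimal sketch of that test is given here: the log-likelihood of a model with group-invariant item parameters is compared with that of a model allowing group-specific parameters. The log-likelihood values and parameter counts are placeholders, not results from the article.

from scipy.stats import chi2

def lr_test(loglik_restricted, loglik_full, df_diff):
    # likelihood ratio statistic and p-value for nested models
    lr = 2.0 * (loglik_full - loglik_restricted)
    return lr, chi2.sf(lr, df=df_diff)

# e.g., invariant model (40 parameters) vs. group-specific model (60 parameters)
stat, p = lr_test(loglik_restricted=-5210.4, loglik_full=-5192.7, df_diff=20)
print(stat, p)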
Peer reviewed
Dimitrov, Dimiter M. – Measurement and Evaluation in Counseling and Development, 2017
This article offers an approach to examining differential item functioning (DIF) under its item response theory (IRT) treatment in the framework of confirmatory factor analysis (CFA). The approach is based on integrating IRT- and CFA-based testing of DIF and uses bias-corrected bootstrap confidence intervals, with syntax code provided in Mplus.
Descriptors: Test Bias, Item Response Theory, Factor Analysis, Evaluation Methods
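A bias-corrected (BC) bootstrap confidence interval of the kind named in the entry above can be sketched as follows; the "DIF effect" here is just an illustrative difference in item proportions correct between groups, not the Mplus-based estimand used by the author, and the data are simulated.

import numpy as np
from scipy.stats import norm

def bc_bootstrap_ci(ref, foc, n_boot=2000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    est = ref.mean() - foc.mean()
    boots = np.array([
        rng.choice(ref, ref.size, replace=True).mean()
        - rng.choice(foc, foc.size, replace=True).mean()
        for _ in range(n_boot)
    ])
    z0 = norm.ppf((boots < est).mean())               # bias-correction constant
    lo = norm.cdf(2 * z0 + norm.ppf(alpha / 2))       # adjusted lower percentile
    hi = norm.cdf(2 * z0 + norm.ppf(1 - alpha / 2))   # adjusted upper percentile
    return np.quantile(boots, [lo, hi])

ref = np.random.default_rng(1).binomial(1, 0.70, 300).astype(float)   # simulated responses
foc = np.random.default_rng(2).binomial(1, 0.60, 300).astype(float)
print(bc_bootstrap_ci(ref, foc))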
Peer reviewed
Kim, Sooyeon; Robin, Frederic – ETS Research Report Series, 2017
In this study, we examined the potential impact of item misfit on the reported scores of an admission test from the subpopulation invariance perspective. The target population of the test consisted of three major subgroups from different geographic regions. We used the logistic regression function to estimate item parameters of the operational items…
Descriptors: Scores, Test Items, Test Bias, International Assessment
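The entry above mentions estimating item parameters with a logistic regression function; one way to picture that (a sketch under simulated data, not the authors' operational procedure) is to regress item correctness on an ability proxy separately by subgroup and compare the fitted slopes and intercepts across regions.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
theta = rng.normal(size=1000)                       # ability proxy (e.g., scaled total score)
group = rng.integers(0, 2, size=1000)               # 0 = subgroup A, 1 = subgroup B
prob = 1 / (1 + np.exp(-(1.2 * theta - 0.3)))       # same true response curve in both groups
y = rng.binomial(1, prob)

for g in (0, 1):
    mask = group == g
    model = LogisticRegression(C=1e6).fit(theta[mask].reshape(-1, 1), y[mask])
    # slope plays the role of discrimination; intercept corresponds to -a*b in a 2PL
    print(g, model.coef_[0, 0], model.intercept_[0])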
Koziol, Natalie A.; Bovaird, James A. – Educational and Psychological Measurement, 2018
Evaluations of measurement invariance provide essential construct validity evidence--a prerequisite for seeking meaning in psychological and educational research and ensuring fair testing procedures in high-stakes settings. However, the quality of such evidence is partly dependent on the validity of the resulting statistical conclusions. Type I or…
Descriptors: Computation, Tests, Error of Measurement, Comparative Analysis
Peer reviewed
Cheng, Weiyi; Lei, Pui-Wa; DiPerna, James C. – Journal of Experimental Education, 2017
The purpose of the current study was to examine dimensionality and concurrent validity evidence of the EARLI numeracy measures (DiPerna, Morgan, & Lei, 2007), which were developed to assess key skills such as number identification, counting, and basic arithmetic. Two methods (NOHARM with approximate chi-square test and DIMTEST with DETECT…
Descriptors: Construct Validity, Numeracy, Mathematics Tests, Statistical Analysis
Peer reviewed
Lei, Pui-Wa; Li, Hongli – Applied Psychological Measurement, 2013
Minimum sample sizes of about 200 to 250 per group are often recommended for differential item functioning (DIF) analyses. However, there are times when sample sizes for one or both groups of interest are smaller than 200 due to practical constraints. This study attempts to examine the performance of the Simultaneous Item Bias Test (SIBTEST),…
Descriptors: Sample Size, Test Bias, Computation, Accuracy
Peer reviewed
Woods, Carol M.; Cai, Li; Wang, Mian – Educational and Psychological Measurement, 2013
Differential item functioning (DIF) occurs when the probability of responding in a particular category to an item differs for members of different groups who are matched on the construct being measured. The identification of DIF is important for valid measurement. This research evaluates an improved version of Lord's chi-square Wald test for…
Descriptors: Test Bias, Item Response Theory, Computation, Comparative Analysis
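A bare-bones numerical sketch of a Wald-type chi-square for DIF in the spirit of Lord's test evaluated above: the difference between group-specific item parameter estimates is weighted by the inverse of the summed covariance matrices. All numbers are invented, and this is not the improved version studied by the authors.

import numpy as np
from scipy.stats import chi2

def lord_wald(params_ref, params_foc, cov_ref, cov_foc):
    d = np.asarray(params_ref, float) - np.asarray(params_foc, float)
    w = d @ np.linalg.inv(np.asarray(cov_ref) + np.asarray(cov_foc)) @ d
    return w, chi2.sf(w, df=d.size)

# hypothetical (a, b) estimates for one 2PL item in the reference and focal groups
stat, p = lord_wald([1.10, 0.20], [0.95, 0.45],
                    [[0.010, 0.002], [0.002, 0.015]],
                    [[0.012, 0.001], [0.001, 0.020]])
print(stat, p)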
Peer reviewed
Zwick, Rebecca; Ye, Lei; Isham, Steven – Journal of Educational and Behavioral Statistics, 2012
This study demonstrates how the stability of Mantel-Haenszel (MH) DIF (differential item functioning) methods can be improved by integrating information across multiple test administrations using Bayesian updating (BU). The authors conducted a simulation that showed that this approach, which is based on earlier work by Zwick, Thayer, and Lewis,…
Descriptors: Test Bias, Computation, Statistical Analysis, Bayesian Statistics
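The Bayesian updating idea in the entry above can be pictured with a generic precision-weighted normal-normal update of an MH D-DIF estimate across administrations; this is only a conjugate-update sketch, not the specific prior used by Zwick, Thayer, and Lewis, and the administration results are invented.

def bayes_update(prior_mean, prior_var, est, est_var):
    # combine the current posterior with a new D-DIF estimate and its sampling variance
    precision = 1.0 / prior_var + 1.0 / est_var
    post_mean = (prior_mean / prior_var + est / est_var) / precision
    return post_mean, 1.0 / precision

mean, var = 0.0, 1.0                                  # diffuse starting prior on the D-DIF scale
for d_dif, se in [(-0.8, 0.35), (-0.6, 0.30), (-1.1, 0.40)]:   # hypothetical administrations
    mean, var = bayes_update(mean, var, d_dif, se ** 2)
    print(mean, var)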
Peer reviewed
De Boeck, Paul; Cho, Sun-Joo; Wilson, Mark – Applied Psychological Measurement, 2011
The models used in this article are secondary dimension mixture models with the potential to explain differential item functioning (DIF) between latent classes, called latent DIF. The focus is on models with a secondary dimension that is at the same time specific to the DIF latent class and linked to an item property. A description of the models…
Descriptors: Test Bias, Models, Statistical Analysis, Computation
Peer reviewed
Sinharay, Sandip; Dorans, Neil J. – Journal of Educational and Behavioral Statistics, 2010
The Mantel-Haenszel (MH) procedure (Mantel and Haenszel) is a popular method for estimating and testing a common two-factor association parameter in a 2 x 2 x K table. Holland, and Holland and Thayer, described how to use the procedure to detect differential item functioning (DIF) for tests with dichotomously scored items. Wang, Bradlow, Wainer, and…
Descriptors: Test Bias, Statistical Analysis, Computation, Bayesian Statistics