Publication Date
  In 2025: 2
  Since 2024: 18
  Since 2021 (last 5 years): 56
  Since 2016 (last 10 years): 156
  Since 2006 (last 20 years): 385
Descriptor
  Item Response Theory: 428
  Test Bias: 428
  Test Items: 198
  Foreign Countries: 113
  Simulation: 78
  Statistical Analysis: 77
  Models: 71
  Scores: 70
  Test Validity: 65
  Comparative Analysis: 64
  Psychometrics: 60
Author
  Wang, Wen-Chung: 9
  Oshima, T. C.: 7
  Woods, Carol M.: 7
  Dorans, Neil J.: 6
  Penfield, Randall D.: 6
  Finch, W. Holmes: 5
  Zumbo, Bruno D.: 5
  Cohen, Allan S.: 4
  DeMars, Christine E.: 4
  Engelhard, George, Jr.: 4
  Ercikan, Kadriye: 4
Education Level
  Secondary Education: 65
  Elementary Education: 53
  Higher Education: 53
  Postsecondary Education: 42
  Middle Schools: 35
  Junior High Schools: 32
  Grade 8: 22
  Grade 4: 21
  High Schools: 21
  Intermediate Grades: 21
  Grade 5: 18
Audience
  Researchers: 4
  Practitioners: 2
  Teachers: 1
Location
  Turkey: 19
  Germany: 9
  Taiwan: 9
  Florida: 7
  Iran: 7
  Netherlands: 7
  Singapore: 6
  United States: 6
  Australia: 5
  Belgium: 5
  California: 5
Laws, Policies, & Programs
  No Child Left Behind Act 2001: 1
Zeyuan Jing – ProQuest LLC, 2023
This dissertation presents a comprehensive review of the evolution of DIF analysis within educational measurement from the 1980s to the present. The review elucidates the concept of DIF, particularly emphasizing the crucial role of grouping in exhibiting DIF. Then, the dissertation introduces an innovative modification to the newly developed…
Descriptors: Item Response Theory, Algorithms, Measurement, Test Bias
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2024
Rapid guessing (RG) is a form of non-effortful responding that is characterized by short response latencies. This construct-irrelevant behavior has been shown in previous research to bias inferences concerning measurement properties and scores. To mitigate these deleterious effects, a number of response time threshold scoring procedures have been…
Descriptors: Reaction Time, Scores, Item Response Theory, Guessing (Tests)
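The response-time threshold idea lends itself to a short illustration. The Python sketch below is not the authors' procedure, only a minimal rendering of the general approach, with a hypothetical fixed cutoff; published threshold procedures are typically item-specific.

import numpy as np

def rescore_rapid_guesses(responses, response_times, threshold=3.0):
    # Treat any response faster than `threshold` seconds as a rapid guess
    # and rescore it as missing. The 3-second cutoff is purely illustrative;
    # operational procedures usually derive a separate threshold per item.
    rescored = responses.astype(float).copy()
    rescored[response_times < threshold] = np.nan
    return rescored

# Toy usage: 2 examinees x 3 items
responses = np.array([[1, 0, 1], [1, 1, 0]])
response_times = np.array([[12.4, 1.8, 9.0], [7.5, 10.2, 2.1]])
print(rescore_rapid_guesses(responses, response_times))

Rescoring flagged responses as missing is only one of the scoring options compared in this literature; the point of the sketch is the latency filter itself.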
William C. M. Belzak; Daniel J. Bauer – Journal of Educational and Behavioral Statistics, 2024
Testing for differential item functioning (DIF) has undergone rapid statistical developments recently. Moderated nonlinear factor analysis (MNLFA) allows for simultaneous testing of DIF among multiple categorical and continuous covariates (e.g., sex, age, ethnicity, etc.), and regularization has shown promising results for identifying DIF among…
Descriptors: Test Bias, Algorithms, Factor Analysis, Error of Measurement
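As a hedged sketch of the MNLFA idea (notation mine, not necessarily the authors'), item intercepts and loadings are expressed as functions of observed covariates:

\[
\nu_j(\mathbf{x}_i) = \nu_{j0} + \boldsymbol{\nu}_{j1}^{\top}\mathbf{x}_i,
\qquad
\lambda_j(\mathbf{x}_i) = \lambda_{j0} + \boldsymbol{\lambda}_{j1}^{\top}\mathbf{x}_i,
\]

where a nonzero \(\boldsymbol{\nu}_{j1}\) or \(\boldsymbol{\lambda}_{j1}\) indicates uniform or nonuniform DIF in item \(j\) with respect to the covariates \(\mathbf{x}_i\). Regularization places a lasso-type penalty on these DIF parameters, shrinking most of them to zero so that only items with surviving nonzero estimates are flagged.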
Sooyong Lee; Suhwa Han; Seung W. Choi – Journal of Educational Measurement, 2024
Research has shown that multiple-indicator multiple-cause (MIMIC) models can result in inflated Type I error rates in detecting differential item functioning (DIF) when the assumption of equal latent variance is violated. This study explains how the violation of the equal variance assumption adversely impacts the detection of nonuniform DIF and…
Descriptors: Factor Analysis, Bayesian Statistics, Test Bias, Item Response Theory
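A minimal sketch of the MIMIC setup helps locate the assumption at issue (notation mine):

\[
y_{ij}^{*} = \lambda_j \eta_i + \beta_j z_i + \varepsilon_{ij},
\qquad
\eta_i = \gamma z_i + \zeta_i,
\]

where \(z_i\) codes group membership, \(\gamma\) absorbs the latent mean difference (impact), and a nonzero \(\beta_j\) signals uniform DIF in item \(j\). The conventional model holds \(\operatorname{Var}(\zeta_i)\) equal across groups; when the groups' latent variances actually differ, that constraint is misspecified and, as the abstract notes, spurious DIF flags become more likely.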
Farshad Effatpanah; Purya Baghaei; Hamdollah Ravand; Olga Kunina-Habenicht – International Journal of Testing, 2025
This study applied the Mixed Rasch Model (MRM) to the listening comprehension section of the International English Language Testing System (IELTS) to detect latent class differential item functioning (DIF) by exploring multiple profiles of second/foreign language listeners. Item responses of 462 examinees to an IELTS listening test were subjected…
Descriptors: Item Response Theory, Second Language Learning, Listening Comprehension, English (Second Language)
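For orientation, a hedged sketch of the mixed Rasch idea (the standard formulation, not necessarily the exact specification used in this study): within latent class \(g\), responses follow a Rasch model with class-specific item difficulties,

\[
P(U_{ij} = 1 \mid \theta_i, g) = \frac{\exp(\theta_i - b_{jg})}{1 + \exp(\theta_i - b_{jg})},
\]

so an item exhibits latent-class DIF when its difficulty \(b_{jg}\) varies across the latent classes rather than across manifest groups such as sex or language background.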
Hwanggyu Lim; Danqi Zhu; Edison M. Choe; Kyung T. Han – Journal of Educational Measurement, 2024
This study presents a generalized version of the residual differential item functioning (RDIF) detection framework in item response theory, named GRDIF, to analyze differential item functioning (DIF) in multiple groups. The GRDIF framework retains the advantages of the original RDIF framework, such as computational efficiency and ease of…
Descriptors: Item Response Theory, Test Bias, Test Reliability, Test Construction
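As a hedged gloss on the residual idea behind RDIF (notation mine; the paper's exact statistics may differ), define for examinee \(i\) and item \(j\) the raw residual

\[
d_{ij} = u_{ij} - P_j(\hat\theta_i),
\]

the gap between the observed response and the model-implied probability at the estimated ability. Group aggregates of raw residuals are sensitive to uniform DIF, while aggregates involving squared residuals pick up nonuniform DIF; the GRDIF extension compares such aggregates across more than two groups at once.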
James D. Weese; Ronna C. Turner; Allison Ames; Xinya Liang; Brandon Crawford – Journal of Experimental Education, 2024
In this study, a standardized effect size was created for use with the SIBTEST procedure. Using this standardized effect size, a single set of heuristics was developed that is appropriate for data fitting different item response models (e.g., 2-parameter logistic, 3-parameter logistic). The standardized effect size rescales the raw beta-uni value…
Descriptors: Test Bias, Test Items, Item Response Theory, Effect Size
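For orientation, the raw SIBTEST quantity being standardized is the weighted mean item-score difference between reference and focal examinees matched on the regression-corrected subtest score; in common notation,

\[
\hat\beta_{\mathrm{uni}} = \sum_{k} \hat p_k \left( \bar Y^{*}_{Rk} - \bar Y^{*}_{Fk} \right),
\]

where \(k\) indexes matching-score strata, \(\hat p_k\) is the proportion of focal-group examinees in stratum \(k\), and \(\bar Y^{*}\) are the corrected item means. Dividing this by an appropriate scale estimate yields an effect size comparable across items and models; the exact scaling adopted in the study is not reproduced here.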
Finch, W. Holmes – Educational and Psychological Measurement, 2023
Psychometricians have devoted much research and attention to categorical item responses, leading to the development and widespread use of item response theory for the estimation of model parameters and identification of items that do not perform in the same way for examinees from different population subgroups (e.g., differential item functioning…
Descriptors: Test Bias, Item Response Theory, Computation, Methods
Chalmers, R. Philip – Journal of Educational Measurement, 2023
Several marginal effect size (ES) statistics suitable for quantifying the magnitude of differential item functioning (DIF) have been proposed in the area of item response theory; for instance, the Differential Functioning of Items and Tests (DFIT) statistics, signed and unsigned item difference in the sample statistics (SIDS, UIDS, NSIDS, and…
Descriptors: Test Bias, Item Response Theory, Definitions, Monte Carlo Methods
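Two of the named quantities have simple sample forms worth recording (hedged; notation mine): the signed and unsigned item differences in the sample average the gap between group-specific expected item scores over the \(N\) observed ability estimates,

\[
\mathrm{SIDS} = \frac{1}{N} \sum_{i=1}^{N} \left[ S_R(\hat\theta_i) - S_F(\hat\theta_i) \right],
\qquad
\mathrm{UIDS} = \frac{1}{N} \sum_{i=1}^{N} \left| S_R(\hat\theta_i) - S_F(\hat\theta_i) \right|,
\]

where \(S_R\) and \(S_F\) are expected item scores under the reference- and focal-group parameter estimates. SIDS permits positive and negative differences to cancel; UIDS does not, which is why the two are reported together.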
Martijn Schoenmakers; Jesper Tijmstra; Jeroen Vermunt; Maria Bolsinova – Educational and Psychological Measurement, 2024
Extreme response style (ERS), the tendency of participants to select extreme item categories regardless of the item content, has frequently been found to decrease the validity of Likert-type questionnaire results. For this reason, various item response theory (IRT) models have been proposed to model ERS and correct for it. Comparisons of these…
Descriptors: Item Response Theory, Response Style (Tests), Models, Likert Scales
Hung-Yu Huang – Educational and Psychological Measurement, 2025
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…
Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability
Weese, James D.; Turner, Ronna C.; Liang, Xinya; Ames, Allison; Crawford, Brandon – Educational and Psychological Measurement, 2023
A study was conducted to implement the use of a standardized effect size and corresponding classification guidelines for polytomous data with the POLYSIBTEST procedure and compare those guidelines with prior recommendations. Two simulation studies were included. The first identifies new unstandardized test heuristics for classifying moderate and…
Descriptors: Effect Size, Classification, Guidelines, Statistical Analysis
Tien-Ling Hu; Dubravka Svetina Valdivia – Research in Higher Education, 2024
Undergraduate research, recognized as one of the High-Impact Practices (HIPs), has demonstrated a positive association with diverse student learning outcomes. Understanding the pivotal quality factors essential for its efficacy is important for enhancing student success. This study evaluates the psychometric properties of survey items employed to…
Descriptors: Undergraduate Students, Student Research, Student Experience, Psychometrics
Chalmers, R. Philip; Zheng, Guoguo – Applied Measurement in Education, 2023
This article presents generalizations of SIBTEST and crossing-SIBTEST statistics for differential item functioning (DIF) investigations involving more than two groups. After reviewing the original two-group setup for these statistics, a set of multigroup generalizations that support contrast matrices for joint tests of DIF are presented. To…
Descriptors: Test Bias, Test Items, Item Response Theory, Error of Measurement
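The contrast-matrix device generalizes naturally. As a generic sketch (not necessarily the exact statistic in the article): stack the group-specific DIF effects into a vector \(\boldsymbol\beta\) and test \(H_0\colon \mathbf{C}\boldsymbol\beta = \mathbf{0}\) with a Wald-type statistic,

\[
W = \hat{\boldsymbol\beta}^{\top} \mathbf{C}^{\top}
\left( \mathbf{C}\, \hat{\boldsymbol\Sigma}\, \mathbf{C}^{\top} \right)^{-1}
\mathbf{C}\, \hat{\boldsymbol\beta} \;\sim\; \chi^2_{\operatorname{rank}(\mathbf{C})},
\]

where each row of \(\mathbf{C}\) encodes one group comparison (e.g., reference versus each focal group), so a single joint test replaces a family of pairwise SIBTEST analyses.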
Wang, Weimeng; Liu, Yang; Liu, Hongyun – Journal of Educational and Behavioral Statistics, 2022
Differential item functioning (DIF) occurs when the probability of endorsing an item differs across groups for individuals with the same latent trait level. The presence of DIF items may jeopardize the validity of an instrument; therefore, it is crucial to identify DIF items in routine operations of educational assessment. While DIF detection…
Descriptors: Test Bias, Test Items, Equated Scores, Regression (Statistics)
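The definition in this abstract is the standard formal one; for a dichotomous item \(j\) it can be written as (notation mine)

\[
P\!\left(U_j = 1 \mid \theta,\, G = R\right) \neq P\!\left(U_j = 1 \mid \theta,\, G = F\right)
\quad \text{for some } \theta,
\]

that is, examinees with identical latent trait levels but different group memberships do not share the same response probability.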