Publication Date
In 2025: 0
Since 2024: 2
Since 2021 (last 5 years): 2
Since 2016 (last 10 years): 4
Since 2006 (last 20 years): 12
Descriptor
Evaluation Methods: 17
Test Bias: 17
Test Items: 8
Item Response Theory: 7
Scores: 6
Models: 5
Error of Measurement: 4
Simulation: 4
Achievement Tests: 3
Comparative Analysis: 3
Evaluation Research: 3
Source
Journal of Educational Measurement: 17
Author
de la Torre, Jimmy: 2
Albano, Anthony D.: 1
Allen, Nancy L.: 1
Ankenmann, Robert D.: 1
Anthony W. Raborn: 1
Artur Pokropek: 1
Bolger, Niall: 1
Camilli, Gregory: 1
Carmen Köhler: 1
Chen, Shu-Ying: 1
Chiu, Ting-Wei: 1
Publication Type
Journal Articles: 17
Reports - Research: 10
Reports - Evaluative: 4
Reports - Descriptive: 2
Reports - General: 1
Education Level
Secondary Education: 2
Elementary Secondary Education: 1
Grade 4: 1
Higher Education: 1
Postsecondary Education: 1
Audience
Researchers: 1
Location
Ireland: 1
Assessments and Surveys
Program for International Student Assessment: 2
Graduate Record Examinations: 1
National Assessment of Educational Progress: 1
Trends in International Mathematics and Science Study: 1
Corinne Huggins-Manley; Anthony W. Raborn; Peggy K. Jones; Ted Myers – Journal of Educational Measurement, 2024
The purpose of this study is to develop a nonparametric DIF method that (a) compares focal groups directly to the composite group that will be used to develop the reported test score scale, and (b) allows practitioners to explore for DIF related to focal groups stemming from multicategorical variables that constitute a small proportion of the…
Descriptors: Nonparametric Statistics, Test Bias, Scores, Statistical Significance
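The abstract does not spell out the estimator, so the following is only a rough sketch of the general idea it describes: comparing a small focal group directly with the composite group, conditional on an observed matching score. All data and variable names below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented data: one studied item, a total score used as the matching
# (conditioning) variable, and a small focal group inside the composite group.
n = 2000
total_score = rng.integers(0, 11, size=n)     # matching variable, 0-10
focal = rng.random(n) < 0.10                  # focal group is ~10% of composite

# Simulate a mild disadvantage for the focal group on this item.
p_correct = 1 / (1 + np.exp(-(total_score - 5) / 2)) - 0.10 * focal
item = (rng.random(n) < p_correct).astype(int)

# Nonparametric comparison: proportion correct conditional on the matching
# score, focal group versus the composite group (everyone, focal included).
diffs = []
for s in np.unique(total_score):
    composite = total_score == s              # all examinees at this score
    foc = composite & focal                   # focal examinees at this score
    if foc.sum() >= 5:                        # require a few focal cases
        diffs.append(item[foc].mean() - item[composite].mean())

print("mean conditional focal-minus-composite difference:", np.mean(diffs))
```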
Carmen Köhler; Lale Khorramdel; Artur Pokropek; Johannes Hartig – Journal of Educational Measurement, 2024
For assessment scales applied to different groups (e.g., students from different states; patients in different countries), multigroup differential item functioning (MG-DIF) needs to be evaluated in order to ensure that respondents with the same trait level but from different groups have equal response probabilities on a particular item. The…
Descriptors: Measures (Individuals), Test Bias, Models, Item Response Theory
Assessment of Differential Item Functioning under Cognitive Diagnosis Models: The DINA Model Example
Li, Xiaomin; Wang, Wen-Chung – Journal of Educational Measurement, 2015
The assessment of differential item functioning (DIF) is routinely conducted to ensure test fairness and validity. Although many DIF assessment methods have been developed in the context of classical test theory and item response theory, they are not applicable for cognitive diagnosis models (CDMs), as the underlying latent attributes of CDMs are…
Descriptors: Test Bias, Models, Cognitive Measurement, Evaluation Methods
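For readers unfamiliar with the DINA model named in the title, its item response function is P(X = 1) = (1 - s)^eta * g^(1 - eta), where eta indicates whether the examinee possesses every attribute the item requires, g is the guessing parameter, and s the slip parameter. A minimal sketch with invented parameter values follows; DIF in this setting amounts to group-specific g or s for the same attribute profile.

```python
import numpy as np

def dina_prob(alpha, q, guess, slip):
    """P(correct) under the DINA model for one item.

    alpha : 0/1 attribute profile of the examinee
    q     : 0/1 q-vector of attributes the item requires
    guess : probability of a correct response without all required attributes
    slip  : probability of an incorrect response despite having them
    """
    alpha = np.asarray(alpha)
    q = np.asarray(q)
    eta = int(np.all(alpha[q == 1] == 1))   # has every required attribute?
    return (1 - slip) ** eta * guess ** (1 - eta)

# Hypothetical item requiring attributes 1 and 3.
q = [1, 0, 1]

# DIF in the DINA setting: the same attribute profile yields different
# response probabilities because guess/slip differ across groups.
profile = [1, 0, 1]
p_ref   = dina_prob(profile, q, guess=0.20, slip=0.10)   # reference group
p_focal = dina_prob(profile, q, guess=0.20, slip=0.25)   # focal group
print(p_ref, p_focal)   # 0.90 vs 0.75 for this mastery profile
```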
Dwyer, Andrew C. – Journal of Educational Measurement, 2016
This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common-item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common-item equating methodology to standard setting ratings to account for…
Descriptors: Cutting Scores, Equivalency Tests, Test Format, Academic Standards
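The "rescaling the standard" approach builds on common-item equating. One standard linear method for placing an old scale onto a new one is the mean-sigma transformation, sketched below with invented anchor-item difficulties; this is a generic illustration, not the specific procedure evaluated in the article.

```python
import numpy as np

# Common-item (anchor) difficulties estimated on the old and new forms.
# Values are invented for illustration.
b_old = np.array([-1.2, -0.4, 0.1, 0.8, 1.5])
b_new = np.array([-1.0, -0.1, 0.4, 1.1, 1.9])

# Mean-sigma linear transformation taking the old scale onto the new scale.
A = b_new.std(ddof=1) / b_old.std(ddof=1)
B = b_new.mean() - A * b_old.mean()

# Rescaling the standard: carry a cut score set on the old form onto the
# new form with the same transformation, instead of re-setting it.
cut_old = 0.25
cut_new = A * cut_old + B
print(f"A = {A:.3f}, B = {B:.3f}, rescaled cut = {cut_new:.3f}")
```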
Hou, Likun; de la Torre, Jimmy; Nandakumar, Ratna – Journal of Educational Measurement, 2014
Analyzing examinees' responses using cognitive diagnostic models (CDMs) has the advantage of providing diagnostic information. To ensure the validity of the results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this article, the Wald test is proposed to examine DIF in the context of CDMs. This study…
Descriptors: Test Bias, Models, Simulation, Error Patterns
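The Wald statistic referred to here is the standard one: given group-specific item parameter estimates and their covariance matrices, W = (b_R - b_F)' (S_R + S_F)^(-1) (b_R - b_F) is compared to a chi-square distribution. A sketch with invented numbers (imagined DINA guess/slip estimates), not the authors' full procedure:

```python
import numpy as np
from scipy.stats import chi2

# Invented item parameter estimates (guess, slip) for two groups,
# with their estimated covariance matrices.
beta_ref   = np.array([0.18, 0.12])
beta_focal = np.array([0.27, 0.20])
cov_ref    = np.array([[0.0012, 0.0001], [0.0001, 0.0010]])
cov_focal  = np.array([[0.0015, 0.0002], [0.0002, 0.0013]])

# Wald statistic for the hypothesis that the two groups share the same
# item parameters (i.e., no DIF on this item).
diff = beta_ref - beta_focal
W = diff @ np.linalg.inv(cov_ref + cov_focal) @ diff
p_value = chi2.sf(W, df=len(diff))
print(f"W = {W:.2f}, p = {p_value:.4f}")
```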
Sachse, Karoline A.; Roppelt, Alexander; Haag, Nicole – Journal of Educational Measurement, 2016
Trend estimation in international comparative large-scale assessments relies on measurement invariance between countries. However, cross-national differential item functioning (DIF) has been repeatedly documented. We ran a simulation study using national item parameters, which required trends to be computed separately for each country, to compare…
Descriptors: Comparative Analysis, Measurement, Test Bias, Simulation
Albano, Anthony D. – Journal of Educational Measurement, 2013
In many testing programs it is assumed that the context or position in which an item is administered does not have a differential effect on examinee responses to the item. Violations of this assumption may bias item response theory estimates of item and person parameters. This study examines the potentially biasing effects of item position. A…
Descriptors: Test Items, Item Response Theory, Test Format, Questioning Techniques
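The article's design is not reproduced here; the sketch below only illustrates what a position effect can look like under a Rasch-type model in which an item's effective difficulty drifts linearly with its administration position (the drift parameter gamma is invented).

```python
import numpy as np

rng = np.random.default_rng(1)

def rasch_p(theta, b):
    """Rasch probability of a correct response."""
    return 1 / (1 + np.exp(-(theta - b)))

# One item administered in different booklet positions. If position has a
# (hypothetical) linear effect gamma on difficulty, the same item becomes
# harder when it appears later in the test.
b0 = 0.0          # baseline difficulty when the item appears first
gamma = 0.03      # invented per-position increase in difficulty
theta = rng.normal(0, 1, size=5000)

for position in (1, 20, 40):
    b_eff = b0 + gamma * (position - 1)
    responses = rng.random(theta.size) < rasch_p(theta, b_eff)
    print(f"position {position:2d}: proportion correct = {responses.mean():.3f}")
```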
Penfield, Randall D. – Journal of Educational Measurement, 2010
In this article, I address two competing conceptions of differential item functioning (DIF) in polytomously scored items. The first conception, referred to as net DIF, concerns between-group differences in the conditional expected value of the polytomous response variable. The second conception, referred to as global DIF, concerns the conditional…
Descriptors: Test Bias, Test Items, Evaluation Methods, Item Response Theory
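A rough numerical illustration of the distinction, using a partial credit parameterization with invented step parameters: the two groups have visibly different conditional category probabilities (global DIF) while their conditional expected item scores coincide, so a net DIF measure sees nothing.

```python
import numpy as np

def pcm_probs(theta, deltas):
    """Category probabilities for a partial credit model item.

    deltas: step difficulties delta_1..delta_m; categories are 0..m.
    """
    cum = np.concatenate(([0.0], np.cumsum(theta - np.asarray(deltas))))
    expz = np.exp(cum - cum.max())          # stabilised softmax
    return expz / expz.sum()

theta = 0.0
# Invented step parameters: the focal group's steps are shifted in opposite
# directions, so category probabilities differ (global DIF) ...
steps_ref   = [-0.5, 0.5]
steps_focal = [-0.9, 0.9]

p_ref, p_focal = pcm_probs(theta, steps_ref), pcm_probs(theta, steps_focal)
categories = np.arange(3)

# ... while the conditional expected item scores are essentially identical,
# i.e., little or no net DIF despite visible global DIF.
print("category probs (ref):  ", np.round(p_ref, 3))
print("category probs (focal):", np.round(p_focal, 3))
print("expected score ref/focal:",
      round(categories @ p_ref, 3), round(categories @ p_focal, 3))
```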
Camilli, Gregory; Prowker, Adam; Dossey, John A.; Lindquist, Mary M.; Chiu, Ting-Wei; Vargas, Sadako; de la Torre, Jimmy – Journal of Educational Measurement, 2008
A new method for analyzing differential item functioning is proposed to investigate the relative strengths and weaknesses of multiple groups of examinees. Accordingly, the notion of a conditional measure of difference between two groups (Reference and Focal) is generalized to a conditional variance. The objective of this article is to present and…
Descriptors: Test Bias, National Competency Tests, Grade 4, Difficulty Level
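The generalization described, from a conditional difference between two groups to a conditional variance across many groups, can be illustrated crudely as follows: at each level of the matching score, take the variance of the group-specific proportions correct. The data and estimator below are simplified stand-ins, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(2)

n_groups, n_per_group = 4, 3000
total_score = rng.integers(0, 11, size=(n_groups, n_per_group))  # matching var
group_shift = np.array([0.00, 0.02, -0.03, 0.05])[:, None]        # invented DIF

p = 1 / (1 + np.exp(-(total_score - 5) / 2)) + group_shift
item = (rng.random(total_score.shape) < p).astype(int)

# For each matching-score level, the variance of the group-specific
# conditional proportions correct; a two-group conditional difference is
# the special case n_groups == 2.
cond_var = []
for s in range(11):
    props = [item[g][total_score[g] == s].mean() for g in range(n_groups)]
    cond_var.append(np.var(props))

print(np.round(cond_var, 4))
```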

Van Der Flier, Henk; And Others – Journal of Educational Measurement, 1984
Two strategies for assessing item bias are discussed: methods comparing item difficulties unconditional on ability and methods comparing probabilities of response conditional on ability. Results suggest that the iterative logit method is an improvement on the noniterative one and is efficient in detecting biased and unbiased items. (Author/DWH)
Descriptors: Algorithms, Evaluation Methods, Item Analysis, Scores
Oshima, T. C.; Raju, Nambury S.; Nanda, Alice O. – Journal of Educational Measurement, 2006
A new item parameter replication method is proposed for assessing the statistical significance of the noncompensatory differential item functioning (NCDIF) index associated with the differential functioning of items and tests framework. In this new method, a cutoff score for each item is determined by obtaining a (1-alpha) percentile rank score…
Descriptors: Evaluation Methods, Statistical Distributions, Statistical Significance, Test Bias
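The NCDIF index comes from the differential functioning of items and tests (DFIT) framework: roughly, the expected squared difference, over the focal group's ability distribution, between the item's expected score computed with reference-group and with focal-group parameters. A sketch for a 2PL item with invented parameters; the replication-based cutoff proposed in the article is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(3)

def p_2pl(theta, a, b):
    """Two-parameter logistic item response function."""
    return 1 / (1 + np.exp(-a * (theta - b)))

# Invented item parameter estimates on a common scale.
a_ref, b_ref     = 1.2, 0.00
a_focal, b_focal = 1.2, 0.30

# NCDIF: average squared difference between the reference- and focal-group
# expected item scores, taken over the focal group's ability distribution.
theta_focal = rng.normal(-0.2, 1.0, size=100_000)
ncdif = np.mean((p_2pl(theta_focal, a_ref, b_ref) -
                 p_2pl(theta_focal, a_focal, b_focal)) ** 2)
print(f"NCDIF = {ncdif:.4f}")
```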
Allen, Nancy L.; Holland, Paul W.; Thayer, Dorothy T. – Journal of Educational Measurement, 2005
Allowing students to choose the question(s) that they will answer from among several possible alternatives is often viewed as a mechanism for increasing fairness in certain types of assessments. The fairness of optional topic choice is not a universally accepted fact, however, and various studies have been done to assess this question. We examine…
Descriptors: Test Theory, Test Items, Student Evaluation, Evaluation Methods

Schmidt, William H. – Journal of Educational Measurement, 1983
A conception of invalidity as bias is related to content validity for standardized achievement tests. A method of estimating content bias for each of three content domains (a priori, curricular, and instructional) based on the specification of a content taxonomy is also proposed. (Author/CM)
Descriptors: Achievement Tests, Content Analysis, Evaluation Methods, Instruction
Monahan, Patrick O.; Ankenmann, Robert D. – Journal of Educational Measurement, 2005
Empirical studies demonstrated Type-I error (TIE) inflation (especially for highly discriminating easy items) of the Mantel-Haenszel chi-square test for differential item functioning (DIF), when data conformed to item response theory (IRT) models more complex than Rasch, and when IRT proficiency distributions differed only in means. However, no…
Descriptors: Sample Size, Item Response Theory, Test Items, Test Bias
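The Mantel-Haenszel procedure under discussion is standard: examinees are stratified on total score, a 2x2 (group by correct/incorrect) table is formed in each stratum, and a continuity-corrected chi-square is accumulated across strata. A compact sketch on simulated data:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(4)

# Invented data: one studied item, a total score used as the stratifying
# variable, and a reference/focal group indicator.
n = 4000
total = rng.integers(0, 11, size=n)
focal = rng.random(n) < 0.5
p = 1 / (1 + np.exp(-(total - 5) / 2)) - 0.05 * focal   # small simulated DIF
correct = rng.random(n) < p

# Mantel-Haenszel chi-square accumulated over the 2x2 tables formed within
# each total-score stratum, with the usual 0.5 continuity correction.
sum_a = sum_ea = sum_var = 0.0
for s in np.unique(total):
    m = total == s
    a = np.sum(correct[m] & ~focal[m])   # reference group, correct
    b = np.sum(~correct[m] & ~focal[m])  # reference group, incorrect
    c = np.sum(correct[m] & focal[m])    # focal group, correct
    d = np.sum(~correct[m] & focal[m])   # focal group, incorrect
    nk = a + b + c + d
    if nk < 2:
        continue
    sum_a += a
    sum_ea += (a + b) * (a + c) / nk
    sum_var += (a + b) * (c + d) * (a + c) * (b + d) / (nk ** 2 * (nk - 1))

mh = (abs(sum_a - sum_ea) - 0.5) ** 2 / sum_var
print(f"MH chi-square = {mh:.2f}, p = {chi2.sf(mh, df=1):.4f}")
```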
Lei, Pui-Wa; Chen, Shu-Ying; Yu, Lan – Journal of Educational Measurement, 2006
Mantel-Haenszel and SIBTEST, which have known difficulty in detecting non-unidirectional differential item functioning (DIF), have been adapted with some success for computerized adaptive testing (CAT). This study adapts logistic regression (LR) and the item-response-theory-likelihood-ratio test (IRT-LRT), capable of detecting both unidirectional…
Descriptors: Evaluation Methods, Test Bias, Computer Assisted Testing, Multiple Regression Analysis
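Logistic regression DIF, as adapted in this study, compares a compact model (matching score only) with an augmented model that adds group and score-by-group terms, so uniform and nonuniform DIF are tested together with a likelihood-ratio statistic. The sketch below fits both models by direct maximization of the log-likelihood on simulated data; the CAT-specific adaptations described in the article are not reproduced.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

rng = np.random.default_rng(5)

# Simulated data for one studied item: matching score, group, response.
n = 3000
score = rng.normal(0, 1, size=n)
group = (rng.random(n) < 0.5).astype(float)
eta = 1.0 * score - 0.4 * group + 0.3 * score * group   # simulated DIF
y = (rng.random(n) < 1 / (1 + np.exp(-eta))).astype(float)

def neg_loglik(beta, X, y):
    """Negative log-likelihood of a binary logistic regression."""
    eta = X @ beta
    return np.sum(np.logaddexp(0.0, eta) - y * eta)

def fit(X, y):
    res = minimize(neg_loglik, np.zeros(X.shape[1]), args=(X, y),
                   method="BFGS")
    return -res.fun   # maximised log-likelihood

ones = np.ones(n)
X_compact   = np.column_stack([ones, score])                        # no DIF
X_augmented = np.column_stack([ones, score, group, score * group])  # DIF terms

# Likelihood-ratio test of the two DIF terms (uniform + nonuniform), df = 2.
lr = 2 * (fit(X_augmented, y) - fit(X_compact, y))
print(f"LR = {lr:.2f}, p = {chi2.sf(lr, df=2):.4f}")
```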