Runnels, Judith – Language Testing in Asia, 2013
Differential item functioning (DIF) is when a test item favors or hinders a characteristic exhibited by group members of a test-taking population. DIF analyses are statistical procedures used to determine to what extent the content of an item affects the item endorsement of sub-groups of test-takers. If DIF is found for many items on the test, the…
Descriptors: Test Items, Test Bias, Item Response Theory, College Freshmen
Iannone, P.; Simpson, A. – Studies in Higher Education, 2015
Existing research into students' preferences for assessment methods has been developed from a restricted sample: in particular, the voice of students in the 'hard-pure sciences' has rarely been heard. We conducted a mixed method study to explore mathematics students' preferences of assessment methods. In contrast to the message from the general…
Descriptors: Mixed Methods Research, Undergraduate Students, Student Attitudes, Preferences
Taylor, Cora M.; Vehorn, Alison; Noble, Hylan; Weitlauf, Amy S.; Warren, Zachary E. – Journal of Autism and Developmental Disorders, 2014
The goal of the current study was to develop and pilot the utility of two simple internal response bias metrics, over-reporting and under-reporting, in terms of additive clinical value within common screening practices for early detection of autism spectrum disorder risk. Participants were caregivers and children under 36 months of age (n = 145)…
Descriptors: Pilot Projects, Autism, Pervasive Developmental Disorders, Caregivers
Federer, Meghan Rector; Nehm, Ross H.; Pearl, Dennis K. – CBE - Life Sciences Education, 2016
Understanding sources of performance bias in science assessment provides important insights into whether science curricula and/or assessments are valid representations of student abilities. Research investigating assessment bias due to factors such as instrument structure, participant characteristics, and item types is well documented across a…
Descriptors: Gender Differences, Biology, Science Instruction, Case Studies
Beretvas, S. Natasha; Walker, Cindy M. – Educational and Psychological Measurement, 2012
This study extends the multilevel measurement model to handle testlet-based dependencies. A flexible two-level testlet response model (the MMMT-2 model) for dichotomous items is introduced that permits assessment of differential testlet functioning (DTLF). A distinction is made between this study's conceptualization of DTLF and that of…
Descriptors: Test Bias, Simulation, Test Items, Item Response Theory
Paek, Insu – Journal of Educational Measurement, 2012
Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…
Descriptors: Test Bias, Tests, Maximum Likelihood Statistics, Statistical Analysis
Barendse, M. T.; Oort, F. J.; Werner, C. S.; Ligtvoet, R.; Schermelleh-Engel, K. – Structural Equation Modeling: A Multidisciplinary Journal, 2012
Measurement bias is defined as a violation of measurement invariance, which can be investigated through multigroup factor analysis (MGFA), by testing across-group differences in intercepts (uniform bias) and factor loadings (nonuniform bias). Restricted factor analysis (RFA) can also be used to detect measurement bias. To also enable nonuniform…
Descriptors: Factor Analysis, Item Response Theory, Test Bias, Measurement Techniques
Finch, Holmes – Applied Measurement in Education, 2011
Methods of uniform differential item functioning (DIF) detection have been extensively studied in the complete data case. However, less work has been done examining the performance of these methods when missing item responses are present. Research that has been done in this regard appears to indicate that treating missing item responses as…
Descriptors: Test Bias, Data Analysis, Error of Measurement
Dai, Yunyun – Applied Psychological Measurement, 2013
Mixtures of item response theory (IRT) models have been proposed as a technique to explore response patterns in test data related to cognitive strategies, instructional sensitivity, and differential item functioning (DIF). Estimation proves challenging due to difficulties in identification and questions of effect size needed to recover underlying…
Descriptors: Item Response Theory, Test Bias, Computation, Bayesian Statistics
Karami, Hossein – Educational Research and Evaluation, 2013
The search for fairness in language testing is distinct from other areas of educational measurement as the object of measurement, that is, language, is part of the identity of the test takers. So, a host of issues enter the scene when one starts to reflect on how to assess people's language abilities. As the quest for fairness in language testing…
Descriptors: Language Skills, Language Tests, Testing, Culture Fair Tests
Lee, HwaYoung; Beretvas, S. Natasha – Educational and Psychological Measurement, 2014
Conventional differential item functioning (DIF) detection methods (e.g., the Mantel-Haenszel test) can be used to detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is not typically fully explained by an observed variable. True sources of DIF may include unobserved, latent variables, such as…
Descriptors: Item Analysis, Factor Structure, Bayesian Statistics, Goodness of Fit
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Quesen, Sarah – ProQuest LLC, 2016
When studying differential item functioning (DIF) with students with disabilities (SWD), focal groups typically suffer from small sample size, whereas the reference group population is usually large. This makes it possible for a researcher to select a sample from the reference population to be similar to the focal group on the ability scale. Doing…
Descriptors: Test Items, Academic Accommodations (Disabilities), Testing Accommodations, Disabilities
Reardon, Sean F.; Kalogrides, Demetra; Ho, Andrew D. – Stanford Center for Education Policy Analysis, 2017
There is no comprehensive database of U.S. district-level test scores that is comparable across states. We describe and evaluate a method for constructing such a database. First, we estimate linear, reliability-adjusted linking transformations from state test score scales to the scale of the National Assessment of Educational Progress (NAEP). We…
Descriptors: School Districts, Scores, Statistical Distributions, Database Design
Li, Yanju; Brooks, Gordon P.; Johanson, George A. – Educational and Psychological Measurement, 2012
In 2009, DeMars stated that when impact exists there will be Type I error inflation, especially with larger sample sizes and larger discrimination parameters for items. One purpose of this study is to present the patterns of Type I error rates using Mantel-Haenszel (MH) and logistic regression (LR) procedures when the mean ability between the…
Descriptors: Error of Measurement, Test Bias, Test Items, Regression (Statistics)

