Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 5 |
Since 2006 (last 20 years) | 13 |
Source
Applied Measurement in Education | 14
Author
Ercikan, Kadriye | 2 |
Oliveri, Maria Elena | 2 |
Baldwin, Su | 1 |
Banks, Kathleen | 1 |
Bridgeman, Brent | 1 |
Cheong, Yuk Fai | 1 |
Dadey, Nathan | 1 |
DePascale, Charles | 1 |
DeMars, Christine E. | 1
Finch, W. Holmes | 1 |
Gattamorta, Karina A. | 1 |
Publication Type
Journal Articles | 14 |
Reports - Research | 13 |
Information Analyses | 1 |
Reports - Evaluative | 1 |
Speeches/Meeting Papers | 1 |
Education Level
Elementary Secondary Education | 2 |
Grade 5 | 2 |
Grade 8 | 2 |
Higher Education | 2 |
Postsecondary Education | 1 |
Secondary Education | 1 |
Assessments and Surveys
Program for International Student Assessment | 3
Trends in International Mathematics and Science Study | 2
Graduate Record Examinations | 1 |
SAT (College Admission Test) | 1 |
Sachse, Karoline A.; Haag, Nicole – Applied Measurement in Education, 2017
Standard errors computed according to the operational practices of international large-scale assessment studies such as the Programme for International Student Assessment (PISA) or the Trends in International Mathematics and Science Study (TIMSS) may be biased when cross-national differential item functioning (DIF) and item parameter drift are…
Descriptors: Error of Measurement, Test Bias, International Assessment, Computation
Finch, W. Holmes – Applied Measurement in Education, 2016
Differential item functioning (DIF) assessment is a crucial component in test construction, serving as the primary way in which instrument developers ensure that measures perform in the same way for multiple groups within the population. When such is not the case, scores may not accurately reflect the trait of interest for all individuals in the…
Descriptors: Test Bias, Monte Carlo Methods, Comparative Analysis, Population Groups
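To make the Monte Carlo setup concrete, here is a minimal sketch in Python, not the article's actual design: responses follow a Rasch model, one studied item is made harder for the focal group, and uniform DIF is tested with a logistic-regression likelihood-ratio test (one common detection method; sample sizes, the DIF magnitude, and the choice of method are illustrative assumptions).

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(0)

def simulate(n_per_group=500, n_items=20, dif=0.5):
    """Rasch responses; item 0 is harder by `dif` for the focal group."""
    b = np.linspace(-1.5, 1.5, n_items)
    theta = rng.normal(size=2 * n_per_group)   # equal ability distributions
    group = np.repeat([0, 1], n_per_group)     # 0 = reference, 1 = focal
    b_mat = np.tile(b, (2 * n_per_group, 1))
    b_mat[group == 1, 0] += dif                # uniform DIF on item 0
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b_mat)))
    return (rng.random(p.shape) < p).astype(int), group

def lr_dif_pvalue(item, match, group):
    """Likelihood-ratio test of uniform DIF via logistic regression."""
    m0 = sm.Logit(item, sm.add_constant(match)).fit(disp=0)
    m1 = sm.Logit(item, sm.add_constant(np.column_stack([match, group]))).fit(disp=0)
    return chi2.sf(2 * (m1.llf - m0.llf), df=1)

hits, reps = 0, 200
for _ in range(reps):
    resp, group = simulate()
    rest = resp[:, 1:].sum(axis=1)   # matching score excludes the studied item
    hits += lr_dif_pvalue(resp[:, 0], rest, group) < 0.05
print(f"empirical detection rate at alpha = .05: {hits / reps:.2f}")
```

Repeating the loop with dif=0.0 gives the corresponding Type I error rate, the other quantity such simulation studies typically report.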
Oliveri, Maria Elena; Ercikan, Kadriye; Lyons-Thomas, Juliette; Holtzman, Steven – Applied Measurement in Education, 2016
Differential item functioning (DIF) analyses have been used as the primary method in large-scale assessments to examine fairness for subgroups. Currently, DIF analyses are conducted with manifest methods, which group examinees by observed characteristics (gender and race/ethnicity). Homogeneity of item responses is assumed, denoting that…
Descriptors: Test Bias, Language Minorities, Effect Size, Foreign Countries
Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018
Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…
Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education
Suh, Youngsuk; Talley, Anna E. – Applied Measurement in Education, 2015
This study compared and illustrated four differential distractor functioning (DDF) detection methods for analyzing multiple-choice items. The log-linear approach, two item response theory (IRT) model-based approaches with likelihood ratio tests, and the odds ratio approach were compared to examine the congruence among the four DDF detection methods.…
Descriptors: Test Bias, Multiple Choice Tests, Test Items, Methods
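As an illustration of the odds-ratio approach named above, the sketch below pools the odds of choosing a target distractor, among examinees who answered the item incorrectly, across matching-score strata with a Mantel-Haenszel-type estimator. The function name, data layout, and stratification are assumptions; the article's estimator may differ in detail.

```python
import numpy as np

def ddf_odds_ratio(choice, group, strata, target):
    """Mantel-Haenszel-type common odds ratio for selecting `target`
    among incorrect responders, pooled over matching-score strata.
    group: 1 = focal, 0 = reference."""
    num = den = 0.0
    for s in np.unique(strata):
        m = strata == s
        a = np.sum((group[m] == 1) & (choice[m] == target))  # focal chose target
        b = np.sum((group[m] == 1) & (choice[m] != target))  # focal chose other
        c = np.sum((group[m] == 0) & (choice[m] == target))  # reference, target
        d = np.sum((group[m] == 0) & (choice[m] != target))  # reference, other
        n = a + b + c + d
        if n > 0:
            num += a * d / n
            den += b * c / n
    return num / den if den > 0 else float("nan")

# Hypothetical usage: distractor labels for examinees who missed the item.
choice = np.array(["B", "C", "B", "D", "B", "C", "D", "B"])
group  = np.array([1, 1, 1, 1, 0, 0, 0, 0])
strata = np.array([0, 0, 1, 1, 0, 0, 1, 1])
print(ddf_odds_ratio(choice, group, strata, target="B"))  # 1.0 = no DDF
```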
Oliveri, Maria Elena; Lawless, Rene; Robin, Frederic; Bridgeman, Brent – Applied Measurement in Education, 2018
We analyzed a pool of items from an admissions test for differential item functioning (DIF) across groups based on age, socioeconomic status, citizenship, or English language status, using Mantel-Haenszel and item response theory approaches. DIF items were systematically examined to identify their possible sources by item type, content, and wording. DIF was…
Descriptors: Test Bias, Comparative Analysis, Item Banks, Item Response Theory
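For context on how Mantel-Haenszel output is typically screened in operational item pools like the one described, here is a simplified sketch of the ETS delta-scale transform of the common odds ratio and the familiar A/B/C severity categories. The thresholds below omit the statistical-significance conditions of the full ETS rules, and the article's own screening procedure may differ.

```python
import math

def ets_dif_category(alpha_mh):
    """Classify an item from its Mantel-Haenszel common odds ratio."""
    delta = -2.35 * math.log(alpha_mh)   # ETS delta-scale transform
    if abs(delta) < 1.0:
        return "A"   # negligible DIF
    if abs(delta) < 1.5:
        return "B"   # moderate DIF
    return "C"       # large DIF: flag the item for content review

for alpha in (0.9, 1.6, 2.2):
    print(f"alpha_MH = {alpha}: category {ets_dif_category(alpha)}")
```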
Cheong, Yuk Fai; Kamata, Akihito – Applied Measurement in Education, 2013
In this article, we discuss and illustrate two centering and anchoring options available in differential item functioning (DIF) detection studies based on the hierarchical generalized linear and generalized linear mixed modeling frameworks. We compared and contrasted the assumptions of the two options, and examined the properties of their DIF…
Descriptors: Test Bias, Hierarchical Linear Modeling, Comparative Analysis, Test Items
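A schematic level-1 model helps locate where centering and anchoring enter. The parameterization below is one common hierarchical generalized linear model form for DIF detection, not necessarily the article's exact specification.

```latex
% One common level-1 parameterization for HGLM-based DIF detection
% (schematic; the article's exact specification may differ).
% theta_j: ability of person j; b_i: difficulty of item i;
% G_j: focal-group indicator; gamma_i: DIF effect for item i.
\[
  \log\frac{\Pr(y_{ij}=1)}{1-\Pr(y_{ij}=1)} \;=\; \theta_j - b_i + \gamma_i G_j .
\]
% The model is not identified without a constraint: either center the item
% parameters (e.g., \sum_i b_i = 0) or anchor presumed DIF-free items by
% fixing their gamma_i = 0 -- the kinds of options the article compares.
```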
DeMars, Christine E. – Applied Measurement in Education, 2011
Three types of effect sizes for DIF are described in this exposition: log of the odds-ratio (differences in log-odds), differences in probability-correct, and proportion of variance accounted for. Using these indices involves conceptualizing the degree of DIF in different ways. This integrative review discusses how these measures are impacted in…
Descriptors: Effect Size, Test Bias, Probability, Difficulty Level
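A short worked example of the first two indices, using hypothetical proportions correct in a single matched stratum (0.70 for the reference group, 0.60 for the focal group); the variance-accounted-for index requires a full model fit and is omitted here.

```python
import math

# Hypothetical matched-stratum proportions correct.
p_ref, p_focal = 0.70, 0.60

# Difference in log-odds (log of the odds ratio).
log_odds_diff = math.log(p_ref / (1 - p_ref)) - math.log(p_focal / (1 - p_focal))

# Difference in probability-correct.
prob_diff = p_ref - p_focal

print(f"difference in log-odds:  {log_odds_diff:.3f}")  # ~0.442
print(f"difference in p-correct: {prob_diff:.2f}")      # 0.10
```

The same 0.10 gap in probability-correct maps to different log-odds values depending on where the proportions sit, which is one reason the review treats the indices as conceptualizing DIF differently.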
Oliveri, Maria E.; Ercikan, Kadriye – Applied Measurement in Education, 2011
In this study, we examine the degree of construct comparability and possible sources of incomparability of the English and French versions of the Programme for International Student Assessment (PISA) 2003 problem-solving measure administered in Canada. Several approaches were used to examine construct comparability at the test- (examination of…
Descriptors: Foreign Countries, English, French, Tests
Gattamorta, Karina A.; Penfield, Randall D. – Applied Measurement in Education, 2012
The study of measurement invariance in polytomous items that targets individual score levels is known as differential step functioning (DSF). The analysis of DSF requires the creation of a set of dichotomizations of the item response variable. There are two primary approaches for creating the set of dichotomizations to conduct a DSF analysis: the…
Descriptors: Measurement, Item Response Theory, Test Bias, Test Items
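The two dichotomization schemes the abstract points to can be sketched directly. The item scoring (0-3) and function names below are illustrative assumptions: for step k, the cumulative scheme contrasts scores below k with scores at or above k using all examinees, while the adjacent-categories scheme retains only examinees scoring k-1 or k.

```python
import numpy as np

y = np.array([0, 1, 1, 2, 3, 3, 2, 0, 1, 2])   # hypothetical polytomous scores

def cumulative_step(y, k):
    """1 if the examinee reached step k, else 0; all examinees retained."""
    return (y >= k).astype(int)

def adjacent_step(y, k):
    """0/1 contrast of categories k-1 vs. k; other examinees dropped."""
    keep = (y == k - 1) | (y == k)
    return (y[keep] == k).astype(int), keep

print(cumulative_step(y, 2))            # step-2 dichotomization, all cases
coded, keep = adjacent_step(y, 2)
print(coded, keep.sum(), "cases retained")
```

Each dichotomized variable can then be fed to a standard dichotomous DIF procedure, one per step, which is what makes step-level (DSF) analysis possible.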
Wells, Craig S.; Baldwin, Su; Hambleton, Ronald K.; Sireci, Stephen G.; Karatonis, Ana; Jirka, Stephen – Applied Measurement in Education, 2009
Score equity assessment is an important analysis to ensure inferences drawn from test scores are comparable across subgroups of examinees. The purpose of the present evaluation was to assess the extent to which the Grade 8 NAEP Math and Reading assessments for 2005 were equivalent across selected states. More specifically, the present study…
Descriptors: National Competency Tests, Test Bias, Equated Scores, Grade 8
McCarty, F. A.; Oshima, T. C.; Raju, Nambury S. – Applied Measurement in Education, 2007
Oshima, Raju, Flowers, and Slinde (1998) described procedures for identifying sources of differential functioning for dichotomous data using differential bundle functioning (DBF) derived from the differential functioning of items and tests (DFIT) framework (Raju, van der Linden, & Fleer, 1995). The purpose of this study was to extend the…
Descriptors: Rating Scales, Test Bias, Scoring, Test Items
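For orientation, the DFIT quantities behind differential bundle functioning can be written schematically as below; the notation is the simplified dichotomous form, and the article's rating-scale extension replaces the item response functions with expected item scores.

```latex
% Schematic DFIT indices (dichotomous notation; the article's polytomous
% extension substitutes expected item scores for P_i(theta)). Expectations
% are taken over the focal-group ability distribution.
\[
  \mathrm{NCDIF}_i = E_F\!\left[\bigl(P_{iF}(\theta)-P_{iR}(\theta)\bigr)^{2}\right],
  \qquad
  \mathrm{DBF} = E_F\!\left[\Bigl(\textstyle\sum_{i\in\mathcal{B}} P_{iF}(\theta)
      - \sum_{i\in\mathcal{B}} P_{iR}(\theta)\Bigr)^{2}\right],
\]
% where \mathcal{B} is the item bundle under study.
```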
Banks, Kathleen – Applied Measurement in Education, 2006
The purpose of this article is to present a working definition of the term "culture," as well as to describe and demonstrate a comprehensive framework for evaluating hypotheses about cultural bias in educational testing. The framework is demonstrated using 5th-grade reading and language arts data from the Terra Nova test (CTB/McGraw-Hill, 1999).…
Descriptors: Test Bias, Educational Testing, Test Items, Hispanic Americans
Marco, Gary L. – Applied Measurement in Education, 1988
Four simulated mathematical and verbal test forms were produced by test assembly procedures proposed in legislative bills in California and New York in 1986 to minimize differences between majority and minority scores. Item response theory analyses of data for about 22,000 Black and 28,000 White high school students were conducted. (SLD)
Descriptors: Black Students, College Entrance Examinations, Comparative Analysis, Culture Fair Tests