Showing 1 to 15 of 20 results
Peer reviewed
Huggins-Manley, Anne Corinne; Qiu, Yuxi; Penfield, Randall D. – International Journal of Testing, 2018
Score equity assessment (SEA) refers to an examination of population invariance of equating across two or more subpopulations of test examinees. Previous SEA studies have shown that score equity may be present for examinees scoring at particular test score ranges but absent for examinees scoring at other score ranges. No studies to date have…
Descriptors: Equated Scores, Test Bias, Test Items, Difficulty Level
Peer reviewed
Huggins, Anne C.; Penfield, Randall D. – Educational Measurement: Issues and Practice, 2012
A goal for any linking or equating of two or more tests is that the linking function be invariant to the population used in conducting the linking or equating. Violations of population invariance in linking and equating jeopardize the fairness and validity of test scores, and pose particular problems for test-based accountability programs that…
Descriptors: Equated Scores, Tests, Test Bias, Validity
Peer reviewed
Penfield, Randall D. – Educational and Psychological Measurement, 2011
This article explores how the magnitude and form of differential item functioning (DIF) effects in multiple-choice items are determined by the underlying differential distractor functioning (DDF) effects, as modeled under the nominal response model. The results of a numerical investigation indicated that (a) the presence of one or more nonzero DDF…
Descriptors: Test Bias, Multiple Choice Tests, Test Items, Models
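As a rough illustration of the relationship this abstract describes, the sketch below models a multiple-choice item under a nominal response model in which one distractor carries a group-specific intercept shift (a DDF effect). All parameter values are hypothetical and this is not necessarily the article's exact parameterization; the point is that a distractor-level shift alone induces a between-group difference on the keyed option (i.e., DIF).

```python
import numpy as np

def nrm_probs(theta, a, c):
    # Nominal response model: P(option j | theta) proportional to exp(a_j * theta + c_j)
    z = np.outer(theta, a) + c
    ez = np.exp(z)
    return ez / ez.sum(axis=1, keepdims=True)

theta = np.linspace(-3, 3, 7)
a = np.array([1.2, 0.4, 0.2, 0.1])       # option slopes (option 0 is the key); hypothetical
c_ref = np.array([0.5, 0.2, 0.0, -0.3])  # reference-group intercepts; hypothetical
ddf = np.array([0.0, 0.6, 0.0, 0.0])     # hypothetical DDF effect on distractor 1 only
c_foc = c_ref + ddf                      # focal-group intercepts

p_ref = nrm_probs(theta, a, c_ref)
p_foc = nrm_probs(theta, a, c_foc)

# DIF on the keyed option induced solely by the distractor-level DDF effect
for t, d in zip(theta, p_ref[:, 0] - p_foc[:, 0]):
    print(f"theta={t:+.1f}  P_ref(key) - P_foc(key) = {d:+.3f}")
```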
Peer reviewed
Jin, Ying; Myers, Nicholas D.; Ahn, Soyeon; Penfield, Randall D. – Educational and Psychological Measurement, 2013
The Rasch model, a member of a larger group of models within item response theory, is widely used in empirical studies. Detection of uniform differential item functioning (DIF) within the Rasch model typically employs null hypothesis testing with a concomitant consideration of effect size (e.g., signed area [SA]). Parametric equivalence between…
Descriptors: Test Bias, Effect Size, Item Response Theory, Comparative Analysis
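For context on the signed-area (SA) effect size mentioned in this abstract: under the Rasch model the signed area between the two groups' item characteristic curves reduces to the difference in the groups' item difficulties. A minimal numerical check with hypothetical difficulties (not the authors' data or derivation):

```python
import numpy as np

def rasch_icc(theta, b):
    # Rasch item characteristic curve
    return 1.0 / (1.0 + np.exp(-(theta - b)))

theta = np.linspace(-12, 12, 24001)      # wide grid so the tails contribute ~0
b_ref, b_foc = 0.0, 0.6                  # hypothetical item difficulties for the two groups
diff = rasch_icc(theta, b_ref) - rasch_icc(theta, b_foc)
signed_area = diff.sum() * (theta[1] - theta[0])
print(signed_area)                       # approximately b_foc - b_ref = 0.6
```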
Peer reviewed
Penfield, Randall D. – Applied Psychological Measurement, 2010
Crossing, or intersecting, differential item functioning (DIF) is a form of nonuniform DIF that exists when the sign of the between-group difference in expected item performance changes across the latent trait continuum. The presence of crossing DIF presents a problem for many statistics developed for evaluating DIF because positive and negative…
Descriptors: Test Bias, Test Items, Statistics, Test Theory
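To make the crossing pattern concrete, here is a minimal sketch with hypothetical 2PL parameters (not drawn from the article): when the groups' item response functions differ mainly in discrimination, the between-group difference in expected performance is negative on one side of the crossing point and positive on the other, so positive and negative contributions can cancel in aggregate DIF statistics.

```python
import numpy as np

def p_2pl(theta, a, b):
    # Two-parameter logistic item response function
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)
# Hypothetical parameters: same difficulty, different discrimination,
# so the two item response functions cross at theta = 0.
p_ref = p_2pl(theta, a=1.5, b=0.0)
p_foc = p_2pl(theta, a=0.8, b=0.0)

# The sign of the between-group difference flips at the crossing point.
for t, d in zip(theta, p_ref - p_foc):
    print(f"theta={t:+.1f}  diff={d:+.3f}")
```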
Peer reviewed
Penfield, Randall D. – Applied Psychological Measurement, 2010
In 2008, Penfield showed that measurement invariance across all response options of a multiple-choice item (correct option and the "J" distractors) can be modeled using a nominal response model that included a differential distractor functioning (DDF) effect for each of the "J" distractors. This article extends this concept to consider how the…
Descriptors: Test Bias, Test Items, Models, Multiple Choice Tests
Peer reviewed
Penfield, Randall D. – Journal of Educational Measurement, 2010
In this article, I address two competing conceptions of differential item functioning (DIF) in polytomously scored items. The first conception, referred to as net DIF, concerns between-group differences in the conditional expected value of the polytomous response variable. The second conception, referred to as global DIF, concerns the conditional…
Descriptors: Test Bias, Test Items, Evaluation Methods, Item Response Theory
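A rough numerical illustration of the net-versus-global distinction, assuming a generalized partial credit model with hypothetical parameters (not the article's formulation): if the groups' step difficulties shift in opposite directions, the step-level (global) effects are sizable while the difference in conditional expected score (net DIF) is much smaller, because the effects partially cancel.

```python
import numpy as np

def gpcm_probs(theta, a, steps):
    # Generalized partial credit model: category probabilities for scores 0..K
    K = len(steps)
    z = np.zeros((len(theta), K + 1))
    for k in range(1, K + 1):
        z[:, k] = z[:, k - 1] + a * (theta - steps[k - 1])
    ez = np.exp(z)
    return ez / ez.sum(axis=1, keepdims=True)

theta = np.linspace(-2, 2, 5)
steps_ref = np.array([-1.0, 1.0])
steps_foc = np.array([-0.5, 0.5])   # opposing step-level shifts of +/-0.5 logits (hypothetical)

scores = np.arange(3)
e_ref = gpcm_probs(theta, 1.0, steps_ref) @ scores
e_foc = gpcm_probs(theta, 1.0, steps_foc) @ scores

# Net DIF: difference in conditional expected score; small here because the
# opposing step-level (global) effects partially cancel.
for t, d in zip(theta, e_ref - e_foc):
    print(f"theta={t:+.1f}  net DIF={d:+.3f}")
```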
Peer reviewed
Gattamorta, Karina A.; Penfield, Randall D. – Applied Measurement in Education, 2012
The study of measurement invariance in polytomous items that targets individual score levels is known as differential step functioning (DSF). The analysis of DSF requires the creation of a set of dichotomizations of the item response variable. There are two primary approaches for creating the set of dichotomizations to conduct a DSF analysis: the…
Descriptors: Measurement, Item Response Theory, Test Bias, Test Items
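For readers unfamiliar with the two dichotomization approaches, a minimal sketch follows; the labels "cumulative" and "adjacent categories" reflect common DSF usage and the response vector is hypothetical. Each step of the polytomous response becomes a binary variable that can then be analyzed with dichotomous DIF methods.

```python
import numpy as np

# Hypothetical polytomous responses scored 0..3
y = np.array([0, 1, 3, 2, 2, 1, 0, 3])

# Cumulative dichotomizations: step k compares Y >= k with Y < k
cumulative = {k: (y >= k).astype(int) for k in range(1, 4)}

# Adjacent-categories dichotomizations: step k compares Y == k with Y == k - 1,
# dropping responses that fall outside those two categories (coded NaN here)
adjacent = {
    k: np.where(y == k, 1.0, np.where(y == k - 1, 0.0, np.nan))
    for k in range(1, 4)
}

print(cumulative)
print(adjacent)
```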
Peer reviewed
Gattamorta, Karina A.; Penfield, Randall D.; Myers, Nicholas D. – International Journal of Testing, 2012
Measurement invariance is a common consideration in the evaluation of the validity and fairness of test scores when the tested population contains distinct groups of examinees, such as examinees receiving different forms of a translated test. Measurement invariance in polytomous items has traditionally been evaluated at the item-level,…
Descriptors: Foreign Countries, Psychometrics, Test Bias, Test Items
Peer reviewed
Penfield, Randall D.; Gattamorta, Karina; Childs, Ruth A. – Educational Measurement: Issues and Practice, 2009
Traditional methods for examining differential item functioning (DIF) in polytomously scored test items yield a single item-level index of DIF and thus provide no information concerning which score levels are implicated in the DIF effect. To address this limitation of DIF methodology, the framework of differential step functioning (DSF) has…
Descriptors: Test Bias, Test Items, Evaluation Methods, Scores
Peer reviewed
Guler, Nese; Penfield, Randall D. – Journal of Educational Measurement, 2009
In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF…
Descriptors: Test Bias, Simulation, Test Items, Measurement
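The logistic regression component of such an analysis can be sketched as nested-model likelihood-ratio tests: a group main effect flags uniform DIF, a score-by-group interaction flags nonuniform DIF, and a joint 2-df test examines both simultaneously. This is a generic illustration with simulated data, not the authors' combined decision rule or simulation design.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(0)

# Hypothetical data: binary item response, matching score, group indicator
n = 2000
group = rng.integers(0, 2, n)
score = rng.normal(0, 1, n)
logit = 1.2 * score - 0.4 * group                  # uniform DIF built in for illustration
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

def fit(X):
    return sm.Logit(y, sm.add_constant(X)).fit(disp=0)

m1 = fit(np.column_stack([score]))                        # matching variable only
m2 = fit(np.column_stack([score, group]))                 # + group (uniform DIF)
m3 = fit(np.column_stack([score, group, score * group]))  # + interaction (nonuniform DIF)

lr_uniform    = 2 * (m2.llf - m1.llf)
lr_nonuniform = 2 * (m3.llf - m2.llf)
lr_joint      = 2 * (m3.llf - m1.llf)   # simultaneous test of uniform and nonuniform DIF
print("uniform DIF    p =", chi2.sf(lr_uniform, df=1))
print("nonuniform DIF p =", chi2.sf(lr_nonuniform, df=1))
print("joint test     p =", chi2.sf(lr_joint, df=2))
```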
Peer reviewed
Penfield, Randall D.; Lee, Okhee – Journal of Research in Science Teaching, 2010
Recent test-based accountability policy in the U.S. has involved annually assessing all students in core subjects and holding schools accountable for adequate progress of all students by implementing sanctions when adequate progress is not met. Despite its potential benefits, basing educational policy on assessments developed for a student…
Descriptors: Science Tests, Student Diversity, Accountability, Minority Groups
Peer reviewed
Penfield, Randall D.; Alvarez, Karina; Lee, Okhee – Applied Measurement in Education, 2009
The assessment of differential item functioning (DIF) in polytomous items addresses between-group differences in measurement properties at the item level, but typically does not inform which score levels may be involved in the DIF effect. The framework of differential step functioning (DSF) addresses this issue by examining between-group…
Descriptors: Test Bias, Classification, Test Items, Criteria
Peer reviewed
Penfield, Randall D. – Journal of Educational Measurement, 2007
Many statistics used in the assessment of differential item functioning (DIF) in polytomous items yield a single item-level index of measurement invariance that collapses information across all response options of the polytomous item. Utilizing a single item-level index of DIF can, however, be misleading if the magnitude or direction of the DIF…
Descriptors: Simulation, Test Bias, Statistics, Test Items
Peer reviewed
Penfield, Randall D. – Applied Measurement in Education, 2007
A widely used approach for categorizing the level of differential item functioning (DIF) in dichotomous items is the scheme proposed by Educational Testing Service (ETS) based on a transformation of the Mantel-Haenszel common odds ratio. In this article, two classification schemes for DIF in polytomous items (referred to as the P1 and P2 schemes)…
Descriptors: Simulation, Educational Testing, Test Bias, Evaluation Methods
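For context, the ETS scheme referenced above rests on the MH D-DIF statistic, a rescaling of the Mantel-Haenszel common odds ratio onto the delta metric. The sketch below applies only the magnitude thresholds of that scheme and omits the accompanying statistical significance requirements of the full ETS rules.

```python
import math

def ets_category(alpha_mh):
    """Classify dichotomous DIF on the ETS delta scale (magnitude thresholds only;
    the full ETS rules also require significance tests)."""
    delta = -2.35 * math.log(alpha_mh)   # MH D-DIF: transform of the MH common odds ratio
    if abs(delta) < 1.0:
        return delta, "A (negligible)"
    if abs(delta) < 1.5:
        return delta, "B (moderate)"
    return delta, "C (large)"

for alpha in (0.9, 1.8, 2.5):
    print(alpha, ets_category(alpha))
```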