William C. M. Belzak; Daniel J. Bauer – Journal of Educational and Behavioral Statistics, 2024
Testing for differential item functioning (DIF) has recently undergone rapid statistical development. Moderated nonlinear factor analysis (MNLFA) allows for simultaneous testing of DIF among multiple categorical and continuous covariates (e.g., sex, age, ethnicity), and regularization has shown promising results for identifying DIF among…
Descriptors: Test Bias, Algorithms, Factor Analysis, Error of Measurement
Andrew D. Ho – Journal of Educational and Behavioral Statistics, 2024
I review opportunities and threats that widely accessible Artificial Intelligence (AI)-powered services present for educational statistics and measurement. Algorithmic and computational advances continue to improve approaches to item generation, scale maintenance, test security, test scoring, and score reporting. Predictable misuses of AI for…
Descriptors: Artificial Intelligence, Measurement, Educational Assessment, Technology Uses in Education
Wang, Weimeng; Liu, Yang; Liu, Hongyun – Journal of Educational and Behavioral Statistics, 2022
Differential item functioning (DIF) occurs when the probability of endorsing an item differs across groups for individuals with the same latent trait level. The presence of DIF items may jeopardize the validity of an instrument; therefore, it is crucial to identify DIF items in routine operations of educational assessment. While DIF detection…
Descriptors: Test Bias, Test Items, Equated Scores, Regression (Statistics)
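The definition of DIF in the abstract above lends itself to a small worked illustration: under a two-parameter logistic (2PL) model, uniform DIF can be represented as a group-specific shift in item difficulty, so two examinees with the same latent trait level have different endorsement probabilities. The parameter values below are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical 2PL item response function: probability that an examinee
# with latent trait theta endorses the item (discrimination a, difficulty b).
def p_endorse(theta, a=1.2, b=0.0):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

theta = 0.5                       # same latent trait level in both groups
p_ref = p_endorse(theta, b=0.0)   # reference-group difficulty
p_foc = p_endorse(theta, b=0.4)   # focal group: difficulty shifted (uniform DIF)

# Same theta, different endorsement probabilities -> the item shows DIF.
print(round(p_ref, 3), round(p_foc, 3))  # → 0.646 0.53
```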
Gorney, Kylie; Wollack, James A.; Sinharay, Sandip; Eckerly, Carol – Journal of Educational and Behavioral Statistics, 2023
Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item…
Descriptors: Scores, Test Validity, Test Items, Prior Learning
Quinn, David M.; Ho, Andrew D. – Journal of Educational and Behavioral Statistics, 2021
The estimation of test score "gaps" and gap trends plays an important role in monitoring educational inequality. Researchers decompose gaps and gap changes into within- and between-school portions to generate evidence on the role schools play in shaping these inequalities. However, existing decomposition methods assume an equal-interval…
Descriptors: Scores, Tests, Achievement Gap, Equal Education
Gu, Zhengguo; Emons, Wilco H. M.; Sijtsma, Klaas – Journal of Educational and Behavioral Statistics, 2021
Clinical, medical, and health psychologists use difference scores obtained from pretest–posttest designs employing the same test to assess intraindividual change possibly caused by an intervention addressing, for example, anxiety, depression, eating disorders, or addiction. Reliability of difference scores is important for interpreting observed…
Descriptors: Test Reliability, Scores, Pretests Posttests, Computation
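The reliability concern raised above can be made concrete with the standard classical-test-theory formula for the reliability of a difference score D = Y − X. The numbers below are hypothetical; they show how two individually reliable tests can still yield an unreliable difference score when pretest and posttest are highly correlated.

```python
# Classical-test-theory reliability of a difference score D = Y - X:
#   r_D = (sx^2*rxx + sy^2*ryy - 2*rxy*sx*sy) / (sx^2 + sy^2 - 2*rxy*sx*sy)
def difference_score_reliability(sx, sy, rxx, ryy, rxy):
    num = sx**2 * rxx + sy**2 * ryy - 2 * rxy * sx * sy
    den = sx**2 + sy**2 - 2 * rxy * sx * sy
    return num / den

# Hypothetical values: equal SDs of 10, both reliabilities .80,
# pretest-posttest correlation .60.
r_d = difference_score_reliability(10, 10, 0.80, 0.80, 0.60)
print(round(r_d, 2))  # → 0.5
```

Even with two tests of reliability .80, the difference score here has reliability only .50, which is why difference-score reliability needs to be checked rather than assumed.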
Ramsay, James; Wiberg, Marie; Li, Juan – Journal of Educational and Behavioral Statistics, 2020
Ramsay and Wiberg used a new version of item response theory that represents test performance over nonnegative closed intervals such as [0, 100] or [0, n] and demonstrated that optimal scoring of binary test data yielded substantial improvements in point-wise root-mean-squared error and bias over number-right or sum scoring. We extend these…
Descriptors: Scoring, Weighted Scores, Item Response Theory, Intervals
Robitzsch, Alexander; Lüdtke, Oliver – Journal of Educational and Behavioral Statistics, 2022
One of the primary goals of international large-scale assessments in education is the comparison of country means in student achievement. This article introduces a framework for discussing differential item functioning (DIF) for such mean comparisons. We compare three different linking methods: concurrent scaling based on full invariance,…
Descriptors: Test Bias, International Assessment, Scaling, Comparative Analysis
Casabianca, Jodi M.; Lewis, Charles – Journal of Educational and Behavioral Statistics, 2018
The null hypothesis test used in differential item functioning (DIF) detection tests for a subgroup difference in item-level performance: if the null hypothesis of "no DIF" is rejected, the item is flagged for DIF. Conversely, an item is kept in the test form if there is insufficient evidence of DIF. We present frequentist and empirical…
Descriptors: Test Bias, Hypothesis Testing, Bayesian Statistics, Statistical Analysis
Ackerman, Terry – Journal of Educational and Behavioral Statistics, 2016
In this commentary, Terry Ackerman, associate dean of research and assessment at the University of North Carolina's School of Education, poses questions and shares his thoughts on David Thissen's essay, "Bad Questions: An Essay Involving Item Response Theory" (this issue). Ackerman begins by considering the two purposes of Item Response…
Descriptors: Item Response Theory, Test Items, Selection, Scores
Berger, Moritz; Tutz, Gerhard – Journal of Educational and Behavioral Statistics, 2016
Detection of differential item functioning (DIF) by use of the logistic modeling approach has a long tradition. One big advantage of the approach is that it can be used to investigate nonuniform (NUDIF) as well as uniform DIF (UDIF). The classical approach allows one to detect DIF by distinguishing between multiple groups. We propose an…
Descriptors: Test Bias, Regression (Statistics), Nonparametric Statistics, Statistical Analysis
Magis, David; Tuerlinckx, Francis; De Boeck, Paul – Journal of Educational and Behavioral Statistics, 2015
This article proposes a novel approach to detect differential item functioning (DIF) among dichotomously scored items. Unlike standard DIF methods that perform an item-by-item analysis, we propose the "LR lasso DIF method": a logistic regression (LR) model is formulated for all item responses. The model contains item-specific intercepts,…
Descriptors: Test Bias, Test Items, Regression (Statistics), Scores
Andrich, David; Hagquist, Curt – Journal of Educational and Behavioral Statistics, 2012
The literature in modern test theory on procedures for identifying items with differential item functioning (DIF) among two groups of persons includes the Mantel-Haenszel (MH) procedure. Generally, it is not recognized explicitly that if there is real DIF in some items which favor one group, then as an artifact of this procedure, artificial DIF…
Descriptors: Test Bias, Test Items, Item Response Theory, Statistical Analysis
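The Mantel-Haenszel (MH) procedure discussed above pools 2×2 group-by-response tables across total-score strata into a common odds ratio; values far from 1 flag the item for DIF. The counts below are hypothetical, intended only to show the computation.

```python
# Mantel-Haenszel common odds ratio for one item, stratified by total
# score. Each stratum holds a 2x2 table with hypothetical counts:
# (A, B, C, D) = (ref correct, ref incorrect, focal correct, focal incorrect)
tables = [
    (30, 10, 25, 15),
    (40, 20, 35, 25),
    (20, 30, 15, 35),
]

num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
alpha_mh = num / den   # > 1: the item favors the reference group
print(round(alpha_mh, 2))  # → 1.56
```

Note the artifact the abstract points to: because matching is on the observed total score, real DIF in some items contaminates the matching variable and can induce artificial DIF in others.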
Jeon, Minjeong; Rijmen, Frank; Rabe-Hesketh, Sophia – Journal of Educational and Behavioral Statistics, 2013
The authors present a generalization of the multiple-group bifactor model that extends the classical bifactor model for categorical outcomes by relaxing the typical assumption of independence of the specific dimensions. In addition to the means and variances of all dimensions, the correlations among the specific dimensions are allowed to differ…
Descriptors: Test Bias, Generalization, Models, Item Response Theory
Bennink, Margot; Croon, Marcel A.; Keuning, Jos; Vermunt, Jeroen K. – Journal of Educational and Behavioral Statistics, 2014
In educational measurement, responses of students on items are used not only to measure the ability of students, but also to evaluate and compare the performance of schools. Analysis should ideally account for the multilevel structure of the data, and school-level processes not related to ability, such as working climate and administration…
Descriptors: Academic Ability, Educational Assessment, Educational Testing, Test Bias