Publication Date
In 2025: 0
Since 2024: 1
Since 2021 (last 5 years): 1
Since 2016 (last 10 years): 5
Since 2006 (last 20 years): 8
Descriptor
Comparative Analysis: 9
Monte Carlo Methods: 9
Item Response Theory: 7
Test Items: 5
Accuracy: 4
Bias: 2
Computation: 2
Difficulty Level: 2
Error of Measurement: 2
Maximum Likelihood Statistics: 2
Models: 2
Source
Applied Measurement in…: 9
Author
Finch, Holmes: 2
Beretvas, S. Natasha: 1
Bolt, Daniel M.: 1
Cho, Sun-Joo: 1
Finch, W. Holmes: 1
French, Brian F.: 1
Koziol, Natalie A.: 1
Lee, Wooyeol: 1
Liang, Tie: 1
Lixin Yuan: 1
Minqiang Zhang: 1
Publication Type
Journal Articles: 9
Reports - Research: 9
Education Level
Elementary Education: 1
Grade 4: 1
Grade 5: 1
Grade 6: 1
Grade 7: 1
Grade 8: 1
Intermediate Grades: 1
Junior High Schools: 1
Middle Schools: 1
Secondary Education: 1
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
Finch, Holmes; French, Brian F. – Applied Measurement in Education, 2019
The usefulness of item response theory (IRT) models depends, in large part, on the accuracy of item and person parameter estimates. For the standard three-parameter logistic (3PL) model, for example, these parameters include the item parameters of difficulty, discrimination, and pseudo-chance, as well as the person ability parameter. Several factors impact…
Descriptors: Item Response Theory, Accuracy, Test Items, Difficulty Level
Finch, W. Holmes – Applied Measurement in Education, 2016
Differential item functioning (DIF) assessment is a crucial component in test construction, serving as the primary way in which instrument developers ensure that measures perform in the same way for multiple groups within the population. When such is not the case, scores may not accurately reflect the trait of interest for all individuals in the…
Descriptors: Test Bias, Monte Carlo Methods, Comparative Analysis, Population Groups
Lee, Wooyeol; Cho, Sun-Joo – Applied Measurement in Education, 2017
Utilizing a longitudinal item response model, this study investigated the effect of item parameter drift (IPD) on item parameters and person scores via a Monte Carlo study. Item parameter recovery was investigated for various IPD patterns in terms of bias and root mean-square error (RMSE), and percentage of time the 95% confidence interval covered…
Descriptors: Item Response Theory, Test Items, Bias, Computation
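The bias and root mean-square error (RMSE) criteria this abstract mentions are standard summaries of Monte Carlo parameter recovery; a minimal sketch (with hypothetical replication values) might look like:

```python
import math

def recovery_stats(estimates, true_value):
    """Bias and RMSE of a parameter estimate across Monte Carlo replications.

    Bias is the mean signed error; RMSE is the square root of the
    mean squared error, so it also reflects estimator variability.
    """
    n = len(estimates)
    bias = sum(e - true_value for e in estimates) / n
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return bias, rmse

# Hypothetical replications of a difficulty estimate whose true value is 0.5
est = [0.48, 0.55, 0.52, 0.45, 0.50]
bias, rmse = recovery_stats(est, 0.5)
```

Errors of opposite sign cancel in the bias but not in the RMSE, which is why recovery studies typically report both.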
Liang, Tie; Wells, Craig S. – Applied Measurement in Education, 2015
Investigating the fit of a parametric model plays a vital role in validating an item response theory (IRT) model. An area that has received little attention is the assessment of multiple IRT models used in a mixed-format test. The present study extends the nonparametric approach, proposed by Douglas and Cohen (2001), to assess model fit of three…
Descriptors: Nonparametric Statistics, Goodness of Fit, Item Response Theory, Test Format
Koziol, Natalie A. – Applied Measurement in Education, 2016
Testlets, or groups of related items, are commonly included in educational assessments due to their many logistical and conceptual advantages. Despite their advantages, testlets introduce complications into the theory and practice of educational measurement. Responses to items within a testlet tend to be correlated even after controlling for…
Descriptors: Classification, Accuracy, Comparative Analysis, Models
Murphy, Daniel L.; Beretvas, S. Natasha – Applied Measurement in Education, 2015
This study examines the use of cross-classified random effects models (CCrem) and cross-classified multiple membership random effects models (CCMMrem) to model rater bias and estimate teacher effectiveness. Effect estimates are compared using classical test theory (CTT) versus item response theory (IRT) scaling methods and three models (i.e., conventional multilevel…
Descriptors: Teacher Effectiveness, Comparative Analysis, Hierarchical Linear Modeling, Test Theory
Finch, Holmes; Stage, Alan Kirk; Monahan, Patrick – Applied Measurement in Education, 2008
A primary assumption underlying several of the common methods for modeling item response data is unidimensionality, that is, test items tap into only one latent trait. This assumption can be assessed several ways, using nonlinear factor analysis and DETECT, a method based on the item conditional covariances. When multidimensionality is identified,…
Descriptors: Test Items, Factor Analysis, Item Response Theory, Comparative Analysis
Bolt, Daniel M. – Applied Measurement in Education, 2002
This study compared two parametric procedures for detecting differential item functioning (DIF) using the graded response model (GRM), the GRM-likelihood ratio test and the GRM-differential functioning of items and tests, with a nonparametric DIF detection procedure, Poly-SIBTEST. Monte Carlo simulation results indicate that Poly-SIBTEST showed the least amount…
Descriptors: Comparative Analysis, Item Bias, Monte Carlo Methods, Nonparametric Statistics