ERIC - Search Results

Showing all 9 results

Comparison of Methods for Identifying Differential Step Functioning with Polytomous Item Response Data

Peer reviewed

Finch, Holmes – Applied Measurement in Education, 2022

Much research has been devoted to identification of differential item functioning (DIF), which occurs when the item responses for individuals from two groups differ after they are conditioned on the latent trait being measured by the scale. There has been less work examining differential step functioning (DSF), which is present for polytomous…

Descriptors: Comparative Analysis, Item Response Theory, Item Analysis, Simulation

Maintaining Score Scales over Time: A Comparison of Five Scoring Methods

Peer reviewed

Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023

This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…

Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation

Equating with Small and Unbalanced Samples

Peer reviewed

Goodman, Joshua T.; Dallas, Andrew D.; Fan, Fen – Applied Measurement in Education, 2020

Recent research has suggested that re-setting the standard for each administration of a small sample examination, in addition to the high cost, does not adequately maintain similar performance expectations year after year. Small-sample equating methods have shown promise with samples between 20 and 30. For groups that have fewer than 20 students,…

Descriptors: Equated Scores, Sample Size, Sampling, Weighted Scores

Leveraging Item Parameter Drift to Assess Transfer Effects in Vocabulary Learning

Peer reviewed

Gilbert, Joshua B.; Kim, James S.; Miratrix, Luke W. – Applied Measurement in Education, 2024

Longitudinal models typically emphasize between-person predictors of change but ignore how growth varies "within" persons because each person contributes only one data point at each time. In contrast, modeling growth with multi-item assessments allows evaluation of how relative item performance may shift over time. While traditionally…

Descriptors: Vocabulary Development, Item Response Theory, Test Items, Student Development

Impact of Item Parameter Drift on Rasch Scale Stability in Small Samples over Multiple Administrations

Peer reviewed

Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020

Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…

Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling

Comparing the Robustness of Three Nonparametric DIF Procedures to Differential Rapid Guessing

Peer reviewed

Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022

When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…

Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis

The Effect of Anchor Test Construction on Scale Drift

Peer reviewed

Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014

In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…

Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory

A Bootstrap Generalization of Modified Parallel Analysis for IRT Dimensionality Assessment

Peer reviewed

Finch, Holmes; Monahan, Patrick – Applied Measurement in Education, 2008

This article introduces a bootstrap generalization to the Modified Parallel Analysis (MPA) method of test dimensionality assessment using factor analysis. This methodology, based on the use of Marginal Maximum Likelihood nonlinear factor analysis, provides for the calculation of a test statistic based on a parametric bootstrap using the MPA…

Descriptors: Monte Carlo Methods, Factor Analysis, Generalization, Methods

Estimating Conditional Standard Errors of Measurement for Tests Composed of Testlets

Peer reviewed

Lee, Guemin – Applied Measurement in Education, 2000

Investigated incorporating a testlet definition into the estimation of the conditional standard error of measurement (SEM) for tests composed of testlets using five conditional SEM estimation methods. Results from 3,876 tests from the Iowa Tests of Basic Skills and 1,000 simulated responses show that item-based methods provide lower conditional…

Descriptors: Error of Measurement, Estimation (Mathematics), Simulation, Test Construction