Showing 1 to 15 of 24 results
Peer reviewed
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
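For readers unfamiliar with characteristic curve linking, the sketch below illustrates the general idea behind a Haebara-type criterion: find scale constants A and B that minimize the squared distance between item characteristic curves computed from the two sets of parameter estimates, optionally weighting that distance by item information. The weighting shown is only an illustrative assumption, not the information-weighted methods proposed by the authors.

```python
# Sketch of a Haebara-type characteristic-curve linking criterion for 2PL items,
# with an optional item-information weight.  Purely illustrative; this is not
# the authors' information-weighted method.
import numpy as np
from scipy.optimize import minimize

def p2pl(theta, a, b):
    """2PL item response surface: persons (rows) by items (columns)."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))

def haebara_loss(AB, a_new, b_new, a_old, b_old, theta, weight_by_info=False):
    A, B = AB
    a_star, b_star = a_new / A, A * b_new + B          # new form -> old scale
    p_star, p_old = p2pl(theta, a_star, b_star), p2pl(theta, a_old, b_old)
    w = a_old**2 * p_old * (1 - p_old) if weight_by_info else 1.0  # assumed weight
    return np.sum(w * (p_star - p_old) ** 2)

# Made-up common-item estimates whose scales differ by A = 1.1, B = -0.2.
rng = np.random.default_rng(0)
a_old, b_old = rng.uniform(0.8, 2.0, 10), rng.normal(0.0, 1.0, 10)
a_new, b_new = 1.1 * a_old, (b_old + 0.2) / 1.1
theta = np.linspace(-4, 4, 41)
res = minimize(haebara_loss, x0=[1.0, 0.0],
               args=(a_new, b_new, a_old, b_old, theta, True))
print(res.x)  # approximately [1.1, -0.2]
```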
Peer reviewed
Kim, Kyung Yong; Lee, Won-Chan – Applied Measurement in Education, 2017
This article provides a detailed description of three factors (specification of the ability distribution, numerical integration, and frame of reference for the item parameter estimates) that might affect the item parameter estimation of the three-parameter logistic model, and compares five item calibration methods, which are combinations of the…
Descriptors: Test Items, Item Response Theory, Comparative Analysis, Methods
Peer reviewed
Finch, Holmes; French, Brian F. – Applied Measurement in Education, 2019
The usefulness of item response theory (IRT) models depends, in large part, on the accuracy of item and person parameter estimates. For the standard 3 parameter logistic model, for example, these parameters include the item parameters of difficulty, discrimination, and pseudo-chance, as well as the person ability parameter. Several factors impact…
Descriptors: Item Response Theory, Accuracy, Test Items, Difficulty Level
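For reference, the three-parameter logistic (3PL) model named in this abstract gives the probability of a correct response as a function of ability and the three item parameters; a minimal version:

```python
# The standard three-parameter logistic (3PL) model: probability of a correct
# response given ability theta and item parameters a (discrimination),
# b (difficulty), and c (pseudo-chance lower asymptote).
import numpy as np

def p_3pl(theta, a, b, c):
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

print(p_3pl(theta=0.0, a=1.2, b=0.5, c=0.2))  # about 0.48
```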
Peer reviewed
Finch, W. Holmes – Applied Measurement in Education, 2016
Differential item functioning (DIF) assessment is a crucial component in test construction, serving as the primary way in which instrument developers ensure that measures perform in the same way for multiple groups within the population. When such is not the case, scores may not accurately reflect the trait of interest for all individuals in the…
Descriptors: Test Bias, Monte Carlo Methods, Comparative Analysis, Population Groups
Peer reviewed
Lee, Wooyeol; Cho, Sun-Joo – Applied Measurement in Education, 2017
Utilizing a longitudinal item response model, this study investigated the effect of item parameter drift (IPD) on item parameters and person scores via a Monte Carlo study. Item parameter recovery was investigated for various IPD patterns in terms of bias and root mean-square error (RMSE), and percentage of time the 95% confidence interval covered…
Descriptors: Item Response Theory, Test Items, Bias, Computation
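The recovery criteria mentioned in the abstract (bias, RMSE, and confidence-interval coverage) are standard simulation summaries; the function below is a generic sketch with illustrative variable names, not the study's code.

```python
# Generic recovery summaries across simulation replications: bias, RMSE, and
# the proportion of replications whose 95% confidence interval covers the
# generating (true) parameter value.  Variable names are illustrative.
import numpy as np

def recovery_summary(estimates, std_errors, true_value):
    estimates = np.asarray(estimates)
    std_errors = np.asarray(std_errors)
    bias = np.mean(estimates - true_value)
    rmse = np.sqrt(np.mean((estimates - true_value) ** 2))
    lower = estimates - 1.96 * std_errors
    upper = estimates + 1.96 * std_errors
    coverage = np.mean((lower <= true_value) & (true_value <= upper))
    return bias, rmse, coverage

# e.g. 100 replications of a difficulty estimate whose true value is 0.4
rng = np.random.default_rng(1)
print(recovery_summary(rng.normal(0.4, 0.1, 100), np.full(100, 0.1), 0.4))
```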
Peer reviewed
Liang, Tie; Wells, Craig S. – Applied Measurement in Education, 2015
Investigating the fit of a parametric model plays a vital role in validating an item response theory (IRT) model. An area that has received little attention is the assessment of multiple IRT models used in a mixed-format test. The present study extends the nonparametric approach, proposed by Douglas and Cohen (2001), to assess model fit of three…
Descriptors: Nonparametric Statistics, Goodness of Fit, Item Response Theory, Test Format
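As general background on nonparametric fit assessment of this kind: the parametric item characteristic curve is compared against a kernel-smoothed regression of item scores on ability estimates, and the discrepancy is summarized by a root integrated squared error. The bandwidth and summary below are simplified assumptions, not the study's implementation.

```python
# Simplified version of the idea: estimate a nonparametric item characteristic
# curve by kernel-smoothing item scores against ability estimates, then compare
# it with the parametric curve via a root integrated squared error.
import numpy as np

def kernel_icc(theta_hat, scores, grid, bandwidth=0.3):
    """Nadaraya-Watson regression of item scores on ability, evaluated on grid."""
    w = np.exp(-0.5 * ((grid[:, None] - theta_hat[None, :]) / bandwidth) ** 2)
    return (w * scores[None, :]).sum(axis=1) / w.sum(axis=1)

def rise(parametric_curve, nonparametric_curve):
    """Root integrated squared error between the two curves on the same grid."""
    return np.sqrt(np.mean((parametric_curve - nonparametric_curve) ** 2))
```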
Peer reviewed
Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018
Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…
Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education
Peer reviewed
Suh, Youngsuk; Talley, Anna E. – Applied Measurement in Education, 2015
This study compared and illustrated four differential distractor functioning (DDF) detection methods for analyzing multiple-choice items. The log-linear approach, two item response theory-model-based approaches with likelihood ratio tests, and the odds ratio approach were compared to examine the congruence among the four DDF detection methods.…
Descriptors: Test Bias, Multiple Choice Tests, Test Items, Methods
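A bare-bones illustration of the odds ratio idea for differential distractor functioning: among examinees who missed the item, compare the odds of choosing a particular distractor in the reference group versus the focal group. Real DDF analyses typically condition on total score or ability, which this sketch omits.

```python
# Odds ratio for one distractor, computed among examinees who answered the item
# incorrectly in the reference and focal groups.  Conditioning on total score
# or ability, as a real DDF analysis would do, is omitted here.
import numpy as np

def distractor_odds_ratio(wrong_ref, wrong_foc, distractor):
    """wrong_ref / wrong_foc: arrays of the wrong options each group selected."""
    a = np.sum(wrong_ref == distractor)   # reference group, target distractor
    b = np.sum(wrong_ref != distractor)   # reference group, other distractors
    c = np.sum(wrong_foc == distractor)   # focal group, target distractor
    d = np.sum(wrong_foc != distractor)   # focal group, other distractors
    return (a * d) / (b * c)

print(distractor_odds_ratio(np.array(list("BBCBD")), np.array(list("CCCBD")), "B"))  # 6.0
```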
Peer reviewed
Koziol, Natalie A. – Applied Measurement in Education, 2016
Testlets, or groups of related items, are commonly included in educational assessments due to their many logistical and conceptual advantages. Despite their advantages, testlets introduce complications into the theory and practice of educational measurement. Responses to items within a testlet tend to be correlated even after controlling for…
Descriptors: Classification, Accuracy, Comparative Analysis, Models
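The local dependence described above can be made concrete with a toy simulation: a person-by-testlet effect is added to ability when generating responses, so items in the same testlet remain positively associated even after the common ability is accounted for. A Rasch-type testlet model is assumed here purely for illustration.

```python
# Toy simulation of testlet-induced local dependence: a person-by-testlet effect
# gamma is added to ability when generating responses, so residuals from a model
# that ignores the testlet effect stay correlated across items in the testlet.
import numpy as np

rng = np.random.default_rng(2)
n_persons, n_items = 2000, 4
theta = rng.normal(0.0, 1.0, n_persons)          # ability
gamma = rng.normal(0.0, 0.8, n_persons)          # person-specific testlet effect
b = rng.normal(0.0, 1.0, n_items)                # difficulties of testlet items

p_true = 1 / (1 + np.exp(-(theta[:, None] + gamma[:, None] - b[None, :])))
x = rng.binomial(1, p_true)                      # observed responses

# Residuals from a model that conditions on ability but ignores the testlet:
p_no_testlet = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
resid = x - p_no_testlet
print(np.corrcoef(resid[:, 0], resid[:, 1])[0, 1])   # clearly positive
```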
Peer reviewed
Ho, Tsung-Han; Dodd, Barbara G. – Applied Measurement in Education, 2012
In this study we compared five item selection procedures using three ability estimation methods in the context of a mixed-format adaptive test based on the generalized partial credit model. The item selection procedures used were maximum posterior weighted information, maximum expected information, maximum posterior weighted Kullback-Leibler…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection
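As a simplified stand-in for the selection criteria compared in the study, the sketch below picks the unused item with the largest Fisher information at the provisional ability estimate under the generalized partial credit model; the posterior-weighted and Kullback-Leibler criteria themselves are not reproduced.

```python
# Simplified information-based item selection under the generalized partial
# credit model (GPCM): choose the unused item with the largest Fisher
# information at the provisional ability estimate.
import numpy as np

def gpcm_probs(theta, a, b_steps):
    """Category probabilities for one GPCM item (b_steps are step parameters)."""
    cum = np.concatenate(([0.0], np.cumsum(a * (theta - b_steps))))
    e = np.exp(cum - cum.max())
    return e / e.sum()

def gpcm_info(theta, a, b_steps):
    p = gpcm_probs(theta, a, b_steps)
    k = np.arange(len(p))
    return a**2 * (np.sum(k**2 * p) - np.sum(k * p) ** 2)

def select_item(theta_hat, item_bank, administered):
    """item_bank: list of (a, b_steps) tuples; returns the most informative unused item."""
    info = [gpcm_info(theta_hat, a, b) if i not in administered else -np.inf
            for i, (a, b) in enumerate(item_bank)]
    return int(np.argmax(info))

bank = [(1.0, np.array([-0.5, 0.5])),
        (1.4, np.array([0.0, 1.0])),
        (0.8, np.array([-1.0, 0.0]))]
print(select_item(theta_hat=0.2, item_bank=bank, administered={1}))
```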
Peer reviewed
Murphy, Daniel L.; Beretvas, S. Natasha – Applied Measurement in Education, 2015
This study examines the use of cross-classified random effects models (CCrem) and cross-classified multiple membership random effects models (CCMMrem) to model rater bias and estimate teacher effectiveness. Effect estimates are compared using classical test theory (CTT) versus item response theory (IRT) scaling methods and three models (i.e., conventional multilevel…
Descriptors: Teacher Effectiveness, Comparative Analysis, Hierarchical Linear Modeling, Test Theory
Peer reviewed
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Do subscores provide additional information beyond what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
Peer reviewed
Finch, Holmes; Stage, Alan Kirk; Monahan, Patrick – Applied Measurement in Education, 2008
A primary assumption underlying several of the common methods for modeling item response data is unidimensionality, that is, that test items tap into only one latent trait. This assumption can be assessed in several ways, including nonlinear factor analysis and DETECT, a method based on the item conditional covariances. When multidimensionality is identified,…
Descriptors: Test Items, Factor Analysis, Item Response Theory, Comparative Analysis
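For context, the item conditional covariances that DETECT builds on can be approximated as below: covariances between an item pair computed within groups of examinees who share the same rest score, averaged over groups. The full DETECT index, which averages signed conditional covariances over a cluster partition, is not implemented here.

```python
# Weighted-average conditional covariance for an item pair, the building block
# of DETECT-style dimensionality analysis.
import numpy as np

def conditional_covariance(x, i, j):
    """x: persons-by-items 0/1 matrix; returns cov(item i, item j | rest score)."""
    rest = np.delete(x, [i, j], axis=1).sum(axis=1)
    covs, weights = [], []
    for s in np.unique(rest):
        grp = rest == s
        if grp.sum() > 1:
            covs.append(np.cov(x[grp, i], x[grp, j])[0, 1])
            weights.append(grp.sum())
    return np.average(covs, weights=weights)
```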
Peer reviewed
Lee, Won-Chan; Ban, Jae-Chun – Applied Measurement in Education, 2010
Various applications of item response theory often require linking to achieve a common scale for item parameter estimates obtained from different groups. This article used a simulation to examine the relative performance of four different item response theory (IRT) linking procedures in a random groups equating design: concurrent calibration with…
Descriptors: Item Response Theory, Simulation, Comparative Analysis, Measurement Techniques
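As general background on separate-calibration linking (not a description of this study's four procedures), one common step is the mean/sigma method: scale constants are derived from the means and standard deviations of common-item difficulty estimates and then applied to the new form's parameters.

```python
# Mean/sigma scale transformation for separate-calibration IRT linking.
# Shown only as background; the study's procedures are not reproduced.
import numpy as np

def mean_sigma(b_old, b_new):
    """Constants that put new-form estimates onto the old-form metric."""
    A = np.std(b_old, ddof=1) / np.std(b_new, ddof=1)
    B = np.mean(b_old) - A * np.mean(b_new)
    return A, B

def rescale(a_new, b_new, A, B):
    return a_new / A, A * b_new + B

b_old = np.array([-1.2, -0.3, 0.4, 1.1])
b_new = np.array([-1.0, -0.2, 0.5, 1.2])
A, B = mean_sigma(b_old, b_new)
print(A, B)
print(rescale(np.array([1.0, 1.3, 0.9, 1.5]), b_new, A, B))
```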
Peer reviewed
Noell, Jay; Ginsburg, Alan – Applied Measurement in Education, 2009
The report, "Evaluation of the National Assessment of Educational Progress", provides a number of recommendations for addressing validity concerns about NAEP. This article identifies actions that could be taken by the Congress, the National Center for Education Statistics, and the National Assessment Governing Board--which share responsibility for…
Descriptors: National Competency Tests, Federal Government, Public Agencies, Test Validity