ERIC - Search Results

Showing all 13 results

Investigating Approaches to Controlling Item Position Effects in Computerized Adaptive Tests

Peer reviewed

Ma, Ye; Harris, Deborah J. – Educational Measurement: Issues and Practice, 2025

Item position effect (IPE) refers to situations where an item performs differently when it is administered in different positions on a test. The majority of previous research studies have focused on investigating IPE under linear testing. There is a lack of IPE research under adaptive testing. In addition, the existence of IPE might violate Item…

Descriptors: Computer Assisted Testing, Adaptive Testing, Item Response Theory, Test Items

An Investigation of the Nature and Consequence of the Relationship between IRT Difficulty and Discrimination

Peer reviewed

Sweeney, Sandra M.; Sinharay, Sandip; Johnson, Matthew S.; Steinhauer, Eric W. – Educational Measurement: Issues and Practice, 2022

The focus of this paper is on the empirical relationship between item difficulty and item discrimination. Two studies--an empirical investigation and a simulation study--were conducted to examine the association between item difficulty and item discrimination under classical test theory and item response theory (IRT), and the effects of the…

Descriptors: Correlation, Item Response Theory, Item Analysis, Difficulty Level

Modeling Slipping Effects in a Large-Scale Assessment with Innovative Item Formats

Peer reviewed

Cuhadar, Ismail; Binici, Salih – Educational Measurement: Issues and Practice, 2022

This study employs the 4-parameter logistic item response theory model to account for the unexpected incorrect responses or slipping effects observed in a large-scale Algebra 1 End-of-Course assessment, including several innovative item formats. It investigates whether modeling the misfit at the upper asymptote has any practical impact on the…

Descriptors: Item Response Theory, Measurement, Student Evaluation, Algebra

A Longitudinal Diagnostic Model with Hierarchical Learning Trajectories

Peer reviewed

Zhan, Peida; He, Keren – Educational Measurement: Issues and Practice, 2021

In learning diagnostic assessments, the attribute hierarchy specifies a sequential network of interrelated attribute mastery processes, which makes a test blueprint consistent with the cognitive theory. One of the most important functions of attribute hierarchy is to guide or limit the developmental direction of students and then form a…

Descriptors: Longitudinal Studies, Models, Comparative Analysis, Diagnostic Tests

Digital Module 13: Monte Carlo Simulation Studies in Item Response Theory

Peer reviewed

Leventhal, Brian; Ames, Allison – Educational Measurement: Issues and Practice, 2020

In this digital ITEMS module, Dr. Brian Leventhal and Dr. Allison Ames provide an overview of "Monte Carlo simulation studies" (MCSS) in "item response theory" (IRT). MCSS are utilized for a variety of reasons, one of the most compelling being that they can be used when analytic solutions are impractical or nonexistent because…

Descriptors: Item Response Theory, Monte Carlo Methods, Simulation, Test Items

How Well Does the Sum Score Summarize the Test? Summability as a Measure of Internal Consistency

Peer reviewed

Goeman, J. J.; De Jong, N. H. – Educational Measurement: Issues and Practice, 2018

Many researchers use Cronbach's alpha to demonstrate internal consistency, even though it has been shown numerous times that Cronbach's alpha is not suitable for this. Because the intention of questionnaire and test constructors is to summarize the test by its overall sum score, we advocate summability, which we define as the proportion of total…

Descriptors: Tests, Scores, Questionnaires, Measurement

A Model-Data-Fit-Informed Approach to Score Resolution in Performance Assessments

Peer reviewed

Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021

Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…

Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making

Systematic Comparison of Decision Accuracy of Complex Compensatory Decision Rules Combining Multiple Tests in a Higher Education Context

Peer reviewed

Yocarini, Iris E.; Bouwmeester, Samantha; Smeets, Guus; Arends, Lidia R. – Educational Measurement: Issues and Practice, 2018

This real-data-guided simulation study systematically evaluated the decision accuracy of complex decision rules combining multiple tests within different realistic curricula. Specifically, complex decision rules combining conjunctive aspects and compensatory aspects were evaluated. A conjunctive aspect requires a minimum level of performance,…

Descriptors: Comparative Analysis, Decision Making, Accuracy, Higher Education

A Technical Note on IRT Simulation Studies: Dealing with Truth, Estimates, Observed Data, and Residuals

Peer reviewed

Luecht, Richard; Ackerman, Terry A. – Educational Measurement: Issues and Practice, 2018

Simulation studies are extremely common in the item response theory (IRT) research literature. This article presents a didactic discussion of "truth" and "error" in IRT-based simulation studies. We ultimately recommend that future research focus less on the simple recovery of parameters from a convenient generating IRT model,…

Descriptors: Item Response Theory, Simulation, Ethics, Error of Measurement

Digital Module 08: Foundations of Operational Item Analysis https://ncme.elevate.commpartners.com

Peer reviewed

Yoo, Hanwook; Hambleton, Ronald K. – Educational Measurement: Issues and Practice, 2019

Item analysis is an integral part of operational test development and is typically conducted within two popular statistical frameworks: classical test theory (CTT) and item response theory (IRT). In this digital ITEMS module, Hanwook Yoo and Ronald K. Hambleton provide an accessible overview of operational item analysis approaches within these…

Descriptors: Item Analysis, Item Response Theory, Guidelines, Test Construction

Reliably Assessing Growth with Longitudinal Diagnostic Classification Models

Peer reviewed

Madison, Matthew J. – Educational Measurement: Issues and Practice, 2019

Recent advances have enabled diagnostic classification models (DCMs) to accommodate longitudinal data. These longitudinal DCMs were developed to study how examinees change, or transition, between different attribute mastery statuses over time. This study examines using longitudinal DCMs as an approach to assessing growth and serves three purposes:…

Descriptors: Longitudinal Studies, Item Response Theory, Psychometrics, Criterion Referenced Tests

Measuring Widening Proficiency Differences in International Assessments: Are Current Approaches Enough?

Peer reviewed

Rutkowski, David; Rutkowski, Leslie; Liaw, Yuan-Ling – Educational Measurement: Issues and Practice, 2018

Participation in international large-scale assessments has grown over time with the largest, the Programme for International Student Assessment (PISA), including more than 70 education systems that are economically and educationally diverse. To help accommodate for large achievement differences among participants, in 2009 PISA offered…

Descriptors: Educational Assessment, Foreign Countries, Achievement Tests, Secondary School Students

Five Methods for Estimating Angoff Cut Scores with IRT

Peer reviewed

Wyse, Adam E. – Educational Measurement: Issues and Practice, 2017

This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a priori (EAP), modal a priori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…

Descriptors: Cutting Scores, Item Response Theory, Bayesian Statistics, Maximum Likelihood Statistics
