Publication Date
In 2025 | 1 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 5 |
Since 2016 (last 10 years) | 16 |
Since 2006 (last 20 years) | 34 |
Descriptor
Difficulty Level | 53 |
Item Response Theory | 53 |
Simulation | 53 |
Test Items | 45 |
Sample Size | 13 |
Comparative Analysis | 12 |
Models | 12 |
Computation | 11 |
Equated Scores | 10 |
Correlation | 9 |
Error of Measurement | 9 |
Author
Jin, Kuan-Yu | 3 |
Wang, Wen-Chung | 3 |
Li, Yuan H. | 2 |
Sinharay, Sandip | 2 |
Adams, Raymond J. | 1 |
Aiman Mohammad Freihat | 1 |
Albano, Anthony D. | 1 |
Antal, Judit | 1 |
Atar, Burcu | 1 |
Atar, Hakan Yavuz | 1 |
Berger, Martijn P. F. | 1 |
Publication Type
Journal Articles | 38 |
Reports - Research | 28 |
Reports - Evaluative | 20 |
Speeches/Meeting Papers | 9 |
Dissertations/Theses -… | 4 |
Reports - Descriptive | 1 |
Education Level
Elementary Education | 1 |
Assessments and Surveys
Test of English as a Foreign… | 1 |
Aiman Mohammad Freihat; Omar Saleh Bani Yassin – Educational Process: International Journal, 2025
Background/purpose: This study aimed to reveal the accuracy of estimating multiple-choice test item parameters under item response theory models in measurement. Materials/methods: The researchers relied on measurement accuracy indicators, which express the absolute difference between the estimated and actual values of the…
Descriptors: Accuracy, Computation, Multiple Choice Tests, Test Items
Sweeney, Sandra M.; Sinharay, Sandip; Johnson, Matthew S.; Steinhauer, Eric W. – Educational Measurement: Issues and Practice, 2022
The focus of this paper is on the empirical relationship between item difficulty and item discrimination. Two studies--an empirical investigation and a simulation study--were conducted to examine the association between item difficulty and item discrimination under classical test theory and item response theory (IRT), and the effects of the…
Descriptors: Correlation, Item Response Theory, Item Analysis, Difficulty Level
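The relationship Sweeney et al. examine can be illustrated with a small simulation: generate item responses from a 2PL IRT model, then compute each item's classical difficulty (proportion correct) and discrimination (item-total point-biserial). This is a minimal sketch, not the authors' design; the item parameters and sample size below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2PL parameters (a = discrimination, b = difficulty).
a = np.array([0.8, 1.2, 1.5, 2.0])
b = np.array([-1.0, 0.0, 0.5, 1.5])
theta = rng.normal(0, 1, size=5000)          # simulated examinee abilities

# P(correct) under the 2PL model, then Bernoulli response draws.
p = 1 / (1 + np.exp(-a * (theta[:, None] - b)))
x = (rng.random(p.shape) < p).astype(int)    # 5000 x 4 response matrix

# Classical test theory statistics for each item.
p_value = x.mean(axis=0)                     # CTT difficulty: proportion correct
total = x.sum(axis=1)                        # total score per examinee
pt_biserial = np.array([np.corrcoef(x[:, i], total)[0, 1]
                        for i in range(x.shape[1])])

print(np.round(p_value, 2))
print(np.round(pt_biserial, 2))
```

Harder items (larger b) yield lower proportions correct, so the classical and IRT difficulty indices move together; the point-biserials track the a parameters less directly, which is part of what makes the empirical association worth studying.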
Wyse, Adam E.; McBride, James R. – Journal of Educational Measurement, 2021
A key consideration when giving any computerized adaptive test (CAT) is how much adaptation is present when the test is used in practice. This study introduces a new framework to measure the amount of adaptation of Rasch-based CATs based on the differences between the selected item locations (Rasch item difficulty parameters) of the…
Descriptors: Item Response Theory, Computer Assisted Testing, Adaptive Testing, Test Items
Derek Sauder – ProQuest LLC, 2020
The Rasch model is commonly used to calibrate multiple choice items. However, the sample sizes needed to estimate the Rasch model can be difficult to attain (e.g., consider a small testing company trying to pretest new items). With small sample sizes, auxiliary information besides the item responses may improve estimation of the item parameters.…
Descriptors: Item Response Theory, Sample Size, Computation, Test Length
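The small-sample problem Sauder describes is easy to see in simulation: calibrate the same Rasch items at two sample sizes and compare recovery of the true difficulties. The sketch below uses a crude logit-of-proportion-correct estimate rather than a full Rasch calibration (or the auxiliary-information approach of the dissertation); all parameter values are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_rasch(theta, b, rng):
    """Bernoulli responses under the Rasch model P = 1/(1+exp(-(theta-b)))."""
    p = 1 / (1 + np.exp(-(theta[:, None] - b)))
    return (rng.random(p.shape) < p).astype(int)

b_true = np.array([-1.5, -0.5, 0.0, 0.5, 1.5])   # true item difficulties

for n in (100, 1000):                     # small vs. larger calibration sample
    theta = rng.normal(0, 1, size=n)
    x = simulate_rasch(theta, b_true, rng)
    pc = x.mean(axis=0).clip(0.01, 0.99)  # proportion correct per item
    b_hat = -np.log(pc / (1 - pc))        # crude logit-based difficulty estimate
    rmse = np.sqrt(np.mean((b_hat - b_true) ** 2))
    print(n, np.round(rmse, 3))
```

Even this rough estimator recovers the rank order of the difficulties at n = 1,000; at n = 100 the estimates are noticeably noisier, which is the motivation for bringing in auxiliary information.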
Saatcioglu, Fatima Munevver; Atar, Hakan Yavuz – International Journal of Assessment Tools in Education, 2022
This study aims to examine the effects of mixture item response theory (IRT) models on item parameter estimation and classification accuracy under different conditions. The manipulated variables of the simulation study are set as mixture IRT models (Rasch, 2PL, 3PL); sample size (600, 1000); the number of items (10, 30); the number of latent…
Descriptors: Accuracy, Classification, Item Response Theory, Programming Languages
Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…
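The observed-score family mentioned above can be sketched concretely: the Mantel-Haenszel procedure stratifies examinees by a matching score, forms a 2x2 (group x right/wrong) table in each stratum, and pools a common odds ratio across strata. The data below are simulated with a deliberate 0.5-logit disadvantage for the focal group; the matching score is a crude stand-in for an anchor score, and nothing here reproduces the report's analyses.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 4000
group = rng.integers(0, 2, size=n)                 # 0 = reference, 1 = focal
theta = rng.normal(0, 1, size=n)
match = np.clip(np.round(theta * 2 + 5), 0, 10).astype(int)  # matching strata

# Studied item is uniformly harder for the focal group (uniform DIF).
p = 1 / (1 + np.exp(-(theta - 0.5 * group)))
right = (rng.random(n) < p).astype(int)

num = den = 0.0
for k in np.unique(match):                         # 2x2 table per stratum
    s = match == k
    a = np.sum(s & (group == 0) & (right == 1))    # reference, right
    b = np.sum(s & (group == 0) & (right == 0))    # reference, wrong
    c = np.sum(s & (group == 1) & (right == 1))    # focal, right
    d = np.sum(s & (group == 1) & (right == 0))    # focal, wrong
    nk = a + b + c + d
    if nk:
        num += a * d / nk
        den += b * c / nk

alpha_mh = num / den                               # MH common odds ratio
mh_delta = -2.35 * np.log(alpha_mh)                # ETS delta scale
print(round(alpha_mh, 2), round(mh_delta, 2))
```

An odds ratio above 1 (negative MH delta) flags the item as favoring the reference group after matching, which is exactly what the simulated DIF should produce.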
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
Marcoulides, Katerina M. – Measurement: Interdisciplinary Research and Perspectives, 2018
This study examined the use of Bayesian analysis methods for the estimation of item parameters in a two-parameter logistic item response theory model. Using simulated data under various design conditions with both informative and non-informative priors, the study examined the parameter recovery of Bayesian analysis methods. Overall results showed that…
Descriptors: Bayesian Statistics, Item Response Theory, Probability, Difficulty Level
Albano, Anthony D.; Cai, Liuhan; Lease, Erin M.; McConnell, Scott R. – Journal of Educational Measurement, 2019
Studies have shown that item difficulty can vary significantly based on the context of an item within a test form. In particular, item position may be associated with practice and fatigue effects that influence item parameter estimation. The purpose of this research was to examine the relevance of item position specifically for assessments used in…
Descriptors: Test Items, Computer Assisted Testing, Item Analysis, Difficulty Level
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
Fitzpatrick, Joseph; Skorupski, William P. – Journal of Educational Measurement, 2016
The equating performance of two internal anchor test structures--miditests and minitests--is studied for four IRT equating methods using simulated data. Originally proposed by Sinharay and Holland, miditests are anchors that have the same mean difficulty as the overall test but less variance in item difficulties. Four popular IRT equating methods…
Descriptors: Difficulty Level, Test Items, Comparative Analysis, Test Construction
Matlock, Ki Lynn; Turner, Ronna – Educational and Psychological Measurement, 2016
When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…
Descriptors: Item Response Theory, Computation, Test Items, Difficulty Level
Kogar, Hakan – International Journal of Assessment Tools in Education, 2018
The aim of this simulation study was to determine the relationship between true latent scores and estimated latent scores by including various control variables and different statistical models. The study also aimed to compare the statistical models and to determine the effects of different distribution types, response formats, and sample sizes on latent…
Descriptors: Simulation, Context Effect, Computation, Statistical Analysis
Martin-Fernandez, Manuel; Revuelta, Javier – Psicologica: International Journal of Methodology and Experimental Psychology, 2017
This study compares the performance of two recently introduced estimation algorithms, the Metropolis-Hastings Robbins-Monro (MHRM) and the Hamiltonian MCMC (HMC), with two well-established algorithms in the psychometric literature, marginal maximum likelihood via the EM algorithm (MML-EM) and Markov chain Monte Carlo (MCMC), in the estimation of multidimensional…
Descriptors: Bayesian Statistics, Item Response Theory, Models, Comparative Analysis
Dardick, William R.; Mislevy, Robert J. – Educational and Psychological Measurement, 2016
A new variant of the iterative "data = fit + residual" data-analytical approach described by Mosteller and Tukey is proposed and implemented in the context of item response theory psychometric models. Posterior probabilities from a Bayesian mixture model of a Rasch item response theory model and an unscalable latent class are expressed…
Descriptors: Bayesian Statistics, Probability, Data Analysis, Item Response Theory
Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014
In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…
Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory
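Under the Rasch model the common-item design described above reduces to estimating a single scale shift from the anchor block: mean-mean equating sets the shift to the difference between the anchor items' mean difficulties on the two forms. The sketch below uses invented anchor difficulties with a known 0.5-logit shift; it illustrates the mechanics only, not the study's analyses.

```python
import numpy as np

# Hypothetical Rasch difficulties for the anchor items, calibrated separately
# on the new form (X) and the reference form (Y). Under the Rasch model the
# two scales differ only by a shift.
b_anchor_x = np.array([-0.8, -0.2, 0.3, 0.9])   # new-form scale
b_anchor_y = b_anchor_x + 0.5                    # reference scale, shifted

shift = b_anchor_y.mean() - b_anchor_x.mean()    # mean-mean equating constant
b_x_on_y = b_anchor_x + shift                    # place X items on the Y scale

print(round(shift, 2))    # → 0.5
```

Because only the means of the anchor difficulties enter the constant, the spread of anchor difficulty matters for the stability of the estimate rather than for the formula itself, which is why the equal-spread requirement discussed by Sinharay and Holland can be relaxed.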