Showing 1 to 15 of 18 results
Peer reviewed
Fellinghauer, Carolina; Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2023
This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person design. Practical implications of the simulation…
Descriptors: True Scores, Equated Scores, Test Items, Sample Size
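A note for orientation: the partial credit model referenced above gives, for an item with step difficulties delta_1..delta_m, category probabilities P(X = x | theta) proportional to exp(sum over j <= x of (theta - delta_j)). The minimal sketch below evaluates those probabilities; the ability and step values are illustrative only and are not taken from the study.

    import math

    def pcm_probs(theta, deltas):
        """Partial credit model category probabilities for one item.
        theta  : person ability
        deltas : step difficulties [delta_1, ..., delta_m]
        Returns [P(X = 0), ..., P(X = m)], which sum to 1."""
        numerators = [1.0]          # empty sum for category 0 -> exp(0)
        running = 0.0
        for d in deltas:
            running += theta - d    # cumulative sum of (theta - delta_j)
            numerators.append(math.exp(running))
        total = sum(numerators)
        return [n / total for n in numerators]

    # Illustrative values: a 3-category item with steps at -0.5 and 0.8.
    print(pcm_probs(theta=0.2, deltas=[-0.5, 0.8]))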
Peer reviewed
Yu, Albert; Douglas, Jeffrey A. – Journal of Educational and Behavioral Statistics, 2023
We propose a new item response theory growth model with item-specific learning parameters, or ISLP, and two variations of this model. In the ISLP model, either items or blocks of items have their own learning parameters. This model may be used to improve the efficiency of learning in a formative assessment. We show ways that the ISLP model's…
Descriptors: Item Response Theory, Learning, Markov Processes, Monte Carlo Methods
Derek Sauder – ProQuest LLC, 2020
The Rasch model is commonly used to calibrate multiple choice items. However, the sample sizes needed to estimate the Rasch model can be difficult to attain (e.g., consider a small testing company trying to pretest new items). With small sample sizes, auxiliary information besides the item responses may improve estimation of the item parameters…
Descriptors: Item Response Theory, Sample Size, Computation, Test Length
Peer reviewed
Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023
Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents, and the number of ability levels, this study aims to provide a closed formula for adaptive tests of medium difficulty (probability of solution p = 1/2) to determine the accuracy of the parameters for each item and in…
Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level
Peer reviewed
Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020
The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…
Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level
Peer reviewed
Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
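For readers unfamiliar with the observed-score family named above, the standardized proportion-correct DIF statistic is commonly computed as a weighted average of focal-versus-reference differences in proportion correct across observed-score levels. A minimal sketch, assuming focal-group counts are used as the standardization weights (all numbers invented):

    def std_p_dif(strata):
        """Standardized proportion-correct difference (STD P-DIF) for one item.
        strata: (focal_n, focal_p_correct, ref_p_correct) per observed-score level;
        focal-group counts serve as the standardization weights here."""
        num = sum(n_f * (p_f - p_r) for n_f, p_f, p_r in strata)
        den = sum(n_f for n_f, _, _ in strata)
        return num / den

    # Invented values for three score strata.
    print(std_p_dif([(50, 0.62, 0.68), (75, 0.74, 0.78), (40, 0.85, 0.86)]))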
Peer reviewed
Svetina, Dubravka; Levy, Roy – Journal of Experimental Education, 2016
This study investigated the effect of complex structure on dimensionality assessment in compensatory multidimensional item response models using DETECT- and NOHARM-based methods. The performance was evaluated via the accuracy of identifying the correct number of dimensions and the ability to accurately recover item groupings using a simple…
Descriptors: Item Response Theory, Accuracy, Correlation, Sample Size
Wu, Yi-Fang – ProQuest LLC, 2015
Item response theory (IRT) uses a family of statistical models for estimating stable characteristics of items and examinees and defining how these characteristics interact in describing item and test performance. With a focus on the three-parameter logistic IRT (Birnbaum, 1968; Lord, 1980) model, the current study examines the accuracy and…
Descriptors: Item Response Theory, Test Items, Accuracy, Computation
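The three-parameter logistic model cited above (Birnbaum, 1968) has the standard item response function P(theta) = c + (1 - c) / (1 + exp(-D a (theta - b))). A short sketch follows; the parameter values are illustrative only.

    import math

    def p_3pl(theta, a, b, c, D=1.0):
        """Three-parameter logistic item response function.
        a: discrimination, b: difficulty, c: pseudo-guessing lower asymptote,
        D: optional scaling constant (1.7 approximates the normal ogive)."""
        return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

    # Illustrative parameters only.
    print(p_3pl(theta=0.0, a=1.2, b=-0.3, c=0.2))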
Lee, Eunjung – ProQuest LLC, 2013
The purpose of this research was to compare the equating performance of various equating procedures for multidimensional tests. To examine these procedures, simulated data sets were generated under a multidimensional item response theory (MIRT) framework. Various equating procedures were examined, including…
Descriptors: Equated Scores, Tests, Comparative Analysis, Item Response Theory
Peer reviewed
Socha, Alan; DeMars, Christine E. – Educational and Psychological Measurement, 2013
Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
Descriptors: Sample Size, Test Length, Correlation, Test Format
Sunnassee, Devdass – ProQuest LLC, 2011
Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically…
Descriptors: Test Length, Test Format, Sample Size, Simulation
Fu, Qiong – ProQuest LLC, 2010
This research investigated how the accuracy of person ability and item difficulty parameter estimation varied across five IRT models with respect to the presence of guessing, targeting, and varied combinations of sample sizes and test lengths. The data were simulated with 50 replications under each of the 18 combined conditions. Five IRT models…
Descriptors: Item Response Theory, Guessing (Tests), Accuracy, Computation
Peer reviewed
Finch, Holmes – Applied Psychological Measurement, 2010
The accuracy of item parameter estimates in the multidimensional item response theory (MIRT) model context has not been researched in great detail. This study examines the ability of two confirmatory factor analysis models specifically for dichotomous data to properly estimate item parameters using common formulae for converting factor…
Descriptors: Item Response Theory, Computation, Factor Analysis, Models
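The conversion formulae referenced above are, in the unidimensional normal-ogive case, commonly stated as a = lambda / sqrt(1 - lambda^2) and b = tau / lambda (loading lambda, threshold tau). The sketch below assumes that form; the multidimensional versions examined in the study may differ.

    import math

    def fa_to_irt(loading, threshold):
        """Convert a normal-ogive factor loading/threshold pair to IRT
        discrimination and difficulty (unidimensional case):
            a = lambda / sqrt(1 - lambda**2),   b = tau / lambda"""
        a = loading / math.sqrt(1.0 - loading ** 2)
        b = threshold / loading
        return a, b

    # Illustrative values only.
    print(fa_to_irt(loading=0.7, threshold=0.35))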
Pommerich, Mary; And Others – 1995
The Mantel-Haenszel (MH) statistic for identifying differential item functioning (DIF) commonly conditions on the observed test score as a surrogate for conditioning on latent ability. When the comparison group distributions are not completely overlapping (i.e., are incongruent), the observed score represents different levels of latent ability…
Descriptors: Ability, Comparative Analysis, Difficulty Level, Item Bias
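As a concrete illustration of conditioning on the observed score, the sketch below pools 2 x 2 tables across score strata into the Mantel-Haenszel common odds ratio and the ETS delta-scale index MH D-DIF = -2.35 ln(alpha). All counts are invented.

    import math

    def mh_odds_ratio(strata):
        """Mantel-Haenszel common odds ratio for one studied item.
        strata: (ref_correct, ref_incorrect, focal_correct, focal_incorrect)
        tuples, one per observed-score level."""
        num = den = 0.0
        for r1, r0, f1, f0 in strata:
            t = r1 + r0 + f1 + f0
            if t == 0:
                continue
            num += r1 * f0 / t
            den += r0 * f1 / t
        return num / den

    # Invented counts for three score strata.
    alpha = mh_odds_ratio([(40, 10, 30, 20), (60, 15, 50, 25), (80, 5, 70, 10)])
    print(alpha, -2.35 * math.log(alpha))   # common odds ratio and MH D-DIF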
Seong, Tae-Je; And Others – 1997
This study was designed to compare the accuracy of three commonly used ability estimation procedures under the graded response model. The three methods, maximum likelihood (ML), expected a posteriori (EAP), and maximum a posteriori (MAP), were compared using a recovery study design for two sample sizes, two underlying ability distributions, and…
Descriptors: Ability, Comparative Analysis, Difficulty Level, Estimation (Mathematics)
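To make one of the three estimators concrete, here is a minimal sketch of expected a posteriori (EAP) ability estimation under the graded response model, using a fixed quadrature grid and a standard-normal prior; the item parameters and response pattern are illustrative only.

    import math

    def grm_category_prob(theta, a, bs, k):
        """P(X = k | theta) under the graded response model.
        a: discrimination, bs: ordered thresholds [b_1, ..., b_m], k in 0..m."""
        def p_star(j):                      # P(X >= j | theta)
            if j <= 0:
                return 1.0
            if j > len(bs):
                return 0.0
            return 1.0 / (1.0 + math.exp(-a * (theta - bs[j - 1])))
        return p_star(k) - p_star(k + 1)

    def eap_estimate(responses, items):
        """EAP ability estimate on a fixed grid with a standard-normal prior.
        responses: observed category per item; items: (a, bs) per item."""
        grid = [i / 10.0 for i in range(-40, 41)]          # -4.0 to 4.0
        num = den = 0.0
        for theta in grid:
            weight = math.exp(-0.5 * theta * theta)        # unnormalized prior
            for x, (a, bs) in zip(responses, items):
                weight *= grm_category_prob(theta, a, bs, x)
            num += theta * weight
            den += weight
        return num / den

    # Illustrative: two 4-category items, observed categories 2 and 1.
    items = [(1.3, [-1.0, 0.0, 1.2]), (0.9, [-0.5, 0.6, 1.5])]
    print(eap_estimate([2, 1], items))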