Showing all 11 results
Peer reviewed
Köhler, Carmen; Robitzsch, Alexander; Hartig, Johannes – Journal of Educational and Behavioral Statistics, 2020
Testing whether items fit the assumptions of an item response theory model is an important step in evaluating a test. In the literature, numerous item fit statistics exist, many of which show severe limitations. The current study investigates the root mean squared deviation (RMSD) item fit statistic, which is used for evaluating item fit in…
Descriptors: Test Items, Goodness of Fit, Statistics, Bias
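The RMSD statistic investigated above compares an item's observed response function against the model-implied one. A minimal sketch of how such a statistic can be computed, assuming a 2PL item characteristic curve, a discrete grid of ability points with normalized weights, and hypothetical observed proportions (these specifics are illustrative, not taken from the article):

```python
import numpy as np

def icc_2pl(theta, a, b):
    """Model-implied probability of a correct response under the 2PL."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def rmsd_item_fit(theta_points, weights, observed_p, a, b):
    """Weighted root mean squared deviation between the observed and
    model-implied item response functions (weights sum to 1)."""
    expected_p = icc_2pl(theta_points, a, b)
    return float(np.sqrt(np.sum(weights * (observed_p - expected_p) ** 2)))

# Hypothetical data: observed proportions deviate from the ICC by 0.01.
theta = np.linspace(-3, 3, 7)
w = np.exp(-0.5 * theta ** 2)
w /= w.sum()                                   # normal-like weights
obs = icc_2pl(theta, a=1.2, b=0.0) + 0.01      # small uniform misfit
fit = rmsd_item_fit(theta, w, obs, a=1.2, b=0.0)
```

A uniform deviation of 0.01 yields an RMSD of exactly 0.01 here, since the weights sum to one.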
Peer reviewed
Kang, Hyeon-Ah; Zheng, Yi; Chang, Hua-Hua – Journal of Educational and Behavioral Statistics, 2020
With the widespread use of computers in modern assessment, online calibration has become increasingly popular as a way of replenishing an item pool. The present study discusses online calibration strategies for a joint model of responses and response times. The study proposes likelihood inference methods for item parameter estimation and evaluates…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Response Theory, Reaction Time
Peer reviewed
Maeda, Hotaka; Zhang, Bo – International Journal of Testing, 2017
The omega (ω) statistic is reputed to be one of the best indices for detecting answer copying on multiple choice tests, but its performance relies on the accurate estimation of copier ability, which is challenging because responses from the copiers may have been contaminated. We propose an algorithm that aims to identify and delete the suspected…
Descriptors: Cheating, Test Items, Mathematics, Statistics
Kim, YoungKoung; DeCarlo, Lawrence T. – College Board, 2016
Because of concerns about test security, different test forms are typically used across different testing occasions. As a result, equating is necessary in order to get scores from the different test forms that can be used interchangeably. In order to assure the quality of equating, multiple equating methods are often examined. Various equity…
Descriptors: Equated Scores, Evaluation Methods, Sampling, Statistical Inference
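Most equating methods compared in studies like the one above build on the equipercentile idea: a score on form X is mapped to the form-Y score holding the same percentile rank. A minimal sketch with hypothetical score arrays (the function name and data are illustrative, not from the paper):

```python
import numpy as np

def equipercentile_equate(scores_x, scores_y, x_value):
    """Map a form-X score to the form-Y score with the same percentile rank."""
    px = np.mean(np.asarray(scores_x) <= x_value)   # percentile rank on form X
    return float(np.quantile(scores_y, px))          # matching form-Y quantile

# Hypothetical forms: form Y is uniformly 5 points harder-scored than form X,
# so an equated score should land roughly 5 points above the form-X score.
scores_x = np.arange(100)
scores_y = scores_x + 5
equated = equipercentile_equate(scores_x, scores_y, 50)
```

Real applications smooth the score distributions first (e.g., kernel or log-linear presmoothing); this sketch uses the raw empirical distributions.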
Peer reviewed
Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi – Educational and Psychological Measurement, 2014
When item parameter estimates are used to estimate the ability parameter in item response models, the standard error (SE) of the ability estimate must be corrected to reflect the error carried over from item calibration. For maximum likelihood (ML) ability estimates, a corrected asymptotic SE is available, but it requires a long test and the…
Descriptors: Sampling, Statistical Inference, Maximum Likelihood Statistics, Computation
Peer reviewed
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
Woodruff, David; Wu, Yi-Fang – ACT, Inc., 2012
The purpose of this paper is to illustrate alpha's robustness and usefulness, using actual and simulated educational test data. The sampling properties of alpha are compared with the sampling properties of several other reliability coefficients: Guttman's λ₂, λ₄, and λ₆; test-retest reliability;…
Descriptors: Sampling, Test Reliability, Item Response Theory, Statistical Inference
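Coefficient alpha itself has a simple closed form: α = k/(k−1) · (1 − Σ item variances / variance of total scores), for k items. A minimal sketch (the function name and data matrix are illustrative):

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Coefficient alpha from an (examinees x items) score matrix."""
    x = np.asarray(item_scores, dtype=float)
    k = x.shape[1]
    sum_item_vars = x.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = x.sum(axis=1).var(ddof=1)          # variance of total scores
    return k / (k - 1) * (1.0 - sum_item_vars / total_var)

# Hypothetical perfectly parallel items: every item gives the same score,
# so alpha should equal 1.
parallel = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(parallel)
```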
Peer reviewed
Skaggs, Gary; Wilkins, Jesse L. M.; Hein, Serge F. – International Journal of Testing, 2016
The purpose of this study was to explore the degree of grain size of the attributes and the sample sizes that can support accurate parameter recovery with the General Diagnostic Model (GDM) for a large-scale international assessment. In this resampling study, bootstrap samples were obtained from the 2003 Grade 8 TIMSS in Mathematics at varying…
Descriptors: Achievement Tests, Foreign Countries, Elementary Secondary Education, Science Achievement
Peer reviewed
Gu, Fei; Skorupski, William P.; Hoyle, Larry; Kingston, Neal M. – Applied Psychological Measurement, 2011
Ramsay-curve item response theory (RC-IRT) is a nonparametric procedure that estimates the latent trait using splines, and no distributional assumption about the latent trait is required. For item parameters of the two-parameter logistic (2-PL), three-parameter logistic (3-PL), and polytomous IRT models, RC-IRT can provide more accurate estimates…
Descriptors: Intervals, Item Response Theory, Models, Evaluation Methods
Peer reviewed
Sueiro, Manuel J.; Abad, Francisco J. – Educational and Psychological Measurement, 2011
The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…
Descriptors: Goodness of Fit, Item Response Theory, Nonparametric Statistics, Probability
Choi, Sae Il – ProQuest LLC, 2009
This study used simulation (a) to compare the kernel equating method to traditional equipercentile equating methods under the equivalent-groups (EG) design and the nonequivalent-groups with anchor test (NEAT) design and (b) to apply the parametric bootstrap method for estimating standard errors of equating. A two-parameter logistic item response…
Descriptors: Item Response Theory, Comparative Analysis, Sampling, Statistical Inference
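The parametric bootstrap applied above estimates a standard error by simulating data from a fitted model and re-computing the statistic on each replicate. A minimal sketch using a sample mean in place of an equating function (all names, parameters, and the data-generating model here are illustrative, not from the dissertation):

```python
import numpy as np

rng = np.random.default_rng(0)

def parametric_bootstrap_se(estimator, simulate, n_boot=200):
    """Parametric bootstrap SE: repeatedly simulate data from the fitted
    model, re-estimate the statistic, and take the standard deviation
    of the replicate estimates."""
    estimates = [estimator(simulate()) for _ in range(n_boot)]
    return float(np.std(estimates, ddof=1))

# Hypothetical fitted model: N(0, 1) with n = 100 examinees per replicate.
# The bootstrap SE of the mean should be close to 1/sqrt(100) = 0.1.
se = parametric_bootstrap_se(np.mean, lambda: rng.normal(0.0, 1.0, size=100))
```

In an equating application, `simulate` would generate item responses from the fitted IRT model and `estimator` would re-run the equating procedure.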