Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 0 |
Since 2006 (last 20 years) | 10 |
Descriptor
Classification | 11 |
Computation | 11 |
Test Length | 11 |
Item Response Theory | 10 |
Simulation | 7 |
Accuracy | 6 |
Statistical Analysis | 5 |
Test Items | 5 |
Bayesian Statistics | 4 |
Cutting Scores | 4 |
Adaptive Testing | 3 |
More ▼ |
Author
Cheng, Ying | 2 |
Lathrop, Quinn N. | 2 |
Liu, Chen-Wei | 2 |
Wang, Wen-Chung | 2 |
Deng, Nina | 1 |
Finkelman, Matthew David | 1 |
Glas, Cees A. W. | 1 |
Hao, Shiqi | 1 |
Hendrawan, Irene | 1 |
Md Desa, Zairul Nor Deana | 1 |
Meijer, Rob R. | 1 |
More ▼ |
Publication Type
Journal Articles | 9 |
Reports - Research | 5 |
Reports - Evaluative | 4 |
Dissertations/Theses -… | 2 |
Education Level
High Schools | 1 |
Secondary Education | 1 |
Audience
Location
Michigan | 1 |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Lathrop, Quinn N.; Cheng, Ying – Journal of Educational Measurement, 2014
When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…
Descriptors: Cutting Scores, Classification, Computation, Nonparametric Statistics
Lathrop, Quinn N.; Cheng, Ying – Applied Psychological Measurement, 2013
Within the framework of item response theory (IRT), there are two recent lines of work on the estimation of classification accuracy (CA) rate. One approach estimates CA when decisions are made based on total sum scores, the other based on latent trait estimates. The former is referred to as the Lee approach, and the latter, the Rudner approach,…
Descriptors: Item Response Theory, Accuracy, Classification, Computation
Svetina, Dubravka – Educational and Psychological Measurement, 2013
The purpose of this study was to investigate the effect of complex structure on dimensionality assessment in noncompensatory multidimensional item response models using dimensionality assessment procedures based on DETECT (dimensionality evaluation to enumerate contributing traits) and NOHARM (normal ogive harmonic analysis robust method). Five…
Descriptors: Item Response Theory, Statistical Analysis, Computation, Test Length
Wang, Wen-Chung; Liu, Chen-Wei; Wu, Shiu-Lien – Applied Psychological Measurement, 2013
The random-threshold generalized unfolding model (RTGUM) was developed by treating the thresholds in the generalized unfolding model as random effects rather than fixed effects to account for the subjective nature of the selection of categories in Likert items. The parameters of the new model can be estimated with the JAGS (Just Another Gibbs…
Descriptors: Computer Assisted Testing, Adaptive Testing, Models, Bayesian Statistics
Md Desa, Zairul Nor Deana – ProQuest LLC, 2012
In recent years, there has been increasing interest in estimating and improving subscore reliability. In this study, the multidimensional item response theory (MIRT) and the bi-factor model were combined to estimate subscores, to obtain subscores reliability, and subscores classification. Both the compensatory and partially compensatory MIRT…
Descriptors: Item Response Theory, Computation, Reliability, Classification
Wyse, Adam E.; Hao, Shiqi – Applied Psychological Measurement, 2012
This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…
Descriptors: Item Response Theory, Classification, Accuracy, Reliability
Paek, Insu; Wilson, Mark – Educational and Psychological Measurement, 2011
This study elaborates the Rasch differential item functioning (DIF) model formulation under the marginal maximum likelihood estimation context. Also, the Rasch DIF model performance was examined and compared with the Mantel-Haenszel (MH) procedure in small sample and short test length conditions through simulations. The theoretically known…
Descriptors: Test Bias, Test Length, Statistical Inference, Geometric Concepts
Wang, Wen-Chung; Liu, Chen-Wei – Educational and Psychological Measurement, 2011
The generalized graded unfolding model (GGUM) has been recently developed to describe item responses to Likert items (agree-disagree) in attitude measurement. In this study, the authors (a) developed two item selection methods in computerized classification testing under the GGUM, the current estimate/ability confidence interval method and the cut…
Descriptors: Computer Assisted Testing, Adaptive Testing, Classification, Item Response Theory
Deng, Nina – ProQuest LLC, 2011
Three decision consistency and accuracy (DC/DA) methods, the Livingston and Lewis (LL) method, LEE method, and the Hambleton and Han (HH) method, were evaluated. The purposes of the study were: (1) to evaluate the accuracy and robustness of these methods, especially when their assumptions were not well satisfied, (2) to investigate the "true"…
Descriptors: Item Response Theory, Test Theory, Computation, Classification
Finkelman, Matthew David – Applied Psychological Measurement, 2010
In sequential mastery testing (SMT), assessment via computer is used to classify examinees into one of two mutually exclusive categories. Unlike paper-and-pencil tests, SMT has the capability to use variable-length stopping rules. One approach to shortening variable-length tests is stochastic curtailment, which halts examination if the probability…
Descriptors: Mastery Tests, Computer Assisted Testing, Adaptive Testing, Test Length
Hendrawan, Irene; Glas, Cees A. W.; Meijer, Rob R. – Applied Psychological Measurement, 2005
The effect of person misfit to an item response theory model on a mastery/nonmastery decision was investigated. Furthermore, it was investigated whether the classification precision can be improved by identifying misfitting respondents using person-fit statistics. A simulation study was conducted to investigate the probability of a correct…
Descriptors: Probability, Statistics, Test Length, Simulation