Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 1
Since 2006 (last 20 years): 10
Descriptor
Classification: 12
Simulation: 12
Test Length: 12
Item Response Theory: 8
Computation: 7
Accuracy: 5
Test Items: 5
Bayesian Statistics: 4
Computer Assisted Testing: 4
Cutting Scores: 4
Probability: 4
Source
Applied Psychological…: 5
Educational and Psychological…: 3
Education Sciences: 1
Journal of Educational…: 1
ProQuest LLC: 1
Author
Cheng, Ying: 3
Lathrop, Quinn N.: 2
Batinic, Bernad: 1
Cui, Ying: 1
Deng, Nina: 1
Diao, Qi: 1
Finkelman, Matthew David: 1
Glas, Cees A. W.: 1
Gnambs, Timo: 1
Hendrawan, Irene: 1
Li, Ying: 1
Publication Type
Journal Articles: 10
Reports - Research: 8
Reports - Evaluative: 3
Dissertations/Theses -…: 1
Mousavi, Amin; Cui, Ying – Education Sciences, 2020
Often, important decisions regarding accountability and placement of students in performance categories are made on the basis of test scores; therefore, it is important to evaluate the validity of the inferences derived from test results. One of the threats to the validity of such inferences is aberrant responding. Several…
Descriptors: Student Evaluation, Educational Testing, Psychological Testing, Item Response Theory
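As a concrete illustration of flagging aberrant responding, the following is a minimal sketch (not the specific method evaluated in the entry above) of the standardized l_z person-fit statistic for dichotomous responses under a 2PL model; the item parameters and response pattern are invented for the example.

    # Illustrative sketch only: l_z person-fit statistic for a 2PL model.
    import numpy as np

    def lz_statistic(responses, theta, a, b):
        # responses: 0/1 item scores; theta: ability estimate; a, b: 2PL item parameters
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))           # P(correct) under the 2PL
        l0 = np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
        e_l0 = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
        v_l0 = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
        return (l0 - e_l0) / np.sqrt(v_l0)                   # large negative values suggest misfit

    # Example: a pattern that misses the easiest items but answers the hardest correctly
    resp = np.array([0, 0, 1, 1, 1])
    a = np.ones(5)
    b = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
    print(lz_statistic(resp, theta=0.0, a=a, b=b))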
Lathrop, Quinn N.; Cheng, Ying – Journal of Educational Measurement, 2014
When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…
Descriptors: Cutting Scores, Classification, Computation, Nonparametric Statistics
Lathrop, Quinn N.; Cheng, Ying – Applied Psychological Measurement, 2013
Within the framework of item response theory (IRT), there are two recent lines of work on the estimation of classification accuracy (CA) rate. One approach estimates CA when decisions are made based on total sum scores; the other, when they are based on latent trait estimates. The former is referred to as the Lee approach, and the latter, the Rudner approach,…
Descriptors: Item Response Theory, Accuracy, Classification, Computation
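For readers unfamiliar with the two approaches, the Rudner-style index can be sketched as follows: assuming each ability estimate is approximately normal around the true ability with a known standard error, classification accuracy at a single cut is the average probability that the estimate falls on the correct side of the cut. The sketch below rests on those assumptions and is not the estimator compared in the entry above; the abilities, standard errors, and cut are invented.

    # Illustrative sketch only: Rudner-type classification accuracy at one cut score.
    import numpy as np
    from scipy.stats import norm

    def rudner_accuracy(thetas, ses, cut):
        # thetas: abilities treated as true values; ses: standard errors of their estimates
        below = thetas < cut
        p_below = norm.cdf((cut - thetas) / ses)      # P(estimate falls below the cut)
        p_correct = np.where(below, p_below, 1.0 - p_below)
        return p_correct.mean()                       # marginal classification accuracy

    print(rudner_accuracy(np.array([-0.5, 0.2, 1.1]), np.array([0.30, 0.30, 0.25]), cut=0.0))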
Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi – Applied Psychological Measurement, 2013
Variable-length computerized adaptive testing (VL-CAT) allows both items and test length to be "tailored" to examinees, thereby achieving the measurement goal (e.g., scoring precision or classification) with as few items as possible. Several popular test termination rules depend on the standard error of the ability estimate, which in turn depends…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Length, Ability
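To make the standard-error termination rule concrete, here is a minimal sketch assuming 2PL items: the test ends once the standard error of the ability estimate, 1/sqrt(test information), drops below a fixed threshold. The threshold and item parameters are illustrative, not those used in the study above.

    # Illustrative sketch only: SE-based stopping rule for variable-length CAT with 2PL items.
    import numpy as np

    def item_information(theta, a, b):
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        return a ** 2 * p * (1 - p)                   # Fisher information of a 2PL item

    def should_stop(theta_hat, a_admin, b_admin, se_threshold=0.3):
        info = np.sum(item_information(theta_hat, a_admin, b_admin))
        return 1.0 / np.sqrt(info) <= se_threshold    # stop once SE(theta_hat) is small enough

    a = np.array([1.2, 0.9, 1.5, 1.1])                # discriminations of administered items
    b = np.array([-0.3, 0.4, 0.0, 0.8])               # difficulties of administered items
    print(should_stop(0.1, a, b))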
Wang, Wen-Chung; Liu, Chen-Wei; Wu, Shiu-Lien – Applied Psychological Measurement, 2013
The random-threshold generalized unfolding model (RTGUM) was developed by treating the thresholds in the generalized unfolding model as random effects rather than fixed effects to account for the subjective nature of the selection of categories in Likert items. The parameters of the new model can be estimated with the JAGS (Just Another Gibbs…
Descriptors: Computer Assisted Testing, Adaptive Testing, Models, Bayesian Statistics
Li, Ying; Rupp, Andre A. – Educational and Psychological Measurement, 2011
This study investigated the Type I error rate and power of the multivariate extension of the S-χ² statistic using unidimensional and multidimensional item response theory (UIRT and MIRT, respectively) models as well as full-information bifactor (FI-bifactor) models through simulation. Manipulated factors included test length, sample…
Descriptors: Test Length, Item Response Theory, Statistical Analysis, Error Patterns
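For reference, the unidimensional statistic that the study extends is usually attributed to Orlando and Thissen; a hedged reconstruction of its standard form is given below (the multivariate extension examined in the entry above is not reproduced here).

    % Reference sketch only (Orlando & Thissen, 2000); the article's multivariate extension differs.
    S\text{-}X^2_i = \sum_{k=1}^{n-1} N_k \,\frac{(O_{ik} - E_{ik})^2}{E_{ik}\,(1 - E_{ik})}

Here k indexes summed-score groups, N_k is the number of examinees in group k, O_ik is the observed proportion answering item i correctly in that group, and E_ik is the model-implied proportion.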
Gnambs, Timo; Batinic, Bernad – Educational and Psychological Measurement, 2011
Computer-adaptive classification tests focus on classifying respondents in different proficiency groups (e.g., for pass/fail decisions). To date, adaptive classification testing has been dominated by research on dichotomous response formats and classifications in two groups. This article extends this line of research to polytomous classification…
Descriptors: Test Length, Computer Assisted Testing, Classification, Test Items
Paek, Insu; Wilson, Mark – Educational and Psychological Measurement, 2011
This study elaborates the Rasch differential item functioning (DIF) model formulation under the marginal maximum likelihood estimation context. Also, the Rasch DIF model performance was examined and compared with the Mantel-Haenszel (MH) procedure in small sample and short test length conditions through simulations. The theoretically known…
Descriptors: Test Bias, Test Length, Statistical Inference, Geometric Concepts
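As context for the comparison procedure, the Mantel-Haenszel DIF statistic pools 2x2 (group by correct/incorrect) tables across total-score strata into a common odds ratio, conventionally rescaled by -2.35 to the ETS delta metric. The counts below are invented, and this sketch is not the simulation design of the entry above.

    # Illustrative sketch only: Mantel-Haenszel common odds ratio and MH D-DIF.
    import numpy as np

    def mantel_haenszel_ddif(tables):
        # tables: (A, B, C, D) per score stratum; A/B = reference correct/incorrect,
        # C/D = focal correct/incorrect
        num = sum(A * D / (A + B + C + D) for A, B, C, D in tables)
        den = sum(B * C / (A + B + C + D) for A, B, C, D in tables)
        alpha_mh = num / den                          # common odds ratio across strata
        return -2.35 * np.log(alpha_mh)               # MH D-DIF on the ETS delta scale

    strata = [(40, 10, 30, 20), (35, 15, 28, 22), (25, 25, 20, 30)]
    print(mantel_haenszel_ddif(strata))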
Deng, Nina – ProQuest LLC, 2011
Three decision consistency and accuracy (DC/DA) methods, the Livingston and Lewis (LL) method, the Lee method, and the Hambleton and Han (HH) method, were evaluated. The purposes of the study were: (1) to evaluate the accuracy and robustness of these methods, especially when their assumptions were not well satisfied, (2) to investigate the "true"…
Descriptors: Item Response Theory, Test Theory, Computation, Classification
Finkelman, Matthew David – Applied Psychological Measurement, 2010
In sequential mastery testing (SMT), assessment via computer is used to classify examinees into one of two mutually exclusive categories. Unlike paper-and-pencil tests, SMT has the capability to use variable-length stopping rules. One approach to shortening variable-length tests is stochastic curtailment, which halts examination if the probability…
Descriptors: Mastery Tests, Computer Assisted Testing, Adaptive Testing, Test Length
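The idea of stochastic curtailment can be illustrated with a simple number-correct mastery rule: testing stops as soon as the eventual pass/fail decision is already determined, or is very unlikely to change given an assumed per-item success probability. This is a minimal sketch of the idea, not the article's procedure; the cut score, success probability, and threshold are invented.

    # Illustrative sketch only: stochastic curtailment for a number-correct mastery cut.
    from scipy.stats import binom

    def curtail(correct_so_far, items_remaining, cut, p_success=0.5, gamma=0.95):
        # Returns 'pass', 'fail', or 'continue' under a number-correct mastery rule
        needed = cut - correct_so_far
        if needed <= 0:
            return "pass"                             # cut already reached
        if needed > items_remaining:
            return "fail"                             # cut can no longer be reached
        p_reach = binom.sf(needed - 1, items_remaining, p_success)   # P(at least `needed` more correct)
        if p_reach >= gamma:
            return "pass"                             # final decision very unlikely to change
        if 1.0 - p_reach >= gamma:
            return "fail"
        return "continue"

    print(curtail(correct_so_far=14, items_remaining=5, cut=20))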
Hendrawan, Irene; Glas, Cees A. W.; Meijer, Rob R. – Applied Psychological Measurement, 2005
The effect of person misfit to an item response theory model on a mastery/nonmastery decision was investigated. Furthermore, it was investigated whether the classification precision can be improved by identifying misfitting respondents using person-fit statistics. A simulation study was conducted to investigate the probability of a correct…
Descriptors: Probability, Statistics, Test Length, Simulation
Steinheiser, Frederick H., Jr. – 1976
A computer simulation of Bayes' Theorem was conducted in order to determine the probability that an examinee was a master conditional upon his test score. The inputs were: number of mastery states assumed, test length, prior expectation of masters in the examinee population, and conditional probability of a master getting a randomly selected test…
Descriptors: Bayesian Statistics, Classification, Computer Programs, Criterion Referenced Tests
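The computation described can be illustrated with Bayes' theorem for two mastery states and binomial score likelihoods; the prior, test length, and per-item probabilities below are invented inputs, not those of the report.

    # Illustrative sketch only: posterior probability of mastery given a number-correct score.
    from scipy.stats import binom

    def p_master_given_score(score, n_items, prior_master, p_item_master, p_item_nonmaster):
        like_m = binom.pmf(score, n_items, p_item_master)        # P(score | master)
        like_n = binom.pmf(score, n_items, p_item_nonmaster)     # P(score | nonmaster)
        num = like_m * prior_master
        return num / (num + like_n * (1.0 - prior_master))       # Bayes' theorem

    print(p_master_given_score(score=16, n_items=20, prior_master=0.6,
                               p_item_master=0.85, p_item_nonmaster=0.55))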