Sen, Sedat; Cohen, Allan S. – Educational and Psychological Measurement, 2024
A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's…
Descriptors: Goodness of Fit, Item Response Theory, Sample Size, Classification
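The fit indices named in this abstract are simple functions of the maximized log-likelihood, the number of estimated parameters, and the sample size. A minimal sketch of the standard formulas, using hypothetical log-likelihoods and parameter counts for one-, two-, and three-class solutions (the indices cut off in the truncated list are omitted):

```python
import math

def information_criteria(log_lik, n_params, n_obs):
    """Standard fit indices used to pick the number of latent classes.

    log_lik  -- maximized log-likelihood of the fitted mixture IRT model
    n_params -- number of freely estimated parameters
    n_obs    -- number of examinees
    """
    aic = -2 * log_lik + 2 * n_params
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    bic = -2 * log_lik + n_params * math.log(n_obs)
    caic = -2 * log_lik + n_params * (math.log(n_obs) + 1)
    return {"AIC": aic, "AICc": aicc, "BIC": bic, "CAIC": caic}

# Hypothetical log-likelihoods and parameter counts for 1-, 2-, and 3-class
# solutions fitted to 1,000 examinees (illustration only):
solutions = {1: (-5210.4, 20), 2: (-5105.7, 41), 3: (-5098.2, 62)}
fits = {c: information_criteria(ll, k, 1000) for c, (ll, k) in solutions.items()}
best = {index: min(fits, key=lambda c: fits[c][index])
        for index in ("AIC", "AICc", "BIC", "CAIC")}
print(best)  # the class count each index selects (smaller is better)
```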
Liu, Ren; Huggins-Manley, Anne Corinne; Bradshaw, Laine – Educational and Psychological Measurement, 2017
There is an increasing demand for assessments that can provide more fine-grained information about examinees. In response to this demand, diagnostic measurement provides students with feedback on their strengths and weaknesses on specific skills by classifying them into mastery or nonmastery attribute categories. These attributes often form a…
Descriptors: Matrices, Classification, Accuracy, Diagnostic Tests
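Attribute-level classification of this kind is typically organized through a Q-matrix that maps items to the attributes they require. A minimal sketch, assuming a hypothetical Q-matrix and a conjunctive (DINA-style) ideal-response rule rather than the specific model fitted in the study:

```python
import numpy as np

# Hypothetical Q-matrix: rows are items, columns are attributes.
# A 1 means the item requires that attribute.
Q = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
])

def ideal_response(alpha, Q):
    """Conjunctive (DINA-style) ideal response: an item is answered
    correctly only if every attribute it requires has been mastered."""
    alpha = np.asarray(alpha)
    return (Q @ alpha == Q.sum(axis=1)).astype(int)

# An examinee who has mastered the first two attributes but not the third:
print(ideal_response([1, 1, 0], Q))  # -> [1 1 1 0 0]
```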
Lee, Jihyun; Paek, Insu – Journal of Psychoeducational Assessment, 2014
Likert-type rating scales are still the most widely used method for measuring psychoeducational constructs. The present study investigates the long-standing issue of identifying the optimal number of response categories. A special emphasis is given to categorical data, which were generated by the Item Response Theory (IRT) Graded-Response Modeling…
Descriptors: Likert Scales, Responses, Item Response Theory, Classification
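The Graded Response Model referenced here obtains the probability of each ordered Likert category from differences between cumulative two-parameter logistic curves. A minimal sketch with hypothetical item parameters; the generating values used in the study are not shown in the snippet:

```python
import numpy as np

def grm_category_probs(theta, a, thresholds):
    """Samejima's Graded Response Model: probability of each of the
    K = len(thresholds) + 1 ordered categories at ability theta.

    a is the item discrimination; thresholds must be increasing.
    """
    # Cumulative probabilities P(X >= k) for k = 1..K-1, padded with 1 and 0.
    cum = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(thresholds))))
    cum = np.concatenate(([1.0], cum, [0.0]))
    return cum[:-1] - cum[1:]

# Hypothetical 5-category Likert item (four thresholds):
print(grm_category_probs(theta=0.5, a=1.2, thresholds=[-1.5, -0.5, 0.5, 1.5]))
```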
Eggen, Theo J. H. M. – Educational Research and Evaluation, 2011
If classification in a limited number of categories is the purpose of testing, computerized adaptive tests (CATs) with algorithms based on sequential statistical testing perform better than estimation-based CATs (e.g., Eggen & Straetmans, 2000). In these computerized classification tests (CCTs), the Sequential Probability Ratio Test (SPRT) (Wald,…
Descriptors: Test Length, Adaptive Testing, Classification, Item Analysis
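Wald's SPRT, as used in these computerized classification tests, accumulates a log-likelihood ratio between an ability just below and just above the cut point and stops testing once the ratio crosses error-rate bounds. A minimal sketch under the Rasch model, with hypothetical item difficulties, responses, and cut values:

```python
import math

def sprt_decision(responses, difficulties, theta0, theta1, alpha=0.05, beta=0.05):
    """Wald's SPRT for a pass/fail decision under the Rasch model.

    theta0/theta1 bracket the cut score; responses are 0/1 item scores
    paired with item difficulties. Returns 'pass', 'fail', or 'continue'.
    """
    def p(theta, b):
        return 1.0 / (1.0 + math.exp(-(theta - b)))

    log_lr = sum(
        x * math.log(p(theta1, b) / p(theta0, b))
        + (1 - x) * math.log((1 - p(theta1, b)) / (1 - p(theta0, b)))
        for x, b in zip(responses, difficulties)
    )
    upper = math.log((1 - beta) / alpha)   # accept theta1: pass
    lower = math.log(beta / (1 - alpha))   # accept theta0: fail
    if log_lr >= upper:
        return "pass"
    if log_lr <= lower:
        return "fail"
    return "continue"  # administer another item

# Hypothetical short test around a cut score bracketed by theta = -0.5 and 0.5:
print(sprt_decision([1, 1, 0, 1, 1], [-0.5, 0.0, 0.3, -0.2, 0.1],
                    theta0=-0.5, theta1=0.5))
```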
Paek, Insu; Wilson, Mark – Educational and Psychological Measurement, 2011
This study elaborates the Rasch differential item functioning (DIF) model formulation under the marginal maximum likelihood estimation context. Also, the Rasch DIF model performance was examined and compared with the Mantel-Haenszel (MH) procedure in small sample and short test length conditions through simulations. The theoretically known…
Descriptors: Test Bias, Test Length, Statistical Inference, Geometric Concepts
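The Mantel-Haenszel procedure used as the comparison condition pools 2x2 correct/incorrect-by-group tables across total-score strata into a common odds ratio, usually reported on the ETS delta scale. A minimal sketch with hypothetical counts:

```python
import math

def mantel_haenszel_dif(strata):
    """Mantel-Haenszel DIF effect size for one studied item.

    strata is a list of (ref_correct, ref_wrong, focal_correct, focal_wrong)
    tuples, one per total-score level. Returns the common odds ratio and the
    ETS delta-scale value; negative delta means the item favors the reference group.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    odds_ratio = num / den
    delta = -2.35 * math.log(odds_ratio)
    return odds_ratio, delta

# Hypothetical counts at three score levels:
print(mantel_haenszel_dif([(40, 10, 30, 20), (50, 5, 45, 15), (60, 2, 55, 8)]))
```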
Deng, Nina – ProQuest LLC, 2011
Three decision consistency and accuracy (DC/DA) methods were evaluated: the Livingston and Lewis (LL) method, the Lee method, and the Hambleton and Han (HH) method. The purposes of the study were: (1) to evaluate the accuracy and robustness of these methods, especially when their assumptions were not well satisfied, (2) to investigate the "true"…
Descriptors: Item Response Theory, Test Theory, Computation, Classification
Kim, Jiseon – ProQuest LLC, 2010
Classification testing has been widely used to make categorical decisions by determining whether an examinee has a certain degree of ability required by established standards. As computer technologies have developed, classification testing has become more computerized. Several approaches have been proposed and investigated in the context of…
Descriptors: Test Length, Computer Assisted Testing, Classification, Probability
Reckase, Mark D. – 1981
This report describes a study comparing the classification results obtained from one-parameter and three-parameter logistic-based tailored testing procedures used in conjunction with Wald's sequential probability ratio test (SPRT). Eighty-eight college students were classified into four grade categories using achievement test results obtained…
Descriptors: Adaptive Testing, Classification, Comparative Analysis, Computer Assisted Testing
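The contrast between the two tailored-testing procedures comes down to the item response functions they assume: the one-parameter model fixes discrimination and has no lower asymptote, while the three-parameter model estimates both. A minimal sketch of the two curves with hypothetical item parameters:

```python
import math

def logistic_icc(theta, b, a=1.0, c=0.0):
    """Item characteristic curve for the 3PL model; with a=1 and c=0 it
    reduces to a one-parameter (Rasch-type) curve."""
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

# Hypothetical item (b = 0.2): the 3PL guessing parameter raises the lower
# asymptote, which mainly changes classifications for low-ability examinees.
for theta in (-2.0, 0.0, 2.0):
    print(theta,
          round(logistic_icc(theta, b=0.2), 3),                 # 1PL
          round(logistic_icc(theta, b=0.2, a=1.3, c=0.2), 3))   # 3PL
```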
Haladyna, Tom; Roid, Gale – 1980
The problems associated with misclassifying students when pass-fail decisions are based on test scores are discussed. One protection against misclassification is to set a confidence interval around the cutting score. Those whose scores fall above the interval are passed; those whose scores fall below the interval are failed; and those whose scores…
Descriptors: Bayesian Statistics, Classification, Comparative Analysis, Criterion Referenced Tests
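The confidence-interval safeguard described here widens the cut score by a multiple of the standard error of measurement so that borderline examinees are neither passed nor failed outright. A minimal sketch with hypothetical test statistics:

```python
def cut_score_decision(observed, cut, sem, z=1.645):
    """Pass/fail decision with a band of +/- z*SEM around the cut score;
    scores inside the band are flagged as too close to call.

    sem is the standard error of measurement, SEM = SD * sqrt(1 - reliability).
    """
    half_width = z * sem
    if observed > cut + half_width:
        return "pass"
    if observed < cut - half_width:
        return "fail"
    return "no decision (retest or gather more evidence)"

# Hypothetical test: SD = 8, reliability = 0.85 -> SEM of about 3.1
sem = 8 * (1 - 0.85) ** 0.5
print(cut_score_decision(observed=62, cut=60, sem=sem))
```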
Frick, Theodore W. – 1986
The sequential probability ratio test (SPRT), developed by Abraham Wald, is one statistical model available for making mastery decisions during computer-based criterion referenced tests. The predictive validity of the SPRT was empirically investigated with two different and relatively large item pools with heterogeneous item parameters. Graduate…
Descriptors: Achievement Tests, Adaptive Testing, Classification, Comparative Analysis