Sinharay, Sandip – Educational and Psychological Measurement, 2022
Administrative problems such as computer malfunction and power outage occasionally lead to missing item scores and hence to incomplete data on mastery tests such as the AP and U.S. Medical Licensing examinations. Investigators are often interested in estimating the probability of passing for examinees with incomplete data on mastery tests.…
Descriptors: Mastery Tests, Computer Assisted Testing, Probability, Test Wiseness
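A minimal sketch of one straightforward way to estimate a passing probability when some item scores are missing: score only the observed items under a 2PL model, form a posterior over ability on a grid, and integrate the posterior mass above the cut score. The item parameters, cut score, and response pattern below are invented for illustration, and this is not necessarily the procedure studied in the article.

```python
import numpy as np

def prob_pass_incomplete(responses, a, b, theta_cut, prior_sd=1.0):
    """Estimate P(pass) from a partially observed 2PL response pattern.

    responses: 0/1 item scores with np.nan marking missing items.
    a, b: 2PL discrimination and difficulty parameters (assumed known).
    theta_cut: ability cut score that defines passing.
    """
    theta = np.linspace(-4, 4, 401)                   # quadrature grid
    log_post = -0.5 * (theta / prior_sd) ** 2         # N(0, prior_sd) prior, up to a constant
    observed = ~np.isnan(responses)
    for x, ai, bi in zip(responses[observed], a[observed], b[observed]):
        p = 1.0 / (1.0 + np.exp(-ai * (theta - bi)))  # 2PL item response function
        log_post += x * np.log(p) + (1 - x) * np.log(1 - p)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    return post[theta >= theta_cut].sum()             # posterior mass above the cut

# Hypothetical example: 5 items, two scores missing
resp = np.array([1, np.nan, 1, 0, np.nan])
a = np.array([1.2, 0.8, 1.5, 1.0, 0.9])
b = np.array([-0.5, 0.0, 0.3, 0.8, 1.1])
print(prob_pass_incomplete(resp, a, b, theta_cut=0.0))
```

In practice this also assumes the missing scores can be ignored, for example that they are missing at random given the observed responses.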
Liu, Ren; Qian, Hong; Luo, Xiao; Woo, Ada – Educational and Psychological Measurement, 2018
Subscore reporting under item response theory models has always been a challenge, partly because the test length of each subdomain is too limited to precisely locate individuals on multiple continua. Diagnostic classification models (DCMs), providing a pass/fail decision and an associated probability of pass on each subdomain, are promising…
Descriptors: Classification, Probability, Pass Fail Grading, Scores
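As a rough illustration of how a diagnostic classification model turns item responses into a per-subdomain probability of mastery, the sketch below computes the posterior over all attribute profiles under a DINA model with a known Q-matrix and item parameters, then marginalizes to per-attribute mastery probabilities. All parameter values are invented, and the article's models and estimation procedure may differ.

```python
import itertools
import numpy as np

def dina_attribute_posteriors(x, Q, guess, slip, prior=None):
    """Posterior P(attribute k mastered | responses) under a DINA model.

    x: 0/1 item responses, shape (J,).
    Q: Q-matrix, shape (J, K); Q[j, k] = 1 if item j requires attribute k.
    guess, slip: per-item guessing and slipping parameters, shape (J,).
    """
    J, K = Q.shape
    profiles = np.array(list(itertools.product([0, 1], repeat=K)))  # all 2^K profiles
    if prior is None:
        prior = np.full(len(profiles), 1.0 / len(profiles))         # uniform prior
    # eta[c, j] = 1 if profile c has every attribute required by item j
    eta = np.all(profiles[:, None, :] >= Q[None, :, :], axis=2).astype(float)
    p_correct = eta * (1 - slip) + (1 - eta) * guess                # P(X_j = 1 | profile)
    lik = np.prod(np.where(x == 1, p_correct, 1 - p_correct), axis=1)
    post = prior * lik
    post /= post.sum()
    return post @ profiles                                          # per-attribute mastery prob.

# Hypothetical 4-item, 2-attribute example
Q = np.array([[1, 0], [0, 1], [1, 1], [1, 0]])
x = np.array([1, 0, 1, 1])
guess = np.full(4, 0.2)
slip = np.full(4, 0.1)
print(dina_attribute_posteriors(x, Q, guess, slip))
```

The marginal per-attribute probabilities play the role of the subdomain pass probabilities described in the abstract.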
Zhan, Peida; Jiao, Hong; Liao, Dandan; Li, Feiming – Journal of Educational and Behavioral Statistics, 2019
Providing diagnostic feedback about growth is crucial to formative decisions such as targeted remedial instruction or interventions. This article proposes a longitudinal higher-order diagnostic classification modeling approach for measuring growth. The new modeling approach is able to provide quantitative values of overall and individual growth…
Descriptors: Classification, Growth Models, Educational Diagnosis, Models
Diao, Hongyu; Sireci, Stephen G. – Journal of Applied Testing Technology, 2018
Whenever classification decisions are made on educational tests, such as pass/fail, or basic, proficient, or advanced, the consistency and accuracy of those decisions should be estimated and reported. Methods for estimating the reliability of classification decisions made on the basis of educational tests are well-established (e.g., Rudner, 2001;…
Descriptors: Classification, Item Response Theory, Accuracy, Reliability
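One of the established methods the abstract refers to, Rudner's IRT-based approach, can be sketched roughly as follows: treat each examinee's ability estimate as normally distributed around the true ability with its conditional standard error, compute the probability of landing in each reported category, and average the probability of correct classification over examinees. The cut scores and estimates below are invented, and the sketch simplifies details such as how the ability distribution is weighted.

```python
import numpy as np
from scipy.stats import norm

def expected_classification_accuracy(theta_hat, se, cuts):
    """Rudner-style expected classification accuracy under IRT (simplified sketch).

    theta_hat: ability estimates, shape (N,).
    se: conditional standard errors of the estimates, shape (N,).
    cuts: ascending cut scores on the theta scale, e.g. [-0.5, 0.8].
    """
    bounds = np.concatenate(([-np.inf], cuts, [np.inf]))
    # P(estimate falls in each category | true theta taken as theta_hat)
    cdf = norm.cdf((bounds[None, :] - theta_hat[:, None]) / se[:, None])
    cat_probs = np.diff(cdf, axis=1)                 # shape (N, n_categories)
    observed_cat = np.searchsorted(cuts, theta_hat)  # category each estimate sits in
    p_correct = cat_probs[np.arange(len(theta_hat)), observed_cat]
    return p_correct.mean()

# Hypothetical estimates for five examinees and two cut scores
theta_hat = np.array([-1.2, -0.3, 0.1, 0.9, 1.6])
se = np.array([0.35, 0.30, 0.28, 0.30, 0.40])
print(expected_classification_accuracy(theta_hat, se, cuts=[-0.5, 0.8]))
```

Classification consistency can be handled analogously by squaring and summing each examinee's category probabilities before averaging.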
Zehner, Fabian; Eichmann, Beate; Deribo, Tobias; Harrison, Scott; Bengs, Daniel; Andersen, Nico; Hahnel, Carolin – Journal of Educational Data Mining, 2021
The NAEP EDM Competition required participants to predict efficient test-taking behavior based on log data. This paper describes our top-down approach for engineering features by means of psychometric modeling, aiming at machine learning for the predictive classification task. For feature engineering, we employed, among others, the Log-Normal…
Descriptors: National Competency Tests, Engineering Education, Data Collection, Data Analysis
Choi, Youn-Jeng; Asilkalkan, Abdullah – Measurement: Interdisciplinary Research and Perspectives, 2019
About 45 R packages for analyzing data using item response theory (IRT) have been developed over the last decade. This article introduces these 45 packages with their descriptions and features. It also describes dichotomous and polytomous IRT models, advanced IRT models that can be fit with R packages, and R packages that contain applications…
Descriptors: Item Response Theory, Data Analysis, Computer Software, Test Bias
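For readers unfamiliar with the models these packages implement, the basic dichotomous case is the two-parameter logistic (2PL) model, in which the probability of a correct response to item $j$ depends on ability $\theta$, item discrimination $a_j$, and item difficulty $b_j$; polytomous models such as the graded response model generalize this to ordered response categories.

$$P(X_j = 1 \mid \theta) = \frac{1}{1 + \exp\{-a_j(\theta - b_j)\}}$$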
Lathrop, Quinn N. – Practical Assessment, Research & Evaluation, 2015
There are two main lines of research in estimating classification accuracy (CA) and classification consistency (CC) under Item Response Theory (IRT). The R package cacIRT provides computer implementations of both approaches in an accessible and unified framework. Even with available implementations, there remain decisions a researcher faces when…
Descriptors: Classification, Accuracy, Item Response Theory, Reliability
Chiu, Chia-Yi; Köhn, Hans-Friedrich; Wu, Huey-Min – International Journal of Testing, 2016
The Reduced Reparameterized Unified Model (Reduced RUM) is a diagnostic classification model for educational assessment that has received considerable attention among psychometricians. However, the computational options for researchers and practitioners who wish to use the Reduced RUM in their work, but do not feel comfortable writing their own…
Descriptors: Educational Diagnosis, Classification, Models, Educational Assessment
Templin, Jonathan; Hoffman, Lesa – Educational Measurement: Issues and Practice, 2013
Diagnostic classification models (aka cognitive or skills diagnosis models) have shown great promise for evaluating mastery on a multidimensional profile of skills as assessed through examinee responses, but continued development and application of these models has been hindered by a lack of readily available software. In this article we…
Descriptors: Classification, Models, Language Tests, English (Second Language)
Lee, Won-Chan – Journal of Educational Measurement, 2010
In this article, procedures are described for estimating single-administration classification consistency and accuracy indices for complex assessments using item response theory (IRT). This IRT approach was applied to real test data comprising dichotomous and polytomous items. Several different IRT model combinations were considered. Comparisons…
Descriptors: Classification, Item Response Theory, Comparative Analysis, Models
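A common building block in this line of work is the conditional distribution of the summed score given ability, which can be computed with the Lord-Wingersky recursion; classification consistency at a given ability is then the probability that two independent administrations land in the same score category. The sketch below does this for dichotomous 2PL items with invented parameters and a single cut score; it illustrates the general idea and is not necessarily the exact procedure in the article.

```python
import numpy as np

def summed_score_dist(theta, a, b):
    """Lord-Wingersky recursion: P(summed score = s | theta) for 2PL items."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))  # correct-response probabilities
    dist = np.array([1.0])                      # distribution for "zero items"
    for pj in p:
        new = np.zeros(len(dist) + 1)
        new[:-1] += dist * (1 - pj)             # item answered incorrectly
        new[1:] += dist * pj                    # item answered correctly
        dist = new
    return dist

def conditional_consistency(theta, a, b, cut_score):
    """P(two independent administrations agree on pass/fail | theta)."""
    dist = summed_score_dist(theta, a, b)
    p_pass = dist[cut_score:].sum()             # P(summed score >= cut | theta)
    return p_pass ** 2 + (1 - p_pass) ** 2

# Hypothetical 10-item test with a cut score of 6 correct
rng = np.random.default_rng(0)
a = rng.uniform(0.8, 1.6, 10)
b = rng.normal(0.0, 1.0, 10)
print(conditional_consistency(theta=0.5, a=a, b=b, cut_score=6))
```

Marginal consistency and accuracy indices would integrate these conditional values over an assumed ability distribution.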
Rudner, Lawrence M. – Practical Assessment, Research & Evaluation, 2009
This paper describes and evaluates the use of measurement decision theory (MDT) to classify examinees based on their item response patterns. The model has a simple framework that starts with the conditional probabilities of examinees in each category or mastery state responding correctly to each item. The presented evaluation investigates: (1) the…
Descriptors: Classification, Scoring, Item Response Theory, Measurement
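The abstract's description translates almost directly into a small amount of code: given prior probabilities for each mastery state and the conditional probability that a member of each state answers each item correctly, Bayes' rule yields the posterior probability of each state for an observed response pattern, and the examinee is assigned to the most probable state. The numbers below are invented for illustration.

```python
import numpy as np

def mdt_classify(x, prior, p_correct):
    """Measurement-decision-theory classification of one response pattern.

    x: 0/1 item responses, shape (J,).
    prior: prior probability of each mastery state, shape (M,).
    p_correct: P(item j correct | state m), shape (M, J).
    """
    lik = np.prod(np.where(x == 1, p_correct, 1 - p_correct), axis=1)
    post = prior * lik
    post /= post.sum()
    return post, int(np.argmax(post))  # posterior over states, most probable state

# Hypothetical 2-state (non-master / master), 4-item example
prior = np.array([0.5, 0.5])
p_correct = np.array([[0.3, 0.25, 0.4, 0.2],    # non-masters
                      [0.8, 0.85, 0.9, 0.75]])  # masters
post, state = mdt_classify(np.array([1, 1, 0, 1]), prior, p_correct)
print(post, state)
```

Because the posterior factors over items, the same computation can be updated one item at a time as responses arrive.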
Gelhorn, Heather; Hartman, Christie; Sakai, Joseph; Stallings, Michael; Young, Susan; Rhee, So Hyun; Corley, Robin; Hewitt, John; Hopfer, Christian; Crowley, Thomas D. – Journal of the American Academy of Child & Adolescent Psychiatry, 2008
Clinical interviews of approximately 5,587 adolescents revealed that DSM-IV diagnostic categories differed in the severity of alcohol use disorders (AUDs). However, substantial inconsistency and overlap were found in AUD severity across categories. The need for an alternative diagnostic algorithm that considers all…
Descriptors: Alcohol Abuse, Drinking, Disability Identification, Adolescents
Yi, Hyun Sook; Kim, Seonghoon; Brennan, Robert L. – Applied Psychological Measurement, 2007
Large-scale testing programs involving classification decisions typically have multiple forms available and conduct equating to ensure cut-score comparability across forms. A test developer might be interested in the extent to which an examinee who happens to take a particular form would have a consistent classification decision if he or she had…
Descriptors: Classification, Reliability, Indexes, Computation

Linacre, John M. – Journal of Applied Measurement, 2002
Suggests eight guidelines to aid the analyst in optimizing the manner in which rating scale categories cooperate to improve the usefulness of the resultant measures. Presents these guidelines in the context of Rasch analysis, but notes that they reflect aspects of rating scale functioning that impact all methods of analysis. (SLD)
Descriptors: Classification, Item Response Theory, Rating Scales
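For context, one standard Rasch formulation for rating scales, the Andrich rating scale model, gives the probability that person $n$ responds in category $k$ of item $i$ as a function of the person measure $\theta_n$, the item difficulty $\delta_i$, and shared category thresholds $\tau_j$; the guidelines concern how well the observed categories behave under this kind of model.

$$P(X_{ni}=k) = \frac{\exp\sum_{j=0}^{k}(\theta_n - \delta_i - \tau_j)}{\sum_{m=0}^{K}\exp\sum_{j=0}^{m}(\theta_n - \delta_i - \tau_j)}, \qquad \tau_0 \equiv 0$$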
Wolfe, Edward W.; Moulder, Bradley C.; Myford, Carol M. – 1999
This paper describes a class of rater effects that depict rater-by-time interactions. This class of rater effects is referred to as differential rater functioning over time (DRIFT). This article describes several types of DRIFT (primacy/recency, differential centrality/extremism, and practice/fatigue) and Rasch measurement procedures designed to…
Descriptors: Classification, Effect Size, Evaluators, Item Response Theory