ERIC - Search Results

Showing 1 to 15 of 40 results

Impact of Multidimensionality on Unidimensional IRT Linking and Equating Methods

Cho, Uk Hyun – ProQuest LLC, 2024

The present study investigates the influence of multidimensionality on linking and equating in unidimensional IRT. Two hypothetical multidimensional scenarios are explored under a nonequivalent group common-item equating design. The first scenario examines test forms designed to measure multiple constructs, while the second scenario examines a…

Descriptors: Item Response Theory, Classification, Correlation, Test Format

An Evaluation of Fit Indices Used in Model Selection of Dichotomous Mixture IRT Models

Peer reviewed

Sen, Sedat; Cohen, Allan S. – Educational and Psychological Measurement, 2024

A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's…

Descriptors: Goodness of Fit, Item Response Theory, Sample Size, Classification

Reliability and Validity Evidence of Diagnostic Methods: Comparison of Diagnostic Classification Models and Item Response Theory-Based Methods

Jang, Yoo Jeong – ProQuest LLC, 2022

Despite the increasing demand for diagnostic information, observed subscores have been often reported to lack adequate psychometric qualities such as reliability, distinctiveness, and validity. Therefore, several statistical techniques based on CTT and IRT frameworks have been proposed to improve the quality of subscores. More recently, DCM has…

Descriptors: Classification, Accuracy, Item Response Theory, Correlation

Diagnostic Classification Model for Forced-Choice Items and Noncognitive Tests

Peer reviewed

Huang, Hung-Yu – Educational and Psychological Measurement, 2023

The forced-choice (FC) item formats used for noncognitive tests typically develop a set of response options that measure different traits and instruct respondents to make judgments among these options in terms of their preference to control the response biases that are commonly observed in normative tests. Diagnostic classification models (DCMs)…

Descriptors: Test Items, Classification, Bayesian Statistics, Decision Making

Impact of DIF on General Factor Mean Comparisons for Bifactor, Ordinal Data

Peer reviewed

Liu, Yixing; Thompson, Marilyn S. – Journal of Experimental Education, 2022

A simulation study was conducted to explore the impact of differential item functioning (DIF) on general factor difference estimation for bifactor, ordinal data. Common analysis misspecifications in which the generated bifactor data with DIF were fitted using models with equality constraints on noninvariant item parameters were compared under data…

Descriptors: Comparative Analysis, Item Analysis, Sample Size, Error of Measurement

Comparison of Cognitive Diagnosis Models under Changing Conditions: DINA, RDINA, HODINA and HORDINA

Peer reviewed
PDF on ERIC

Kalkan, Ömür K.; Kelecioglu, Hülya; Basokçu, Tahsin O. – International Education Studies, 2018

The application of CDMs to fraction subtraction data revealed problems on the classification of examinees, latent class sizes, and the use of higher-order models. Additionally, selecting the most appropriate model assumes critical importance if there are several appropriate models available for the data. In the present study, DINA-RDINA and…

Descriptors: Comparative Analysis, Models, Item Response Theory, Multivariate Analysis

IRT Approaches to Modeling Scores on Mixed-Format Tests

Peer reviewed

Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020

This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…

Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests

Proficiency Classification and Violated Local Independence: An Examination of Pass/Fail Decision Accuracy under Competing Rasch Models

Peer reviewed

Hodge, Kari J.; Morgan, Grant B. – Journal of Applied Testing Technology, 2020

The purpose of this study was to examine the use of a misspecified calibration model and its impact on proficiency classification. Monte Carlo simulation methods were employed to compare competing models when the true structure of the data is known (i.e., testlet conditions). The conditions used in the design (e.g., number of items, testlet to…

Descriptors: Item Response Theory, Accuracy, Decision Making, Classification

Reliably Assessing Growth with Longitudinal Diagnostic Classification Models

Peer reviewed

Madison, Matthew J. – Educational Measurement: Issues and Practice, 2019

Recent advances have enabled diagnostic classification models (DCMs) to accommodate longitudinal data. These longitudinal DCMs were developed to study how examinees change, or transition, between different attribute mastery statuses over time. This study examines using longitudinal DCMs as an approach to assessing growth and serves three purposes:…

Descriptors: Longitudinal Studies, Item Response Theory, Psychometrics, Criterion Referenced Tests

Scoring Graphical Responses in TIMSS 2019 Using Artificial Neural Networks

Peer reviewed

von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023

Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We are comparing classification accuracy of convolutional and feed-forward approaches. Our…

Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education

Application of Bi-Factor MIRT and Higher-Order CDM Models to an In-House EFL Listening Test for Diagnostic Purposes

Peer reviewed

Min, Shangchao; Cai, Hongwen; He, Lianzhen – Language Assessment Quarterly, 2022

The present study examined the performance of the bi-factor multidimensional item response theory (MIRT) model and higher-order (HO) cognitive diagnostic models (CDM) in providing diagnostic information and general ability estimation simultaneously in a listening test. The data used were 1,611 examinees' item-level responses to an in-house EFL…

Descriptors: Listening Comprehension Tests, English (Second Language), Second Language Learning, Foreign Countries

Adjacent-Categories Mokken Models for Rater-Mediated Assessments

Peer reviewed

Wind, Stefanie A. – Educational and Psychological Measurement, 2017

Molenaar extended Mokken's original probabilistic-nonparametric scaling models for use with polytomous data. These polytomous extensions of Mokken's original scaling procedure have facilitated the use of Mokken scale analysis as an approach to exploring fundamental measurement properties across a variety of domains in which polytomous ratings are…

Descriptors: Nonparametric Statistics, Scaling, Models, Item Response Theory

A Practical Comparison of Selected Methods of Evaluating Multiple-Choice Options through Classical Item Analysis

Peer reviewed
PDF on ERIC

Malec, Wojciech; Krzeminska-Adamek, Malgorzata – Practical Assessment, Research & Evaluation, 2020

The main objective of the article is to compare several methods of evaluating multiple-choice options through classical item analysis. The methods subjected to examination include the tabulation of choice distribution, the interpretation of trace lines, the point-biserial correlation, the categorical analysis of trace lines, and the investigation…

Descriptors: Comparative Analysis, Evaluation Methods, Multiple Choice Tests, Item Analysis

Does Matching Quality Matter in Mode Comparison Studies?

Peer reviewed

Zeng, Ji; Yin, Ping; Shedden, Kerby A. – Educational and Psychological Measurement, 2015

This article provides a brief overview and comparison of three matching approaches in forming comparable groups for a study comparing test administration modes (i.e., computer-based tests [CBT] and paper-and-pencil tests [PPT]): (a) a propensity score matching approach proposed in this article, (b) the propensity score matching approach used by…

Descriptors: Comparative Analysis, Computer Assisted Testing, Probability, Classification

IRT-Based Classification Analysis of an English Language Reading Proficiency Subtest

Peer reviewed

Kaya, Elif; O'Grady, Stefan; Kalender, Ilker – Language Testing, 2022

Language proficiency testing serves an important function of classifying examinees into different categories of ability. However, misclassification is to some extent inevitable and may have important consequences for stakeholders. Recent research suggests that classification efficacy may be enhanced substantially using computerized adaptive…

Descriptors: Item Response Theory, Test Items, Language Tests, Classification
