ERIC - Search Results

Showing 1 to 15 of 30 results

Identifying Enemy Item Pairs Using Natural Language Processing

Peer reviewed

Becker, Kirk A.; Kao, Shu-chuan – Journal of Applied Testing Technology, 2022

Natural Language Processing (NLP) offers methods for understanding and quantifying the similarity between written documents. Within the testing industry these methods have been used for automatic item generation, automated scoring of text and speech, modeling item characteristics, automatic question answering, machine translation, and automated…

Descriptors: Item Banks, Natural Language Processing, Computer Assisted Testing, Scoring

Examining Psychometric Properties and Level Classification of the van Hiele Geometry Test Using CTT and CDM Frameworks

Peer reviewed

Chen, Yi-Hsin; Senk, Sharon L.; Thompson, Denisse R.; Voogt, Kevin – Journal of Educational Measurement, 2019

The van Hiele theory and van Hiele Geometry Test have been extensively used in mathematics assessments across countries. The purpose of this study is to use classical test theory (CTT) and cognitive diagnostic modeling (CDM) frameworks to examine psychometric properties of the van Hiele Geometry Test and to compare how various classification…

Descriptors: Geometry, Mathematics Tests, Test Theory, Psychometrics

A Cognitive Diagnosis Model for Continuous Response

Peer reviewed

Minchen, Nathan D.; de la Torre, Jimmy; Liu, Ying – Journal of Educational and Behavioral Statistics, 2017

Nondichotomous response models have been of greater interest in recent years due to the increasing use of different scoring methods and various performance measures. As an important alternative to dichotomous scoring, the use of continuous response formats has been found in the literature. To assess finer-grained skills or attributes and to…

Descriptors: Models, Psychometrics, Test Theory, Maximum Likelihood Statistics

Differentiating among High-Achieving Learners: A Comparison of Classical Test Theory and Item Response Theory on Above-Level Testing

LeBeau, Brandon; Assouline, Susan G.; Mahatmya, Duhita; Lupkowski-Shoplik, Ann – Gifted Child Quarterly, 2020

This study investigated the application of item response theory (IRT) to expand the range of ability estimates for gifted (hereinafter referred to as high-achieving) students' performance on an above-level test. Using a sample of fourth- to sixth-grade high-achieving students (N = 1,893), we conducted a study to compare estimates from two…

Descriptors: Item Response Theory, Test Theory, Academically Gifted, High Achievement

Analysis of Added Value of Subscores with Respect to Classification

Peer reviewed

Sinharay, Sandip – Journal of Educational Measurement, 2014

Brennan noted that users of test scores often want (indeed, demand) that subscores be reported, along with total test scores, for diagnostic purposes. Haberman suggested a method based on classical test theory (CTT) to determine if subscores have added value over the total score. One way to interpret the method is that a subscore has added value…

Descriptors: Scores, Test Theory, Classification, Cutting Scores

Screening Test Items for Differential Item Functioning

Peer reviewed

Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014

A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…

Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing

Classification Accuracy in Key Stage 2 National Curriculum Tests in England

Peer reviewed

He, Qingping; Hayes, Malcolm; Wiliam, Dylan – Research Papers in Education, 2013

The accuracy of the results of the national tests in English, mathematics and science taken by 11-year-olds in England has been a matter of much debate since their introduction in 1994, with estimates of the proportion of students incorrectly classified varying from 10 to 30%. Using live data from the 2009 and 2010 administration of the national…

Descriptors: Foreign Countries, National Curriculum, Accuracy, Classification

Evaluating IRT- and CTT-Based Methods of Estimating Classification Consistency and Accuracy Indices from Single Administrations

Deng, Nina – ProQuest LLC, 2011

Three decision consistency and accuracy (DC/DA) methods, the Livingston and Lewis (LL) method, LEE method, and the Hambleton and Han (HH) method, were evaluated. The purposes of the study were: (1) to evaluate the accuracy and robustness of these methods, especially when their assumptions were not well satisfied, (2) to investigate the "true"…

Descriptors: Item Response Theory, Test Theory, Computation, Classification

Correcting Fallacies in Validity, Reliability, and Classification

Peer reviewed

Sijtsma, Klaas – International Journal of Testing, 2009

This article reviews three topics from test theory that continue to raise discussion and controversy and capture test theorists' and constructors' interest. The first topic concerns the discussion of the methodology of investigating and establishing construct validity; the second topic concerns reliability and its misuse, alternative definitions…

Descriptors: Construct Validity, Reliability, Classification, Test Theory

Some Notes on the Reinvention of Latent Structure Models as Diagnostic Classification Models

Peer reviewed

von Davier, Matthias – Measurement: Interdisciplinary Research and Perspectives, 2009

In this commentary, the author points out a few issues, one being that there are models mislabeled as diagnostic, which deal with linear decompositions of item difficulties rather than estimating multidimensional skill variables. The author discusses the issue that there are many new names for essentially well-known models for multiple simultaneous…

Descriptors: Test Items, Probability, Models, Diagnostic Tests

Educational Measurement Issues and Implications of High Stakes Decision Making in Final Examinations in Secondary Education in the Netherlands

Peer reviewed

van Rijn, P. W.; Beguin, A. A.; Verstralen, H. H. F. M. – Assessment in Education: Principles, Policy & Practice, 2012

While measurement precision is relatively easy to establish for single tests and assessments, it is much more difficult to determine for decision making with multiple tests on different subjects. This latter is the situation in the system of final examinations for secondary education in the Netherlands and is used as an example in this paper. This…

Descriptors: Secondary Education, Tests, Foreign Countries, Decision Making

Diagnostic Classification Modeling: Opportunity for Identity

Peer reviewed

Hancock, Gregory R. – Measurement: Interdisciplinary Research and Perspectives, 2009

As Rupp and Templin (2008) stated directly, diagnostic classification methods "are confirmatory in nature." Methods, though, are neither inherently confirmatory nor exploratory. Diagnostic classification modeling, with its analytical and computational obstacles eventually yielding as a comprehensive and potent discipline emerges, will…

Descriptors: Structural Equation Models, Test Items, Models, Diagnostic Tests

Have Cognitive Diagnostic Models Delivered Their Goods? Some Substantial and Methodological Concerns

Peer reviewed

Wilhelm, Oliver; Robitzsch, Alexander – Measurement: Interdisciplinary Research and Perspectives, 2009

The paper by Rupp and Templin (2008) is an excellent work on the characteristics and features of cognitive diagnostic models (CDM). In this article, the authors comment on some substantial and methodological aspects of this focus paper. They organize their comments by going through issues associated with the terms "cognitive,"…

Descriptors: Research Methodology, Test Items, Models, Diagnostic Tests

Defending the Quality of Links between Scores from Different Tests and Exams

Peer reviewed

Cresswell, Mike – Measurement: Interdisciplinary Research and Perspectives, 2010

Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance."…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

Diagnostic Classification Models: Which One Should I Use?

Peer reviewed

Jiao, Hong – Measurement: Interdisciplinary Research and Perspectives, 2009

Diagnostic assessment is currently an active research area in educational measurement. Literature related to diagnostic modeling has been in existence for several decades, but a great deal of research has been conducted within the last decade or so, especially within the last five years. The author summarizes the key components in the application…

Descriptors: Educational Assessment, Literature Reviews, Test Items, Probability
