Showing all 7 results
Peer reviewed
de La Torre, Jimmy; Karelitz, Tzur M. – Journal of Educational Measurement, 2009
Compared to unidimensional item response models (IRMs), cognitive diagnostic models (CDMs) based on latent classes represent examinees' knowledge and item requirements using discrete structures. This study systematically examines the viability of retrofitting CDMs to IRM-based data with a linear attribute structure. The study utilizes a procedure…
Descriptors: Simulation, Item Response Theory, Psychometrics, Evaluation Methods
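
As a rough illustration of the latent-class structure the abstract refers to, the sketch below (Python, illustrative values only, not the authors' retrofitting procedure) enumerates the attribute profiles permitted under a linear attribute hierarchy and scores items with a DINA-style rule, assuming a hypothetical Q-matrix and hypothetical slip and guess parameters.

```python
# Illustrative sketch, not the authors' procedure: under a linear hierarchy
# a1 -> a2 -> a3, only "staircase" attribute profiles are permissible, and a
# DINA-style CDM treats an item as mastered only when all of its required
# attributes are present (Q-matrix, slip, and guess values are hypothetical).
import numpy as np

K = 3  # number of attributes

# Permissible latent classes under a linear hierarchy: K + 1 instead of 2^K.
linear_profiles = [tuple(int(j < k) for j in range(K)) for k in range(K + 1)]

# Hypothetical Q-matrix: rows are items, columns are required attributes.
Q = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 1, 1]])

def dina_prob(profile, q_row, slip=0.1, guess=0.2):
    """P(correct): 1 - slip if every required attribute is mastered, else guess."""
    mastered = all(a >= q for a, q in zip(profile, q_row))
    return 1 - slip if mastered else guess

for profile in linear_profiles:
    print(profile, [round(dina_prob(profile, q), 2) for q in Q])
```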
Peer reviewed
Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009
In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…
Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)
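
The specific drift indices are not given in the abstract; as a simple stand-in, the sketch below (Python, hypothetical ratings) tracks each rater's mean score per scoring day and fits a least-squares slope, so a persistently positive or negative slope flags a rater whose standards may be drifting over time.

```python
# Simple stand-in for rater-drift monitoring (not the article's indices):
# track each rater's mean score per scoring day and test for a time trend
# with a least-squares slope.
import numpy as np

# Hypothetical ratings[rater][day] = scores awarded on that day.
ratings = {
    "R1": {1: [3, 4, 3], 2: [3, 3, 4], 3: [4, 3, 3]},
    "R2": {1: [4, 4, 3], 2: [3, 3, 3], 3: [2, 3, 2]},  # drifting harsher
}

for rater, by_day in ratings.items():
    days = np.array(sorted(by_day))
    means = np.array([np.mean(by_day[d]) for d in days])
    slope = np.polyfit(days, means, 1)[0]  # change in mean score per day
    print(f"{rater}: daily means {means.round(2)}, slope {slope:+.2f}")
```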
Peer reviewed
Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009
Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…
Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel
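
For readers unfamiliar with the procedure, the basic Angoff computation is straightforward; the sketch below (Python, invented judgment values) averages each item's judged probabilities across judges and sums the item means to obtain a cut score. It does not reproduce the performance-data conditions examined in the article.

```python
# Basic Angoff computation with illustrative numbers: each judge estimates,
# per item, the probability that a minimally competent examinee answers
# correctly; the cut score is the sum of the item means across judges.
import numpy as np

# Hypothetical judgments: rows = judges, columns = items.
judgments = np.array([
    [0.6, 0.7, 0.4, 0.8],
    [0.5, 0.8, 0.5, 0.7],
    [0.7, 0.6, 0.4, 0.9],
])

item_means = judgments.mean(axis=0)  # consensus estimate per item
cut_score = item_means.sum()         # expected raw score at the standard
print(item_means.round(2), round(cut_score, 2))
```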
Peer reviewed
Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009
In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…
Descriptors: Test Length, Simulation, Correlation, Research Methodology
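
The abstract does not reproduce the formula; the sketch below (Python, hypothetical Q-matrix and response vector) computes a simplified hierarchy-consistency value in the same spirit: a correct answer to an item should be accompanied by correct answers to its prerequisite items, and the index is one minus twice the misfit rate, so it falls between -1.0 and 1.0.

```python
# Simplified hierarchy-consistency check in the spirit of the HCI (the
# published index is defined more precisely). Assumption: if an examinee
# answers an item correctly, items requiring a strict subset of its
# attributes should also be answered correctly; each violation is a misfit.
import numpy as np

# Hypothetical Q-matrix (items x attributes) and one response vector.
Q = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 1, 1]])
x = np.array([0, 1, 1])  # misses the easiest prerequisite item

misfits = comparisons = 0
for j in range(len(Q)):
    if x[j] != 1:
        continue
    for g in range(len(Q)):
        if g != j and np.all(Q[g] <= Q[j]) and np.any(Q[g] < Q[j]):
            comparisons += 1           # item g is a prerequisite of item j
            misfits += int(x[g] == 0)  # correct on j but wrong on g
hci = 1 - 2 * misfits / comparisons if comparisons else np.nan
print(misfits, comparisons, round(hci, 2))  # -> 2 3 -0.33 (misfitting vector)
```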
Peer reviewed
van der Linden, Wim J.; Breithaupt, Krista; Chuah, Siang Chee; Zhang, Yanwei – Journal of Educational Measurement, 2007
A potentially undesirable effect of multistage testing is differential speededness, which occurs when some test takers run out of time because they receive subtests with items that are more time intensive than others. This article shows how a probabilistic response-time model can be used for estimating differences in time intensities and speed…
Descriptors: Adaptive Testing, Evaluation Methods, Test Items, Reaction Time
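
As a rough sketch of the response-time idea (not the authors' estimation procedure), the Python example below assumes a lognormal-style model in which log response time is approximately the item's time intensity minus the person's speed, and recovers item time intensities with simple moment estimates from simulated data.

```python
# Rough sketch of a lognormal response-time model: log RT_ij ~ beta_i - tau_j,
# where beta_i is the item's time intensity and tau_j the person's speed.
# Moment estimates on simulated data; all parameter values are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 200, 5
beta = np.array([3.0, 3.2, 3.6, 3.1, 3.4])   # item time intensities
tau = rng.normal(0.0, 0.3, n_persons)        # person speed (higher = faster)
log_rt = beta[None, :] - tau[:, None] + rng.normal(0, 0.2, (n_persons, n_items))

tau_hat = -(log_rt - log_rt.mean(axis=0)).mean(axis=1)  # person speed estimates
beta_hat = (log_rt + tau_hat[:, None]).mean(axis=0)     # item time intensities
print(beta_hat.round(2))  # larger beta_hat = more time-intensive item
```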
Peer reviewed
Izard, John – Journal of Educational Measurement, 1995
"Tests in Print IV" is a two-volume comprehensive master index of commercially published separate test titles in print that are available for purchase and use. There are over 3,000 entries, the majority of which originated in the United States. It is an invaluable resource for English-speaking test users. (SLD)
Descriptors: English, Evaluation Methods, Literature Reviews, Measurement Techniques
Peer reviewed
Askegaard, Lewis D.; Umila, Benwardo V. – Journal of Educational Measurement, 1982
Multiple matrix sampling of items and examinees was applied to an 18-item rank-order instrument administered to a randomly assigned group and compared to the ordering and ranking of all items by control subjects. High correlations between ranks suggest that the methodology can viably reduce respondent effort on long rank-ordering tasks. (Author/CM)
Descriptors: Evaluation Methods, Item Sampling, Junior High Schools, Student Reaction
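
As a loose illustration of the design (simulated values, not the study's instrument or subjects), the sketch below has each respondent rank only a random subset of items, aggregates the mean within-subset ranks, and correlates the resulting ordering with the ordering implied by the full set of item values.

```python
# Illustrative simulation of multiple matrix sampling for a rank-order task
# (hypothetical data): each respondent ranks only a random subset of items;
# aggregated mean ranks are compared with the full ordering via Spearman rho.
import numpy as np
from scipy.stats import rankdata, spearmanr

rng = np.random.default_rng(1)
n_items, subset_size, n_respondents = 18, 6, 300
true_value = rng.normal(size=n_items)          # latent "preference" per item

rank_sums = np.zeros(n_items)
counts = np.zeros(n_items)
for _ in range(n_respondents):
    subset = rng.choice(n_items, subset_size, replace=False)
    noisy = true_value[subset] + rng.normal(0, 0.5, subset_size)
    rank_sums[subset] += rankdata(-noisy)      # 1 = most preferred in subset
    counts[subset] += 1

sampled_order = rankdata(rank_sums / counts)   # matrix-sampled ordering
full_order = rankdata(-true_value)             # ordering from ranking all items
rho, _ = spearmanr(sampled_order, full_order)
print(round(rho, 2))                           # high rho = orderings agree
```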