ERIC - Search Results

Showing all 8 results

Measuring the Uncertainty of Imputed Scores

Peer reviewed

Sinharay, Sandip – Journal of Educational Measurement, 2023

Technical difficulties and other unforeseen events occasionally lead to incomplete data on educational tests, which necessitates the reporting of imputed scores to some examinees. While there exist several approaches for reporting imputed scores, there is a lack of any guidance on the reporting of the uncertainty of imputed scores. In this paper,…

Descriptors: Evaluation Methods, Scores, Standardized Tests, Simulation

Model-Free CUSUM Methods for Person Fit

Peer reviewed

Armstrong, Ronald D.; Shi, Min – Journal of Educational Measurement, 2009

This article demonstrates the use of a new class of model-free cumulative sum (CUSUM) statistics to detect person fit given the responses to a linear test. The fundamental statistic being accumulated is the likelihood ratio of two probabilities. The detection performance of this CUSUM scheme is compared to other model-free person-fit statistics…

Descriptors: Probability, Simulation, Models, Psychometrics
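
Illustrative note (not from the article): the abstract above says the accumulated quantity is a likelihood ratio of two probabilities. A minimal sketch of a generic one-sided CUSUM over per-item log likelihood ratios is shown below. It is not the authors' model-free statistic; the function name `cusum_log_lr` and the per-item probabilities `p_null` and `p_alt` are hypothetical and chosen only for illustration.

```python
import math


def cusum_log_lr(responses, p_null, p_alt):
    """Generic upper CUSUM of per-item log likelihood ratios.

    responses: 0/1 item scores in administration order.
    p_null, p_alt: assumed (hypothetical) probabilities of a correct
        response to each item under the two competing hypotheses.
    Returns the CUSUM value after each item.
    """
    path = []
    s = 0.0
    for x, p0, p1 in zip(responses, p_null, p_alt):
        # Log likelihood ratio of the observed score under the two hypotheses.
        if x == 1:
            llr = math.log(p1 / p0)
        else:
            llr = math.log((1.0 - p1) / (1.0 - p0))
        # Accumulate, resetting to zero whenever the sum would go negative.
        s = max(0.0, s + llr)
        path.append(s)
    return path


if __name__ == "__main__":
    # Hypothetical example: three items, same probabilities for every item.
    print(cusum_log_lr([1, 1, 0], [0.5, 0.5, 0.5], [0.8, 0.8, 0.8]))
```

In a scheme of this kind, a response pattern would be flagged when the running value crosses a chosen threshold; how the threshold and the two probabilities are set is outside what the abstract states.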

Monitoring Rater Performance over Time: A Framework for Detecting Differential Accuracy and Differential Scale Category Use

Peer reviewed

Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009

In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…

Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)

Judges' Use of Examinee Performance Data in an Angoff Standard-Setting Exercise for a Medical Licensing Examination: An Experimental Study

Peer reviewed

Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009

Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…

Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel

The Hierarchy Consistency Index: Evaluating Person Fit for Cognitive Diagnostic Assessment

Peer reviewed

Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009

In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…

Descriptors: Test Length, Simulation, Correlation, Research Methodology

Seeking a Measure of General Educational Advancement: The Bentee

Peer reviewed

Page, Ellis B. – Journal of Educational Measurement, 1972

This paper discusses the underlying structure and development of the benefit-T-score (bentee) which attempts to measure overall educational benefit. (CK)

Descriptors: Accountability, Educational Benefits, Educational Diagnosis, Educational Testing

Models of Evaluation and Their Relation to Student Characteristics

Peer reviewed

Biggs, J. B.; Braun, P. H. – Journal of Educational Measurement, 1972

The union and disjunction models for combining individual test marks to yield final grade distributions are outlined and examined empirically in two educational psychology classes. (Authors/CB)

Descriptors: College Students, Educational Testing, Evaluation Methods, Factor Analysis

Zeroing in on the STEP Writing Test: What Does It Tell a Teacher?

Peer reviewed

Madaus, George F.; Rippey, Robert M. – Journal of Educational Measurement, 1966

The validity of the multiple-choice Sequential Tests of Educational Progress (STEP) Writing Test (1957) was tested by the University of Chicago Center for the Cooperative Study of Instruction. Seven criteria developed by the center to score essay assignments were used to determine the relationship between STEP and actual writing behavior. Of the…

Descriptors: Communication (Thought Transfer), Educational Testing, English Instruction, Evaluation Criteria