Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 0 |
Since 2006 (last 20 years) | 4 |
Descriptor
Educational Assessment | 4 |
Educational Testing | 4 |
Evaluation Methods | 4 |
Evaluation Research | 4 |
Measurement | 4 |
Evaluation Problems | 3 |
Psychometrics | 3 |
Testing Problems | 3 |
Scoring | 2 |
Simulation | 2 |
Student Evaluation | 2 |
More ▼ |
Source
Journal of Educational… | 4 |
Author
Armstrong, Ronald D. | 1 |
Baldwin, Su G. | 1 |
Clauser, Brian E. | 1 |
Cui, Ying | 1 |
Dillon, Gerard F. | 1 |
Leighton, Jacqueline P. | 1 |
Margolis, Melissa J. | 1 |
Mee, Janet | 1 |
Myford, Carol M. | 1 |
Shi, Min | 1 |
Wolfe, Edward W. | 1 |
More ▼ |
Publication Type
Journal Articles | 4 |
Reports - Research | 3 |
Reports - Evaluative | 1 |
Education Level
Elementary Secondary Education | 1 |
Secondary Education | 1 |
Audience
Location
Laws, Policies, & Programs
Assessments and Surveys
Advanced Placement… | 1 |
What Works Clearinghouse Rating
Armstrong, Ronald D.; Shi, Min – Journal of Educational Measurement, 2009
This article demonstrates the use of a new class of model-free cumulative sum (CUSUM) statistics to detect person fit given the responses to a linear test. The fundamental statistic being accumulated is the likelihood ratio of two probabilities. The detection performance of this CUSUM scheme is compared to other model-free person-fit statistics…
Descriptors: Probability, Simulation, Models, Psychometrics
Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009
In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…
Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)
Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009
Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…
Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel
Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009
In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…
Descriptors: Test Length, Simulation, Correlation, Research Methodology