Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 3 |
Since 2016 (last 10 years) | 9 |
Since 2006 (last 20 years) | 9 |
Descriptor
Source
Educational Measurement:… | 10 |
Author
Wyse, Adam E. | 3 |
Baldwin, Peter | 2 |
Babcock, Ben | 1 |
Clauser, Brian E. | 1 |
Cook, Robert | 1 |
Evans, Carla M. | 1 |
Jaeger, Richard M. | 1 |
Lewis, Daniel | 1 |
Lewis, Jennifer | 1 |
Lim, Hwanggyu | 1 |
Lyons, Susan | 1 |
More ▼ |
Publication Type
Journal Articles | 10 |
Reports - Research | 8 |
Reports - Evaluative | 2 |
Opinion Papers | 1 |
Education Level
Elementary Secondary Education | 1 |
Audience
Location
New Hampshire | 1 |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023
The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…
Descriptors: Item Response Theory, Standard Setting, Testing, Sampling
Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022
Setting cut scores on (MSTs) is difficult, particularly when the test spans several grade levels, and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…
Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis
Baldwin, Peter – Educational Measurement: Issues and Practice, 2021
In the Bookmark standard-setting procedure, panelists are instructed to consider what examinees know rather than what they might attain by guessing; however, because examinees sometimes do guess, the procedure includes a correction for guessing. Like other corrections for guessing, the Bookmark's correction assumes that examinees either know the…
Descriptors: Guessing (Tests), Student Evaluation, Evaluation Methods, Standard Setting (Scoring)
Baldwin, Peter; Margolis, Melissa J.; Clauser, Brian E.; Mee, Janet; Winward, Marcia – Educational Measurement: Issues and Practice, 2020
Evidence of the internal consistency of standard-setting judgments is a critical part of the validity argument for tests used to make classification decisions. The bookmark standard-setting procedure is a popular approach to establishing performance standards, but there is relatively little research that reflects on the internal consistency of the…
Descriptors: Standard Setting (Scoring), Probability, Cutting Scores, Evaluation Methods
Wyse, Adam E. – Educational Measurement: Issues and Practice, 2020
One commonly used compromise standard-setting method is the Beuk (1984) method. A key assumption of the Beuk method is that the emphasis given to the pass rate and the percent correct ratings should be proportional to the extent that the panelists agree on their ratings. However, whether the slope of Beuk line reflects the emphasis that panelists…
Descriptors: Standard Setting (Scoring), Cutting Scores, Weighted Scores, Evaluation Methods
Lewis, Daniel; Cook, Robert – Educational Measurement: Issues and Practice, 2020
In this paper we assert that the practice of principled assessment design renders traditional standard-setting methodology redundant at best and contradictory at worst. We describe the rationale for, and methodological details of, Embedded Standard Setting (ESS; previously, Engineered Cut Scores. Lewis, 2016), an approach to establish performance…
Descriptors: Standard Setting, Evaluation, Cutting Scores, Performance Based Assessment
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2017
This article provides an overview of the Hofstee standard-setting method and illustrates several situations where the Hofstee method will produce undefined cut scores. The situations where the cut scores will be undefined involve cases where the line segment derived from the Hofstee ratings does not intersect the score distribution curve based on…
Descriptors: Cutting Scores, Evaluation Methods, Standard Setting (Scoring), Comparative Analysis
Wyse, Adam E. – Educational Measurement: Issues and Practice, 2017
This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a priori (EAP), modal a priori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
Descriptors: Cutting Scores, Item Response Theory, Bayesian Statistics, Maximum Likelihood Statistics
Evans, Carla M.; Lyons, Susan – Educational Measurement: Issues and Practice, 2017
The purpose of this study was to test methods that strengthen the comparability claims about annual determinations of student proficiency in English language arts, math, and science (Grades 3-12) in the New Hampshire Performance Assessment of Competency Education (NH PACE) pilot project. First, we examined the literature in order to define…
Descriptors: Academic Achievement, Language Arts, Mathematics Achievement, Science Achievement

Jaeger, Richard M. – Educational Measurement: Issues and Practice, 1991
Issues concerning the selection of judges for standard setting are discussed. Determining the consistency of judges' recommendations, or their congruity with other expert recommendations, would help in selection. Enough judges must be chosen to allow estimation of recommendations by an entire population of judges. (SLD)
Descriptors: Cutting Scores, Evaluation Methods, Evaluators, Examiners