Showing all 7 results
Peer reviewed
Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022
Setting cut scores on multistage tests (MSTs) is difficult, particularly when the test spans several grade levels, and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…
Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis
Peer reviewed
Skaggs, Gary; Hein, Serge F.; Wilkins, Jesse L. M. – Educational Measurement: Issues and Practice, 2020
In test-centered standard-setting methods, borderline performance can be represented by many different profiles of strengths and weaknesses. As a result, asking panelists to estimate item or test performance for a hypothetical group of borderline examinees, or a typical borderline examinee, may be an extremely difficult task and one that can…
Descriptors: Standard Setting (Scoring), Cutting Scores, Testing Problems, Profiles
Peer reviewed
Lewis, Daniel; Cook, Robert – Educational Measurement: Issues and Practice, 2020
In this paper we assert that the practice of principled assessment design renders traditional standard-setting methodology redundant at best and contradictory at worst. We describe the rationale for, and methodological details of, Embedded Standard Setting (ESS; previously, Engineered Cut Scores; Lewis, 2016), an approach to establish performance…
Descriptors: Standard Setting, Evaluation, Cutting Scores, Performance Based Assessment
Peer reviewed
Wyse, Adam E. – Educational Measurement: Issues and Practice, 2017
This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a posteriori (EAP), modal a posteriori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
Descriptors: Cutting Scores, Item Response Theory, Bayesian Statistics, Maximum Likelihood Statistics
Peer reviewed
Margolis, Melissa J.; Mee, Janet; Clauser, Brian E.; Winward, Marcia; Clauser, Jerome C. – Educational Measurement: Issues and Practice, 2016
Evidence to support the credibility of standard setting procedures is a critical part of the validity argument for decisions made based on tests that are used for classification. One area in which there has been limited empirical study is the impact of standard setting judge selection on the resulting cut score. One important issue related to…
Descriptors: Academic Standards, Standard Setting (Scoring), Cutting Scores, Credibility
Peer reviewed
Tiffin-Richards, Simon P.; Pant, Hans Anand; Koller, Olaf – Educational Measurement: Issues and Practice, 2013
Cut-scores were set by expert judges on assessments of reading and listening comprehension of English as a foreign language (EFL), using the bookmark standard-setting method to differentiate proficiency levels defined by the Common European Framework of Reference (CEFR). Assessments contained stratified item samples drawn from extensive item…
Descriptors: Foreign Countries, English (Second Language), Language Tests, Standard Setting (Scoring)
Peer reviewed
Ferrara, Steve; Svetina, Dubravka; Skucha, Sylvia; Davidson, Anne H. – Educational Measurement: Issues and Practice, 2011
Items on test score scales located at and below the Proficient cut score define the content area knowledge and skills required to achieve proficiency. Alternately, examinees who perform at the Proficient level on a test can be expected to be able to demonstrate that they have mastered most of the knowledge and skills represented by the items at…
Descriptors: Knowledge Level, Mathematics Tests, Program Effectiveness, Inferences