Publication Date
In 2025 | 2 |
Since 2024 | 4 |
Since 2021 (last 5 years) | 11 |
Since 2016 (last 10 years) | 50 |
Since 2006 (last 20 years) | 150 |
Descriptor
Standard Setting (Scoring) | 502 |
Cutting Scores | 228 |
Standards | 165 |
Elementary Secondary Education | 107 |
Test Items | 92 |
Evaluation Methods | 90 |
Academic Standards | 79 |
Scoring | 75 |
Minimum Competency Testing | 70 |
Licensing Examinations… | 66 |
Educational Assessment | 64 |
Location
Canada | 10 |
Australia | 8 |
Tennessee | 8 |
United Kingdom | 7 |
California | 4 |
Kansas | 4 |
Massachusetts | 4 |
New Jersey | 4 |
United States | 4 |
Illinois | 3 |
Michigan | 3 |
Popham, W. James – Measurement: Interdisciplinary Research and Perspectives, 2013
The author recalls that as a child, he grooved on teeter-totters. Also known as a seesaw, a teeter-totter is a long, narrow board that's elevated with a pivot point in the middle so that as one end goes down the other end goes up. When going up or going down, sometimes quite rapidly, teeter-totters can provide their two riders with some…
Descriptors: Standard Setting (Scoring), Maps, Performance, Standards
Zapp, Mike; Helgetun, Jo B.; Powell, Justin J. W. – European Journal of Education, 2018
Educational research in Norway has experienced unprecedented structural expansion and cognitive shifts over the last two decades because of greater state investments and the strategic use of extensive and multi-year thematic programmes to fund research projects. Using a neo-institutionalist framework, we examine institutionalisation dynamics in…
Descriptors: Foreign Countries, Educational Research, Institutional Characteristics, Organizational Change
Torres Irribarra, David; Diakow, Ronli; Freund, Rebecca; Wilson, Mark – Grantee Submission, 2015
This paper presents the Latent Class Level-PCM as a method for identifying and interpreting latent classes of respondents according to empirically estimated performance levels. The model, which combines elements from latent class models and reparameterized partial credit models for polytomous data, can simultaneously (a) identify empirical…
Descriptors: Item Response Theory, Test Items, Statistical Analysis, Models
Grabovsky, Irina; Wainer, Howard – Journal of Educational and Behavioral Statistics, 2017
In this essay, we describe the construction and use of the Cut-Score Operating Function in aiding standard setting decisions. The Cut-Score Operating Function shows the relation between the cut-score chosen and the consequent error rate. It allows error rates to be defined by multiple loss functions and will show the behavior of each loss…
Descriptors: Cutting Scores, Standard Setting (Scoring), Decision Making, Error Patterns
Foley, Brett P. – Practical Assessment, Research & Evaluation, 2016
There is always a chance that examinees will answer multiple choice (MC) items correctly by guessing. Design choices in some modern exams have created situations where guessing at random through the full exam--rather than only for a subset of items where the examinee does not know the answer--can be an effective strategy to pass the exam. This…
Descriptors: Guessing (Tests), Multiple Choice Tests, Case Studies, Test Construction
Jiang, Yu; Zhang, Jiahui; Xin, Tao – Journal of Educational and Behavioral Statistics, 2019
This article is an overview of the National Assessment of Education Quality (NAEQ) of China in reading, mathematics, sciences, arts, physical education, and moral education at Grades 4 and 8. After a review of the background and history of NAEQ, we present the assessment framework with students' holistic development at the core and the design for…
Descriptors: Foreign Countries, Educational Quality, Educational Improvement, National Competency Tests
Margolis, Melissa J.; Clauser, Brian E. – Educational Measurement: Issues and Practice, 2014
This research evaluated the impact of a common modification to Angoff standard-setting exercises: the provision of examinee performance data. Data from 18 independent standard-setting panels across three different medical licensing examinations were examined to investigate whether and how the provision of performance information impacted judgments…
Descriptors: Cutting Scores, Standard Setting (Scoring), Data, Licensing Examinations (Professions)
Wyse, Adam E. – Educational Measurement: Issues and Practice, 2015
This article uses data from a large-scale assessment program to illustrate the potential issue of range restriction with the Bookmark method in the context of trying to set cut scores to closely align with a set of college and career readiness benchmarks. Analyses indicated that range restriction issues existed across different response…
Descriptors: Cutting Scores, Alignment (Education), College Readiness, Career Readiness
Nebraska Department of Education, 2018
The 2018 Nebraska Student-Centered Assessment System (NSCAS) Summative technical report documents the processes and procedures implemented to support the Spring 2018 NSCAS Summative English Language Arts (ELA), Mathematics, and Science assessments by NWEA under the supervision of the Nebraska Department of Education (NDE). The technical report…
Descriptors: Summative Evaluation, Language Tests, English, Mathematics Tests
Eckes, Thomas – Language Testing, 2017
This paper presents an approach to standard setting that combines the prototype group method (PGM; Eckes, 2012) with a receiver operating characteristic (ROC) analysis. The combined PGM-ROC approach is applied to setting cut scores on a placement test of English as a foreign language (EFL). To implement the PGM, experts first named learners whom…
Descriptors: English (Second Language), Language Tests, Cutting Scores, Standard Setting (Scoring)
Kane, Michael T.; Tannenbaum, Richard J. – Measurement: Interdisciplinary Research and Perspectives, 2013
The authors observe in this commentary that construct maps can help standard-setting panels to make realistic and internally consistent recommendations for performance-level descriptions (PLDs) and cut-scores, but the benefits may not be realized if policymakers do not fully understand the rationale for the recommendations provided by the…
Descriptors: Standard Setting (Scoring), Maps, Cutting Scores, Policy
Bunch, Michael B. – Measurement: Interdisciplinary Research and Perspectives, 2013
In this issue of "Measurement: Interdisciplinary Research and Perspectives," Adam E. Wyse provides a thorough review of research to date on the use of construct maps in standard setting. He juxtaposes concepts and methods in ways that make their connections to one another clearer and more obvious than they might otherwise have been. In…
Descriptors: Standard Setting (Scoring), Maps, Validity, Design
Davis-Becker, Susan – Measurement: Interdisciplinary Research and Perspectives, 2013
In his article "Construct Maps for Standard Setting," Adam E. Wyse provides a detailed review on the current use of construct maps in standard setting, including how they may be operationalized within a variety of standard-setting methods. The premise of the argument is that construct maps can serve as a useful tool for conducting a…
Descriptors: Standard Setting (Scoring), Maps, Sample Size, Comprehension
Clauser, Jerome C.; Clauser, Brian E.; Hambleton, Ronald K. – Applied Measurement in Education, 2014
The purpose of the present study was to extend past work with the Angoff method for setting standards by examining judgments at the judge level rather than the panel level. The focus was on investigating the relationship between observed Angoff standard setting judgments and empirical conditional probabilities. This relationship has been used as a…
Descriptors: Standard Setting (Scoring), Validity, Reliability, Correlation
Margolis, Melissa J.; Mee, Janet; Clauser, Brian E.; Winward, Marcia; Clauser, Jerome C. – Educational Measurement: Issues and Practice, 2016
Evidence to support the credibility of standard setting procedures is a critical part of the validity argument for decisions made based on tests that are used for classification. One area in which there has been limited empirical study is the impact of standard setting judge selection on the resulting cut score. One important issue related to…
Descriptors: Academic Standards, Standard Setting (Scoring), Cutting Scores, Credibility