Showing 46 to 60 of 505 results
Peer reviewed
Benton, Tom; Elliott, Gill – Research Papers in Education, 2016
In recent years the use of expert judgement to set and maintain examination standards has been increasingly criticised in favour of approaches based on statistical modelling. This paper reviews existing research on this controversy and attempts to unify the evidence within a framework where expertise is utilised in the form of comparative…
Descriptors: Reliability, Expertise, Mathematical Models, Standard Setting (Scoring)
Peer reviewed
Harrison, George M. – Journal of Educational Measurement, 2015
The credibility of standard-setting cut scores depends in part on two sources of consistency evidence: intrajudge and interjudge consistency. Although intrajudge consistency feedback has often been provided to Angoff judges in practice, more evidence is needed to determine whether it achieves its intended effect. In this randomized experiment with…
Descriptors: Interrater Reliability, Standard Setting (Scoring), Cutting Scores, Feedback (Response)
Peer reviewed
Clauser, Jerome C.; Margolis, Melissa J.; Clauser, Brian E. – Journal of Educational Measurement, 2014
Evidence of stable standard setting results over panels or occasions is an important part of the validity argument for an established cut score. Unfortunately, due to the high cost of convening multiple panels of content experts, standards often are based on the recommendation from a single panel of judges. This approach implicitly assumes that…
Descriptors: Standard Setting (Scoring), Generalizability Theory, Replication (Evaluation), Cutting Scores
Peer reviewed
Zapp, Mike; Helgetun, Jo B.; Powell, Justin J. W. – European Journal of Education, 2018
Educational research in Norway has experienced unprecedented structural expansion and cognitive shifts over the last two decades because of greater state investments and the strategic use of extensive and multi-year thematic programmes to fund research projects. Using a neo-institutionalist framework, we examine institutionalisation dynamics in…
Descriptors: Foreign Countries, Educational Research, Institutional Characteristics, Organizational Change
Peer reviewed
Jiang, Yu; Zhang, Jiahui; Xin, Tao – Journal of Educational and Behavioral Statistics, 2019
This article is an overview of the National Assessment of Education Quality (NAEQ) of China in reading, mathematics, sciences, arts, physical education, and moral education at Grades 4 and 8. After a review of the background and history of NAEQ, we present the assessment framework with students' holistic development at the core and the design for…
Descriptors: Foreign Countries, Educational Quality, Educational Improvement, National Competency Tests
Peer reviewed
Grabovsky, Irina; Wainer, Howard – Journal of Educational and Behavioral Statistics, 2017
In this essay, we describe the construction and use of the Cut-Score Operating Function in aiding standard setting decisions. The Cut-Score Operating Function shows the relation between the cut-score chosen and the consequent error rate. It allows error rates to be defined by multiple loss functions and will show the behavior of each loss…
Descriptors: Cutting Scores, Standard Setting (Scoring), Decision Making, Error Patterns
Peer reviewed
Torres Irribarra, David; Diakow, Ronli; Freund, Rebecca; Wilson, Mark – Grantee Submission, 2015
This paper presents the Latent Class Level-PCM as a method for identifying and interpreting latent classes of respondents according to empirically estimated performance levels. The model, which combines elements from latent class models and reparameterized partial credit models for polytomous data, can simultaneously (a) identify empirical…
Descriptors: Item Response Theory, Test Items, Statistical Analysis, Models
Peer reviewed
Foley, Brett P. – Practical Assessment, Research & Evaluation, 2016
There is always a chance that examinees will answer multiple choice (MC) items correctly by guessing. Design choices in some modern exams have created situations where guessing at random through the full exam--rather than only for a subset of items where the examinee does not know the answer--can be an effective strategy to pass the exam. This…
Descriptors: Guessing (Tests), Multiple Choice Tests, Case Studies, Test Construction
Peer reviewed
Popham, W. James – Measurement: Interdisciplinary Research and Perspectives, 2013
The author recalls that as a child, he grooved on teeter-totters. Also known as a seesaw, a teeter-totter is a long, narrow board that's elevated with a pivot point in the middle so that as one end goes down the other end goes up. When going up or going down, sometimes quite rapidly, teeter-totters can provide their two riders with some…
Descriptors: Standard Setting (Scoring), Maps, Performance, Standards
Peer reviewed
Margolis, Melissa J.; Clauser, Brian E. – Educational Measurement: Issues and Practice, 2014
This research evaluated the impact of a common modification to Angoff standard-setting exercises: the provision of examinee performance data. Data from 18 independent standard-setting panels across three different medical licensing examinations were examined to investigate whether and how the provision of performance information impacted judgments…
Descriptors: Cutting Scores, Standard Setting (Scoring), Data, Licensing Examinations (Professions)
Nebraska Department of Education, 2018
The 2018 Nebraska Student-Centered Assessment System (NSCAS) Summative technical report documents the processes and procedures implemented to support the Spring 2018 NSCAS Summative English Language Arts (ELA), Mathematics, and Science assessments by NWEA under the supervision of the Nebraska Department of Education (NDE). The technical report…
Descriptors: Summative Evaluation, Language Tests, English, Mathematics Tests
Peer reviewed
Wyse, Adam E. – Educational Measurement: Issues and Practice, 2015
This article uses data from a large-scale assessment program to illustrate the potential issue of range restriction with the Bookmark method in the context of trying to set cut scores to closely align with a set of college and career readiness benchmarks. Analyses indicated that range restriction issues existed across different response…
Descriptors: Cutting Scores, Alignment (Education), College Readiness, Career Readiness
Peer reviewed
Eckes, Thomas – Language Testing, 2017
This paper presents an approach to standard setting that combines the prototype group method (PGM; Eckes, 2012) with a receiver operating characteristic (ROC) analysis. The combined PGM-ROC approach is applied to setting cut scores on a placement test of English as a foreign language (EFL). To implement the PGM, experts first named learners whom…
Descriptors: English (Second Language), Language Tests, Cutting Scores, Standard Setting (Scoring)
Peer reviewed
Kane, Michael T.; Tannenbaum, Richard J. – Measurement: Interdisciplinary Research and Perspectives, 2013
The authors observe in this commentary that construct maps can help standard-setting panels to make realistic and internally consistent recommendations for performance-level descriptions (PLDs) and cut-scores, but the benefits may not be realized if policymakers do not fully understand the rationale for the recommendations provided by the…
Descriptors: Standard Setting (Scoring), Maps, Cutting Scores, Policy
Peer reviewed
Bunch, Michael B. – Measurement: Interdisciplinary Research and Perspectives, 2013
In this issue of "Measurement: Interdisciplinary Research and Perspectives," Adam E. Wyse provides a thorough review of research to date on the use of construct maps in standard setting. He juxtaposes concepts and methods in ways that make their connections to one another clearer and more obvious than they might otherwise have been. In…
Descriptors: Standard Setting (Scoring), Maps, Validity, Design