ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	2
Since 2006 (last 20 years)	13

Descriptor

Interrater Reliability	60
Standard Setting (Scoring)	60
Cutting Scores	30
Standards	23
Evaluators	20
Scoring	18
Test Items	18
Evaluation Methods	12
Higher Education	12
Judges	12
Minimum Competency Testing	12
Difficulty Level	11
Criterion Referenced Tests	10
Performance Based Assessment	10
Educational Assessment	9
Elementary Secondary Education	9
Licensing Examinations…	9
Error of Measurement	8
Minimum Competencies	8
Multiple Choice Tests	8
Comparative Analysis	7
Generalizability Theory	6
Item Analysis	6
Item Response Theory	6
Mathematics Tests	6

Source

Educational and Psychological…	7
Applied Measurement in…	6
Educational Measurement:…	5
Journal of Educational…	2
Measurement:…	2
Advances in Health Sciences…	1
Alberta Journal of…	1
Assessment in Education:…	1
Evaluation and the Health…	1
International Journal of…	1
Journal of Educational and…	1
Language Assessment Quarterly	1
Language Testing	1
Online Submission	1
Practical Assessment,…	1
Review of Educational Research	1

Publication Type

Reports - Research	35
Journal Articles	32
Speeches/Meeting Papers	27
Reports - Evaluative	25
Information Analyses	2
Opinion Papers	2
Reports - Descriptive	1

Education Level

Elementary Education	4
Intermediate Grades	4
Elementary Secondary Education	2
Grade 6	2
Grade 4	1
Grade 5	1
Grade 7	1
Higher Education	1
Junior High Schools	1
Middle Schools	1
Secondary Education	1

Audience

Researchers

Location

Taiwan	2
Australia	1
California	1
New Jersey	1
Pennsylvania	1
United Kingdom (Liverpool)	1

Assessments and Surveys

National Assessment of…	3
Alabama High School…	1
National Teacher Examinations	1
Praxis Series	1

Showing 1 to 15 of 60 results

Rounding in Angoff Ratings

Peer reviewed

Wyse, Adam E. – Practical Assessment, Research & Evaluation, 2018

One common modification to the Angoff standard-setting method is to have panelists round their ratings to the nearest 0.05 or 0.10 instead of 0.01. Several reasons have been offered as to why it may make sense to have panelists round their ratings to the nearest 0.05 or 0.10. In this article, we examine one reason that has been suggested, which is…

Descriptors: Interrater Reliability, Evaluation Criteria, Scoring Formulas, Achievement Rating

Regression Effects in Angoff Ratings: Examples from Credentialing Exams

Peer reviewed

Wyse, Adam E. – Applied Measurement in Education, 2018

This article discusses regression effects that are commonly observed in Angoff ratings where panelists tend to think that hard items are easier than they are and easy items are more difficult than they are in comparison to estimated item difficulties. Analyses of data from two credentialing exams illustrate these regression effects and the…

Descriptors: Regression (Statistics), Test Items, Difficulty Level, Licensing Examinations (Professions)

Non-Numeric Intrajudge Consistency Feedback in an Angoff Procedure

Peer reviewed

Harrison, George M. – Journal of Educational Measurement, 2015

The credibility of standard-setting cut scores depends in part on two sources of consistency evidence: intrajudge and interjudge consistency. Although intrajudge consistency feedback has often been provided to Angoff judges in practice, more evidence is needed to determine whether it achieves its intended effect. In this randomized experiment with…

Descriptors: Interrater Reliability, Standard Setting (Scoring), Cutting Scores, Feedback (Response)

Assessing the Viability of External Searchable Resources on the American Board of Family Medicine's Certification Examination

O'Neill, Thomas R.; Peabody, Michael R.; Stelter, Keith L.; Hagen, Michael D. – Online Submission, 2015

(Purpose) The purpose of our study was to assess the need for an external searchable resource to be used in conjunction with the American Board of Family Medicine's (ABFM) Maintenance of Certification for Family Physicians (MC-FP) Examination, discuss the philosophical question of whether an ESR should be allowed on the examination, and outline…

Descriptors: Licensing Examinations (Professions), Family Practice (Medicine), Physicians, Online Searching

Coming Full Circle in Standard Setting: A Commentary on Wyse

Peer reviewed

Skaggs, Gary – Measurement: Interdisciplinary Research and Perspectives, 2013

The construct map is a particularly good way to approach instrument development, and this author states that he was delighted to read Adam Wyse's thoughts about how to use construct maps for standard setting. For a number of popular standard-setting methods, Wyse shows how typical feedback to panelists fits within a construct map framework.…

Descriptors: Standard Setting (Scoring), Maps, Test Construction, Measurement

A Body of Work Standard-Setting Method with Construct Maps

Peer reviewed

Wyse, Adam E.; Bunch, Michael B.; Deville, Craig; Viger, Steven G. – Educational and Psychological Measurement, 2014

This article describes a novel variation of the Body of Work method that uses construct maps to overcome problems of transparency, rater inconsistency, and score gaps commonly occurring with the Body of Work method. The Body of Work method with construct maps was implemented to set cut-scores for two separate K-12 assessment programs in a large…

Descriptors: Standard Setting (Scoring), Educational Assessment, Elementary Secondary Education, Measurement

An Application of Multifaceted Rasch Measurement in the Yes/No Angoff Standard Setting Procedure

Peer reviewed

Hsieh, Mingchuan – Language Testing, 2013

When implementing standard setting procedures, there are two major concerns: variance between panelists and efficiency in conducting multiple rounds of judgments. With regard to the former, there is concern over the consistency of the cutoff scores made by different panelists. If the cut scores show an inordinately wide range then further rounds…

Descriptors: Item Response Theory, Standard Setting (Scoring), Language Tests, English (Second Language)

Construct Maps as a Foundation for Standard Setting

Peer reviewed

Wyse, Adam E. – Measurement: Interdisciplinary Research and Perspectives, 2013

Construct maps are tools that display how the underlying achievement construct upon which one is trying to set cut-scores is related to other information used in the process of standard setting. This article reviews what construct maps are, uses construct maps to provide a conceptual framework to view commonly used standard-setting procedures (the…

Descriptors: Standard Setting (Scoring), Maps, Cutting Scores, Methods

Comparing Yes/No Angoff and Bookmark Standard Setting Methods in the Context of English Assessment

Peer reviewed

Hsieh, Mingchuan – Language Assessment Quarterly, 2013

The Yes/No Angoff and Bookmark methods for setting standards on educational assessments are currently two of the most popular standard-setting methods. However, there is no research into the comparability of these two methods in the context of language assessment. This study compared results from the Yes/No Angoff and Bookmark methods as applied to…

Descriptors: Standard Setting (Scoring), Comparative Analysis, Language Tests, Multiple Choice Tests

The Centrality of Teachers' Judgement Practice in Assessment: A Study of Standards in Moderation

Peer reviewed

Wyatt-Smith, Claire; Klenowski, Val; Gunn, Stephanie – Assessment in Education: Principles, Policy & Practice, 2010

There is a strong quest in several countries including Australia for greater national consistency in education and intensifying interest in standards for reporting. Given this, it is important to make explicit the intended and unintended consequences of assessment reform strategies and the pressures to pervert and conform. In a policy context that…

Descriptors: Foreign Countries, Student Evaluation, Teacher Attitudes, Teachers

Estimating the Minimum Number of Judges Required for Test-Centred Standard Setting on Written Assessments. Do Discussion and Iteration Have an Influence?

Peer reviewed

Fowell, S. L.; Fewtrell, R.; McLaughlin, P. J. – Advances in Health Sciences Education, 2008

Absolute standard setting procedures are recommended for assessment in medical education. Absolute, test-centred standard setting procedures were introduced for written assessments in the Liverpool MBChB in 2001. The modified Angoff and Ebel methods have been used for short answer question-based and extended matching question-based papers,…

Descriptors: Medical Education, Standard Setting (Scoring), Judges, Interrater Reliability

Objective Standard Setting for Judge-Mediated Examinations

Peer reviewed

Stone, Gregory Ethan; Beltyukova, Svetlana; Fox, Christine M. – International Journal of Testing, 2008

Judge-mediated examinations are defined as those for which expert evaluation (using rubrics) is required to determine correctness, completeness, and reasonability of test-taker responses. The use of multifaceted Rasch modeling has led to improvements in the reliability of scoring such examinations. The establishment of criterion-referenced…

Descriptors: Interrater Reliability, High Stakes Tests, Standard Setting, Minimum Competencies

How Many Raters Should Be Used for Establishing Cutoff Scores with the Angoff Method? A Generalizability Theory Study.

Peer reviewed

Hurtz, Gregory M.; Hertz, Norman R. – Educational and Psychological Measurement, 1999

Evaluated Angoff ratings from eight different occupational licensing examinations through generalizability theory to estimate the optimal number of raters. Results indicate that approximately 10 to 15 raters is an optimal target range. (SLD)

Descriptors: Cutting Scores, Evaluators, Generalizability Theory, Interrater Reliability

Setting Cut-Scores for Complex Performance Assessments: A Critical Examination of the Analytic Judgment Method

Peer reviewed

Abbott, Marilyn L. – Alberta Journal of Educational Research, 2006

The purpose of this article is to promote an increased awareness of the processes for setting cut-scores for complex performance assessments by (a) describing the Analytic Judgment Method (AJM) for setting cut-scores, and (b) critically evaluating the technical adequacy and practicability of the AJM by focusing on one investigation where the AJM…

Descriptors: Interrater Reliability, Cutting Scores, Performance Based Assessment, Standard Setting (Scoring)

Reconciling Experts' Differences in Setting Cut Scores for Pass-Fail Decisions.

Peer reviewed

Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 1996

Data from two standard-setting exercises were analyzed using the logistic regression model that assumes no variation in severity of raters, and results were compared with those obtained by logistic regression that allowed for severity variation. Results illustrate the importance of taking between-rater differences into account. (SLD)

Descriptors: Cutting Scores, Decision Making, Evaluators, Individual Differences

Author

Plake, Barbara S.	6
Jaeger, Richard M.	4
Wyse, Adam E.	4
Chang, Lei	3
Busch, John Christian	2
Cope, Ronald T.	2
Hambleton, Ronald K.	2
Hsieh, Mingchuan	2
Linn, Robert L.	2
McGinty, Dixie	2
Neel, John H.	2
Reid, Jerry B.	2
Abbott, Marilyn L.	1
Angoff, William H.	1
Behuniak, Peter, Jr.	1
Beltyukova, Svetlana	1
Berk, Ronald A.	1
Bunch, Michael B.	1
De Champlain, Andre F.	1
DeMauro, Gerald E.	1
Deville, Craig	1
Fehrmann, Melinda L.	1
Fewtrell, R.	1
Fowell, S. L.	1