ERIC - Search Results

Publication Date

In 2025	1
Since 2024	1
Since 2021 (last 5 years)	2
Since 2016 (last 10 years)	16
Since 2006 (last 20 years)	37

Descriptor

Standard Setting (Scoring)	92
Test Items	92
Cutting Scores	46
Difficulty Level	33
Standards	25
Licensing Examinations…	22
Scoring	21
Interrater Reliability	18
Test Construction	17
Criterion Referenced Tests	15
Judges	13
Minimum Competency Testing	13
Item Analysis	12
Evaluation Methods	11
Evaluators	11
Item Response Theory	11
Comparative Analysis	10
Higher Education	10
Mathematics Tests	10
Probability	10
Scores	10
Test Validity	10
Certification	8
Elementary Secondary Education	8
Error of Measurement	8
More ▼

Publication Type

Reports - Research	55
Journal Articles	44
Speeches/Meeting Papers	36
Reports - Evaluative	22
Reports - Descriptive	7
Guides - General	3
Tests/Questionnaires	3
Information Analyses	2
Opinion Papers	2
Dissertations/Theses -…	1
Guides - Classroom - Teacher	1
Guides - Non-Classroom	1
More ▼

Education Level

Secondary Education	7
Higher Education	5
Junior High Schools	5
Middle Schools	5
Postsecondary Education	5
Elementary Education	4
Elementary Secondary Education	3
Grade 8	3
Grade 5	2
Grade 11	1
Grade 3	1
Grade 4	1
Grade 7	1
High Schools	1
Intermediate Grades	1
More ▼

Audience

Researchers	4
Practitioners	1

Location

Tennessee	4
New Jersey	2
United Kingdom	2
California	1
Canada	1
China	1
Germany	1
Massachusetts	1
Taiwan	1
Turkey	1

Laws, Policies, & Programs

Comprehensive Education…	2
Education Consolidation…	1
No Child Left Behind Act 2001	1

Assessments and Surveys

National Assessment of…	4
National Teacher Examinations	2
Advanced Placement…	1
Massachusetts Comprehensive…	1
Praxis Series	1

What Works Clearinghouse Rating

Showing 1 to 15 of 92 results Save | Export

Embedding Embedded Standard Setting: An Application of Cross-Classified Item Response Theory. CRESST Report 876

Download full text

Yun-Kyung Kim; Li Cai – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2025

This paper introduces an application of cross-classified item response theory (IRT) modeling to an assessment utilizing the embedded standard setting (ESS) method (Lewis & Cook). The cross-classified IRT model is used to treat both item and person effects as random, where the item effects are regressed on the target performance levels (target…

Descriptors: Standard Setting (Scoring), Item Response Theory, Test Items, Difficulty Level

Setting and Validating Multiple Standards on a Multistage-Adaptive Test

Peer reviewed

Direct link

Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022

Setting cut scores on (MSTs) is difficult, particularly when the test spans several grade levels, and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…

Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis

Using Diagnostic Profiles to Describe Borderline Performance in Standard Setting

Peer reviewed

Direct link

Skaggs, Gary; Hein, Serge F.; Wilkins, Jesse L. M. – Educational Measurement: Issues and Practice, 2020

In test-centered standard-setting methods, borderline performance can be represented by many different profiles of strengths and weaknesses. As a result, asking panelists to estimate item or test performance for a hypothetical group study of borderline examinees, or a typical borderline examinee, may be an extremely difficult task and one that can…

Descriptors: Standard Setting (Scoring), Cutting Scores, Testing Problems, Profiles

Examining the Impact of a Consensus Approach to Content Alignment Studies

Peer reviewed
PDF on ERIC

Download full text

Russell, Michael; Moncaleano, Sebastian – Practical Assessment, Research & Evaluation, 2020

Although both content alignment and standard-setting procedures rely on content-expert panel judgements, only the latter employs discussion among panel members. This study employed a modified form of the Webb methodology to examine content alignment for twelve tests administered as part of the Massachusetts Comprehensive Assessment System (MCAS).…

Descriptors: Test Content, Test Items, Discussion, Test Validity

It's Not Just Angoff: Misperceptions of Hard and Easy Items in Bookmark-Type Ratings

Peer reviewed

Direct link

Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020

A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, a limited amount of research has investigated panelist's ability to perform well the Bookmark method, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…

Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items

Comparison of Passing Scores Determined by the Angoff Method in Different Item Samples

Peer reviewed
PDF on ERIC

Download full text

Kara, Hakan; Cetin, Sevda – International Journal of Assessment Tools in Education, 2020

In this study, the efficiency of various random sampling methods to reduce the number of items rated by judges in an Angoff standard-setting study was examined and the methods were compared with each other. Firstly, the full-length test was formed by combining Placement Test 2012 and 2013 mathematics subsets. After then, simple random sampling…

Descriptors: Cutting Scores, Standard Setting (Scoring), Sampling, Error of Measurement

Comparing Small-Sample Equating with Angoff Judgement for Linking Cut-Scores on Two Tests

Download full text

Bramley, Tom – Research Matters, 2020

The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…

Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy

New Meridian Comparability Review Guidelines. Version 6.17.2020

Download full text

New Meridian Corporation, 2020

New Meridian Corporation has developed the "Quality Testing Standards and Criteria for Comparability Claims" (QTS). The goal of the QTS is to provide guidance to states that are interested in including content from the New Meridian item bank and intend to make comparability claims with "other assessments" that include New…

Descriptors: Testing, Standards, Comparative Analysis, Guidelines

Quality Testing Standards and Criteria for Comparability Claims. Version 6.17.2020

Download full text

New Meridian Corporation, 2020

Descriptors: Testing, Standards, Comparative Analysis, Guidelines

A Comparison of Two Standard-Setting Methods for Tests Consisting of Constructed-Response Items

Peer reviewed
PDF on ERIC

Download full text

Ozarkan, Hatun Betul; Dogan, Celal Deha – Eurasian Journal of Educational Research, 2020

Purpose: This study aimed to compare the cut scores obtained by the Extended Angoff and Contrasting Groups methods for an achievement test consisting of constructed-response items. Research Methods: This study was based on survey research design. In the collection of data, the study group of the research consisted of eight mathematics teachers for…

Descriptors: Standard Setting (Scoring), Responses, Test Items, Cutting Scores

The Cause-Effect Relation of Latent Variables in Scientific Multi-Text Reading Comprehension: Testing the Sequential Mediation Model

Peer reviewed
PDF on ERIC

Download full text

Hsiao-Hui Lin; Tzeng, Yuh-Tsuen; Chen, Hsueh-Chih; Huang, Yao-Hsuan – Reading & Writing: Journal of the Reading Association of South Africa, 2020

Background: The issue of science is seldom brought into focus because of the way developing assessments of students' multiple text reading comprehension. Objectives: This study tested the sequential mediation model of scientific multi-text reading comprehension (SMTRC) by means of structural equation modelling (SEM), and aimed to advance the…

Descriptors: Science Education, Reading Comprehension, Reading Tests, Construct Validity

Consistency of Angoff-Based Standard-Setting Judgments: Are Item Judgments and Passing Scores Replicable across Different Panels of Experts?

Peer reviewed

Direct link

Tannenbaum, Richard J.; Kannan, Priya – Educational Assessment, 2015

Angoff-based standard setting is widely used, especially for high-stakes licensure assessments. Nonetheless, some critics have claimed that the judgment task is too cognitively complex for panelists, whereas others have explicitly challenged the consistency in (replicability of) standard-setting outcomes. Evidence of consistency in item judgments…

Descriptors: Standard Setting (Scoring), Reliability, Scores, Licensing Examinations (Professions)

Regression Effects in Angoff Ratings: Examples from Credentialing Exams

Peer reviewed

Direct link

Wyse, Adam E. – Applied Measurement in Education, 2018

This article discusses regression effects that are commonly observed in Angoff ratings where panelists tend to think that hard items are easier than they are and easy items are more difficult than they are in comparison to estimated item difficulties. Analyses of data from two credentialing exams illustrate these regression effects and the…

Descriptors: Regression (Statistics), Test Items, Difficulty Level, Licensing Examinations (Professions)

"Quality Testing Standards" -- A Starter Kit for States. Version 6.17.2020

Download full text

New Meridian Corporation, 2020

New Meridian Corporation has developed the "Quality Testing Standards and Criteria for Comparability Claims" (QTS) to provide guidance to states that are interested in including New Meridian content and would like to either keep reporting scores on the New Meridian Scale or use the New Meridian performance levels; that is, the state…

Descriptors: Testing, Standards, Comparative Analysis, Test Content

Modeling for Directly Setting Theory-Based Performance Levels

Peer reviewed
PDF on ERIC

Download full text

Torres Irribarra, David; Diakow, Ronli; Freund, Rebecca; Wilson, Mark – Grantee Submission, 2015

This paper presents the Latent Class Level-PCM as a method for identifying and interpreting latent classes of respondents according to empirically estimated performance levels. The model, which combines elements from latent class models and reparameterized partial credit models for polytomous data, can simultaneously (a) identify empirical…

Descriptors: Item Response Theory, Test Items, Statistical Analysis, Models

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6 | 7

Applied Measurement in…	9
Educational Measurement:…	5
Educational and Psychological…	4
Journal of Educational…	4
Practical Assessment,…	4
International Journal of…	3
New Meridian Corporation	3
Evaluation and the Health…	2
Alberta Journal of…	1
Assessment & Evaluation in…	1
Educational Assessment	1
Educational Evaluation and…	1
Educational Sciences: Theory…	1
Eurasian Journal of…	1
Grantee Submission	1
Higher Education Studies	1
International Journal of…	1
Journal of Educational and…	1
Measurement:…	1
National Center for Research…	1
Online Submission	1
ProQuest LLC	1
Reading & Writing: Journal of…	1
Research Matters	1
More ▼

Plake, Barbara S.	11
Impara, James C.	5
Ferdous, Abdullah A.	4
Buckendahl, Chad W.	3
Chang, Lei	3
Hambleton, Ronald K.	3
Kannan, Priya	3
Tannenbaum, Richard J.	3
Wyse, Adam E.	3
Bowman, Harry L.	2
Clauser, Brian E.	2
Gerrow, Jack	2
Jaeger, Richard M.	2
Margolis, Melissa J.	2
McGinty, Dixie	2
Neel, John H.	2
Sgammato, Adrienne	2
Wright, Benjamin D.	2
Arce, Alvaro J.	1
Babcock, Ben	1
Beretvas, S. Natasha	1
Bramley, Tom	1
Brandon, Paul R.	1
Butler, E. Dean	1
More ▼