Showing 1 to 15 of 24 results
Peer reviewed
Dimitrov, Dimiter M. – Educational and Psychological Measurement, 2022
Proposed is a new method of standard setting referred to as the response vector for mastery (RVM) method. Under the RVM method, the task of panelists who participate in the standard setting process does not involve conceptualization of a borderline examinee or probability judgments, as is the case with the Angoff and bookmark methods. Also, the…
Descriptors: Standard Setting (Scoring), Cutting Scores, Computation, Mastery Learning
Peer reviewed
Liu, Ren; Qian, Hong; Luo, Xiao; Woo, Ada – Educational and Psychological Measurement, 2018
Subscore reporting under item response theory models has always been a challenge, partly because the test length within each subdomain is too short to precisely locate individuals on multiple continua. Diagnostic classification models (DCMs), providing a pass/fail decision and associated probability of pass on each subdomain, are promising…
Descriptors: Classification, Probability, Pass Fail Grading, Scores
Peer reviewed
Wyse, Adam E.; Bunch, Michael B.; Deville, Craig; Viger, Steven G. – Educational and Psychological Measurement, 2014
This article describes a novel variation of the Body of Work method that uses construct maps to overcome problems of transparency, rater inconsistency, and score gaps commonly occurring with the Body of Work method. The Body of Work method with construct maps was implemented to set cut-scores for two separate K-12 assessment programs in a large…
Descriptors: Standard Setting (Scoring), Educational Assessment, Elementary Secondary Education, Measurement
Peer reviewed
Kaliski, Pamela K.; Wind, Stefanie A.; Engelhard, George, Jr.; Morgan, Deanna L.; Plake, Barbara S.; Reshetar, Rosemary A. – Educational and Psychological Measurement, 2013
The many-faceted Rasch (MFR) model has been used to evaluate the quality of ratings on constructed response assessments; however, it can also be used to evaluate the quality of judgments from panel-based standard setting procedures. The current study illustrates the use of the MFR model for examining the quality of ratings obtained from a standard…
Descriptors: Item Response Theory, Models, Standard Setting (Scoring), Science Tests
Peer reviewed
Skaggs, Gary; Hein, Serge F. – Educational and Psychological Measurement, 2011
Judgmental standard setting methods have been criticized for the cognitive complexity of the judgment task that panelists are asked to complete. This study compared two methods designed to reduce this complexity: the yes/no method and the single-passage bookmark method. Two mock standard setting panel meetings were convened, one for each method,…
Descriptors: Standard Setting (Scoring), Methods, Cutting Scores, Experienced Teachers
Peer reviewed
Wyse, Adam E. – Educational and Psychological Measurement, 2011
Standard setting is a method used to set cut scores on large-scale assessments. One of the most popular standard setting methods is the Bookmark method. In the Bookmark method, panelists are asked to envision a response probability (RP) criterion and move through a booklet of ordered items based on that criterion. This study investigates whether…
Descriptors: Testing Programs, Standard Setting (Scoring), Cutting Scores, Probability
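For context, the bookmark placement described in the entry above maps to a cut score on the ability scale through the RP criterion. A minimal sketch, assuming a Rasch model and hypothetical item difficulties (not values from the study):

```python
import math

# Under a Rasch model, the ability theta at which an item with difficulty b
# is answered correctly with probability RP satisfies:
#   RP = 1 / (1 + exp(-(theta - b)))  =>  theta = b + ln(RP / (1 - RP))
def theta_at_rp(b, rp):
    return b + math.log(rp / (1.0 - rp))

# Hypothetical ordered item booklet (Rasch difficulties, easiest first)
difficulties = [-1.2, -0.5, 0.1, 0.7, 1.4]

# Suppose a panelist bookmarks the 4th item (index 3) using the common
# RP67 criterion (probability 2/3 of a correct response)
cut_theta = theta_at_rp(difficulties[3], 2 / 3)
print(round(cut_theta, 3))  # → 1.393
```

The same mapping explains why the choice of RP criterion matters: raising RP shifts every implied cut score upward by the same logit amount.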
Peer reviewed
Stone, Gregory Ethan; Koskey, Kristin L. K.; Sondergeld, Toni A. – Educational and Psychological Measurement, 2011
Typical validation studies on standard setting models, most notably the Angoff and modified Angoff models, have ignored construct development, a critical aspect associated with all conceptualizations of measurement processes. Stone compared the Angoff and objective standard setting (OSS) models and found that Angoff failed to define a legitimate…
Descriptors: Cutting Scores, Standard Setting (Scoring), Models, Construct Validity
Peer reviewed
MacCann, Robert G. – Educational and Psychological Measurement, 2008
It is shown that the Angoff and bookmarking cut scores are examples of true score equating that in the real world must be applied to observed scores. In the context of defining minimal competency, the percentage "failed" by such methods is a function of the length of the measuring instrument. It is argued that this length is largely…
Descriptors: True Scores, Cutting Scores, Minimum Competencies, Scores
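The dependence on test length noted in the entry above can be illustrated with an observed-score binomial model: an examinee whose true proportion-correct sits just above the cut fails less often as the test lengthens. A hypothetical sketch (values are illustrative, not from the article):

```python
from math import comb

# Probability that an examinee with true proportion-correct p fails a test
# of n items, given a cut set at proportion c of the items (binomial model).
def fail_prob(p, c, n):
    cut = int(c * n)  # minimum number correct needed to pass
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(cut))

# An examinee just above a 0.60 cut: failure rate shrinks with test length
for n in (20, 50, 100):
    print(n, round(fail_prob(0.65, 0.60, n), 3))
```

Because the binomial standard error of the observed proportion shrinks as 1/sqrt(n), misclassification near the cut is partly an artifact of instrument length, which is the article's point.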
Peer reviewed
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2008
Even when the scoring of an examination is based on item response theory (IRT), standard-setting methods seldom use this information directly when determining the minimum passing score (MPS) for an examination from an Angoff-based standard-setting study. Often, when IRT scoring is used, the MPS value for a test is converted to an IRT-based theta…
Descriptors: Standard Setting (Scoring), Scoring, Cutting Scores, Item Response Theory
Peer reviewed
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2007
In an Angoff standard setting procedure, judges estimate the probability that a hypothetical randomly selected minimally competent candidate will answer correctly each item in the test. In many cases, these item performance estimates are made twice, with information shared with the panelists between estimates. Especially for long tests, this…
Descriptors: Test Items, Probability, Item Analysis, Standard Setting (Scoring)
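The Angoff procedure summarized in the entry above reduces to a simple computation once judges' item probabilities are in hand: the minimum passing score is commonly taken as the sum, across items, of the mean judge ratings. A minimal sketch with hypothetical ratings:

```python
# Hypothetical Angoff ratings: ratings[j][i] = judge j's estimated probability
# that a minimally competent candidate answers item i correctly.
ratings = [
    [0.6, 0.8, 0.5, 0.9],  # judge 1
    [0.7, 0.7, 0.4, 0.8],  # judge 2
    [0.5, 0.9, 0.6, 0.9],  # judge 3
]

n_judges = len(ratings)
n_items = len(ratings[0])

# Mean rating per item, summed across items -> minimum passing score (MPS)
item_means = [sum(r[i] for r in ratings) / n_judges for i in range(n_items)]
mps = sum(item_means)
print(round(mps, 2))  # → 2.77 (out of 4 items)
```

The two-round design the abstract mentions simply repeats this collection after judges see normative information (e.g., item p-values or each other's ratings) and recomputes the MPS from the second-round ratings.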
Peer reviewed
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2005
In an Angoff standard-setting procedure, judges estimate the probability that a hypothetical randomly selected minimally competent candidate will answer correctly each item constituting the test. In many cases, these item performance estimates are made twice, with information shared with the judges between estimates. Especially for long tests,…
Descriptors: Test Items, Probability, Standard Setting (Scoring)
Peer reviewed
Hurtz, Gregory M.; Hertz, Norman R. – Educational and Psychological Measurement, 1999
Evaluated Angoff ratings from eight different occupational licensing examinations through generalizability theory to estimate the optimal number of raters. Results indicate that approximately 10 to 15 raters constitutes an optimal target range. (SLD)
Descriptors: Cutting Scores, Evaluators, Generalizability Theory, Interrater Reliability
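The optimal-rater question studied in the entry above is typically answered with a decision (D) study: projecting how the dependability of the mean rating grows as raters are added, since averaging over n raters shrinks rater-related error variance by 1/n. A sketch with hypothetical variance components (not the study's estimates):

```python
# Hypothetical G-study variance components for Angoff ratings:
var_item = 0.030   # systematic variance among "true" item ratings
var_error = 0.045  # rater-by-item interaction + residual error (1 rater)

def dependability(n_raters):
    # D-study projection: averaging over n raters divides error variance by n
    return var_item / (var_item + var_error / n_raters)

for n in (1, 5, 10, 15, 20):
    print(n, round(dependability(n), 3))
```

The curve flattens quickly, which is the usual rationale for a target range like 10 to 15 raters: beyond that point each additional rater buys very little dependability.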
Peer reviewed
Cahan, Sorel; Cohen, Nora – Educational and Psychological Measurement, 1990
A solution is offered to problems associated with the inequality in the manipulability of probabilities of classification errors of masters versus nonmasters, based on competency test results. Eschewing the typical arbitrary establishment of observed-score standards below 100 percent, the solution incorporates a self-correction of wrong answers.…
Descriptors: Classification, Error of Measurement, Mastery Tests, Minimum Competency Testing
Peer reviewed
Andrew, Barbara J.; Hecht, James T. – Educational and Psychological Measurement, 1976
Results suggest that different groups of judges do set similar examination standards when using the same procedure, and that the average of individual judgments does not differ significantly from group consensus judgments. Significant differences were found, however, between the standards set by the two procedures employed. (RC)
Descriptors: Comparative Analysis, Cutting Scores, Multiple Choice Tests, Pass Fail Grading
Peer reviewed
Engelhard, George, Jr.; Stone, Gregory E. – Educational and Psychological Measurement, 1998
A new approach based on Rasch measurement theory is described for examining the quality of ratings from standard-setting judges. Ratings of nine judges for 213 items on a nursing examination show that judges vary in their views of the essential items for nursing certification, with statistically significant variability in the judged essentiality…
Descriptors: Certification, Evaluation Methods, Item Response Theory, Judges