ERIC - Search Results

Showing all 14 results

Examining the Psychometric Impact of Targeted and Random Double-Scoring in Mixed-Format Assessments

Peer reviewed

Yangmeng Xu; Stefanie A. Wind – Educational Measurement: Issues and Practice, 2025

Double-scoring constructed-response items is a common but costly practice in mixed-format assessments. This study explored the impacts of Targeted Double-Scoring (TDS) and random double-scoring procedures on the quality of psychometric outcomes, including student achievement estimates, person fit, and student classifications under various…

Descriptors: Academic Achievement, Psychometrics, Scoring, Evaluation Methods

Setting and Validating Multiple Standards on a Multistage-Adaptive Test

Peer reviewed

Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022

Setting cut scores on multistage-adaptive tests (MSTs) is difficult, particularly when the test spans several grade levels, and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…

Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis

A Problem with the Bookmark Procedure's Correction for Guessing

Peer reviewed

Baldwin, Peter – Educational Measurement: Issues and Practice, 2021

In the Bookmark standard-setting procedure, panelists are instructed to consider what examinees know rather than what they might attain by guessing; however, because examinees sometimes do guess, the procedure includes a correction for guessing. Like other corrections for guessing, the Bookmark's correction assumes that examinees either know the…

Descriptors: Guessing (Tests), Student Evaluation, Evaluation Methods, Standard Setting (Scoring)

The Choice of Response Probability in Bookmark Standard Setting: An Experimental Study

Peer reviewed

Baldwin, Peter; Margolis, Melissa J.; Clauser, Brian E.; Mee, Janet; Winward, Marcia – Educational Measurement: Issues and Practice, 2020

Evidence of the internal consistency of standard-setting judgments is a critical part of the validity argument for tests used to make classification decisions. The bookmark standard-setting procedure is a popular approach to establishing performance standards, but there is relatively little research that reflects on the internal consistency of the…

Descriptors: Standard Setting (Scoring), Probability, Cutting Scores, Evaluation Methods

A Critical Look into the Beuk Standard-Setting Method

Peer reviewed

Wyse, Adam E. – Educational Measurement: Issues and Practice, 2020

One commonly used compromise standard-setting method is the Beuk (1984) method. A key assumption of the Beuk method is that the emphasis given to the pass rate and the percent correct ratings should be proportional to the extent that the panelists agree on their ratings. However, whether the slope of the Beuk line reflects the emphasis that panelists…

Descriptors: Standard Setting (Scoring), Cutting Scores, Weighted Scores, Evaluation Methods

It's Not Just Angoff: Misperceptions of Hard and Easy Items in Bookmark-Type Ratings

Peer reviewed

Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020

A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, a limited amount of research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…

Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items

Assessment for Learning with Diverse Learners in a Digital World

Peer reviewed

DiCerbo, Kristen – Educational Measurement: Issues and Practice, 2020

We have the ability to capture data from students' interactions with digital environments as they engage in learning activity. This provides the potential for a reimagining of assessment to one in which assessment becomes part of our natural education activity and can be used to support learning. These new data allow us to more closely examine the…

Descriptors: Student Diversity, Information Technology, Learning Activities, Learning Processes

The Value of Choice: An Experiment Using Multiple-Choice Tests

Peer reviewed

Aray, Henry; Pedauga, Luis – Educational Measurement: Issues and Practice, 2019

This article presents a novel experimental methodology in which groups of students were offered the option to choose between two equivalent scoring rules to assess a multiple-choice test. The effect of choosing the scoring rule on marks is tested. Two major contributions arise from this research. First, it contributes to the literature on the…

Descriptors: Multiple Choice Tests, Scoring, Student Attitudes, Decision Making

An Investigation of Undefined Cut Scores with the Hofstee Standard-Setting Method

Peer reviewed

Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2017

This article provides an overview of the Hofstee standard-setting method and illustrates several situations where the Hofstee method will produce undefined cut scores. The situations where the cut scores will be undefined involve cases where the line segment derived from the Hofstee ratings does not intersect the score distribution curve based on…

Descriptors: Cutting Scores, Evaluation Methods, Standard Setting (Scoring), Comparative Analysis

Comparability in Balanced Assessment Systems for State Accountability

Peer reviewed

Evans, Carla M.; Lyons, Susan – Educational Measurement: Issues and Practice, 2017

The purpose of this study was to test methods that strengthen the comparability claims about annual determinations of student proficiency in English language arts, math, and science (Grades 3-12) in the New Hampshire Performance Assessment of Competency Education (NH PACE) pilot project. First, we examined the literature in order to define…

Descriptors: Academic Achievement, Language Arts, Mathematics Achievement, Science Achievement

A Critical Review of Some Qualitative Research Methods Used to Explore Rater Cognition

Peer reviewed

Suto, Irenka – Educational Measurement: Issues and Practice, 2012

Internationally, many assessment systems rely predominantly on human raters to score examinations. Arguably, this facilitates the assessment of multiple sophisticated educational constructs, strengthening assessment validity. It can introduce subjectivity into the scoring process, however, engendering threats to accuracy. The present objectives…

Descriptors: Evaluation Methods, Scoring, Qualitative Research, Protocol Analysis

Selection of Judges for Standard-Setting

Peer reviewed

Jaeger, Richard M. – Educational Measurement: Issues and Practice, 1991

Issues concerning the selection of judges for standard setting are discussed. Determining the consistency of judges' recommendations, or their congruity with other expert recommendations, would help in selection. Enough judges must be chosen to allow estimation of recommendations by an entire population of judges. (SLD)

Descriptors: Cutting Scores, Evaluation Methods, Evaluators, Examiners

A Conceptual Framework for a Psychometric Theory for Standard Setting with Examples of Its Use for Evaluating the Functioning of Two Standard Setting Methods

Peer reviewed

Reckase, Mark D. – Educational Measurement: Issues and Practice, 2006

A conceptual framework is proposed for a psychometric theory of standard setting. The framework suggests that participants in a standard setting process (panelists) develop an internal, intended standard as a result of training and the participant's background. The goal of a standard setting process is to convert panelists' intended standards to…

Descriptors: Psychometrics, Standard Setting, Evaluation Criteria, Item Response Theory

Classroom Standard Setting and Grading Practices

Peer reviewed

Terwilliger, James S. – Educational Measurement: Issues and Practice, 1989

The process of assigning grades to students is analyzed, and a specific approach to grading is recommended that distinguishes between minimal and developmental objectives. Criterion-referenced and norm-referenced concepts are used in the approach, which is best suited for secondary school or college. (SLD)

Descriptors: Classroom Techniques, College Faculty, Criterion Referenced Tests, Educational Objectives
