ERIC - Search Results

Showing 1 to 15 of 25 results

Evaluating Panelists' Understanding of Standard Setting Data

Peer reviewed

Baron, Patricia; Sireci, Stephen G.; Slater, Sharon C. – Educational Measurement: Issues and Practice, 2021

Since the No Child Left Behind Act (No Child Left Behind [NCLB], 2001) was enacted, the Bookmark method has been used in many state standard setting studies (Karantonis and Sireci; Zieky, Perie, and Livingston). The purpose of the current study is to evaluate the criticism that when panelists are presented with data during the Bookmark standard…

Descriptors: State Standards, Standard Setting, Evaluators, Training

An Application of Text Embeddings to Support Alignment of Educational Content Standards

Peer reviewed

Butterfuss, Reese; Doran, Harold – Educational Measurement: Issues and Practice, 2025

Large language models are increasingly used in educational and psychological measurement activities. Their rapidly evolving sophistication and ability to detect language semantics make them viable tools to supplement subject matter experts and their reviews of large amounts of text statements, such as educational content standards. This paper…

Descriptors: Alignment (Education), Academic Standards, Content Analysis, Concept Mapping

Applying a Mixture Rasch Model-Based Approach to Standard Setting

Peer reviewed

Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023

The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…

Descriptors: Item Response Theory, Standard Setting, Testing, Sampling

Setting and Validating Multiple Standards on a Multistage-Adaptive Test

Peer reviewed

Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022

Setting cut scores on multistage-adaptive tests (MSTs) is difficult, particularly when the test spans several grade levels, and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…

Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis

A Problem with the Bookmark Procedure's Correction for Guessing

Peer reviewed

Baldwin, Peter – Educational Measurement: Issues and Practice, 2021

In the Bookmark standard-setting procedure, panelists are instructed to consider what examinees know rather than what they might attain by guessing; however, because examinees sometimes do guess, the procedure includes a correction for guessing. Like other corrections for guessing, the Bookmark's correction assumes that examinees either know the…

Descriptors: Guessing (Tests), Student Evaluation, Evaluation Methods, Standard Setting (Scoring)

The Choice of Response Probability in Bookmark Standard Setting: An Experimental Study

Peer reviewed

Baldwin, Peter; Margolis, Melissa J.; Clauser, Brian E.; Mee, Janet; Winward, Marcia – Educational Measurement: Issues and Practice, 2020

Evidence of the internal consistency of standard-setting judgments is a critical part of the validity argument for tests used to make classification decisions. The bookmark standard-setting procedure is a popular approach to establishing performance standards, but there is relatively little research that reflects on the internal consistency of the…

Descriptors: Standard Setting (Scoring), Probability, Cutting Scores, Evaluation Methods

A Critical Look into the Beuk Standard-Setting Method

Peer reviewed

Wyse, Adam E. – Educational Measurement: Issues and Practice, 2020

One commonly used compromise standard-setting method is the Beuk (1984) method. A key assumption of the Beuk method is that the emphasis given to the pass rate and the percent correct ratings should be proportional to the extent that the panelists agree on their ratings. However, whether the slope of the Beuk line reflects the emphasis that panelists…

Descriptors: Standard Setting (Scoring), Cutting Scores, Weighted Scores, Evaluation Methods

It's Not Just Angoff: Misperceptions of Hard and Easy Items in Bookmark-Type Ratings

Peer reviewed

Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020

A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, a limited amount of research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…

Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items

Digital Module 07: Subscores--Evaluation and Reporting https://ncme.elevate.commpartners.com

Peer reviewed

Sinharay, Sandip – Educational Measurement: Issues and Practice, 2019

Test score users often demand the reporting of subscores due to their potential diagnostic, remedial, and instructional benefits. Therefore, there is substantial pressure on testing programs to report subscores. However, professional standards require that subscores have to satisfy minimum quality standards before they can be reported. In this…

Descriptors: Testing, Scores, Item Response Theory, Evaluation Methods

Embedded Standard Setting: Aligning Standard-Setting Methodology with Contemporary Assessment Design Principles

Peer reviewed

Lewis, Daniel; Cook, Robert – Educational Measurement: Issues and Practice, 2020

In this paper we assert that the practice of principled assessment design renders traditional standard-setting methodology redundant at best and contradictory at worst. We describe the rationale for, and methodological details of, Embedded Standard Setting (ESS; previously, Engineered Cut Scores; Lewis, 2016), an approach to establish performance…

Descriptors: Standard Setting, Evaluation, Cutting Scores, Performance Based Assessment

Evolving Educational Testing to Meet Students' Needs: Design-in-Real-Time Assessment

Peer reviewed

Sireci, Stephen G.; Suárez-Álvarez, Javier; Zenisky, April L.; Oliveri, Maria Elena – Educational Measurement: Issues and Practice, 2024

The goal in personalized assessment is to best fit the needs of each individual test taker, given the assessment purposes. Design-in-Real-Time (DIRTy) assessment reflects the progressive evolution in testing from a single test, to an adaptive test, to an adaptive assessment "system." In this article, we lay the foundation for DIRTy…

Descriptors: Educational Assessment, Student Needs, Test Format, Test Construction

Designing Knowledge-in-Use Assessments to Promote Deeper Learning

Peer reviewed

Harris, Christopher J.; Krajcik, Joseph S.; Pellegrino, James W.; DeBarger, Angela Haydel – Educational Measurement: Issues and Practice, 2019

Contemporary views on learning highlight that deep learning occurs not simply by accumulating knowledge, but by using and applying knowledge as one engages in disciplinary activity. Increasingly, those concerned with education policy and practice are shifting priorities toward supporting deeper learning by emphasizing the importance of students'…

Descriptors: Measurement, Learning Processes, Standards, Science Education

An Investigation of Undefined Cut Scores with the Hofstee Standard-Setting Method

Peer reviewed

Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2017

This article provides an overview of the Hofstee standard-setting method and illustrates several situations where the Hofstee method will produce undefined cut scores. The situations where the cut scores will be undefined involve cases where the line segment derived from the Hofstee ratings does not intersect the score distribution curve based on…

Descriptors: Cutting Scores, Evaluation Methods, Standard Setting (Scoring), Comparative Analysis

Comparability in Balanced Assessment Systems for State Accountability

Peer reviewed

Evans, Carla M.; Lyons, Susan – Educational Measurement: Issues and Practice, 2017

The purpose of this study was to test methods that strengthen the comparability claims about annual determinations of student proficiency in English language arts, math, and science (Grades 3-12) in the New Hampshire Performance Assessment of Competency Education (NH PACE) pilot project. First, we examined the literature in order to define…

Descriptors: Academic Achievement, Language Arts, Mathematics Achievement, Science Achievement

Links for Academic Learning (LAL): A Conceptual Model for Investigating Alignment of Alternate Assessments Based on Alternate Achievement Standards

Peer reviewed

Flowers, Claudia; Wakeman, Shawnee; Browder, Diane M.; Karvonen, Meagan – Educational Measurement: Issues and Practice, 2009

This article describes an alignment procedure, called Links for Academic Learning (LAL), for examining the degree of alignment of alternate assessments based on alternate achievement standards (AA-AAS) to grade-level content standards and instruction. Although some of the alignment criteria are similar to those used in general education…

Descriptors: Academic Achievement, Criteria, Alignment (Education), Alternative Assessment
