ERIC - Search Results

Showing all 15 results

Comparing Cut Scores from the Angoff Method and Two Variations of the Hofstee and Beuk Methods

Peer reviewed

Wyse, Adam E. – Applied Measurement in Education, 2020

This article compares cut scores from two variations of the Hofstee and Beuk methods, which determine cut scores by resolving inconsistencies in panelists' judgments about cut scores and pass rates, with the Angoff method. The first variation uses responses to the Hofstee and Beuk percentage correct and pass rate questions to calculate cut scores.…

Descriptors: Cutting Scores, Evaluation Methods, Standard Setting (Scoring), Equations (Mathematics)

Is Teacher Value Added a Matter of Scale? The Practical Consequences of Treating an Ordinal Scale as Interval for Estimation of Teacher Effects

Peer reviewed

Soland, James – Applied Measurement in Education, 2017

Research shows that assuming a test scale is equal-interval can be problematic, especially when the assessment is being used to achieve a policy aim like evaluating growth over time. However, little research considers whether teacher value added is sensitive to the underlying test scale, and in particular whether treating an ordinal scale as…

Descriptors: Intervals, Value Added Models, Teacher Evaluation, Teacher Effectiveness

The Rating and Matching Item-Objective Alignment Methods

Peer reviewed

D'Agostino, Jerome V.; Welsh, Megan E.; Cimetta, Adriana D.; Falco, Lia D.; Smith, Shannon; VanWinkle, Waverely Hester; Powers, Sonya J. – Applied Measurement in Education, 2008

Central to the standards-based assessment validation process is an examination of the alignment between state standards and test items. Several alignment analysis systems have emerged recently, but most rely on either traditional rating or matching techniques. Few, if any, analyses have been reported on the degree of consistency between the two…

Descriptors: Test Items, Student Evaluation, State Standards, Evaluation Methods

Evaluation of the Standard Setting on the 2005 Grade 12 National Assessment of Educational Progress Mathematics Test

Peer reviewed

Sireci, Stephen G.; Hauger, Jeffrey B.; Wells, Craig S.; Shea, Christine; Zenisky, April L. – Applied Measurement in Education, 2009

The National Assessment Governing Board used a new method to set achievement level standards on the 2005 Grade 12 NAEP Math test. In this article, we summarize our independent evaluation of the process used to set these standards. The evaluation data included observations of the standard-setting meeting, observations of advisory committee meetings…

Descriptors: Advisory Committees, Mathematics Tests, Standard Setting, National Competency Tests

Something Old, Something New, Something Borrowed, a Lot to Do.

Peer reviewed

Berk, Ronald A. – Applied Measurement in Education, 1995

A brief summary of standard setting knowledge is presented, derived from about 20 methods that utilize a judgmental review process, the approach most relevant to the standard-setting strategies proposed in this special issue. Criteria for judging effectiveness and critiques of the methods discussed in the issue are offered. (SLD)

Descriptors: Criteria, Decision Making, Educational History, Evaluation Methods

Using an Extended Angoff Procedure to Set Standards on Complex Performance Assessments.

Peer reviewed

Hambleton, Ronald K.; Plake, Barbara S. – Applied Measurement in Education, 1995

Several extensions to the Angoff method of standard setting are described that can accommodate characteristics of performance-based assessment. A study involving 12 panelists supported the effectiveness of the new approach but suggested that panelists preferred an approach that was at least partially conjunctive. (SLD)

Descriptors: Educational Assessment, Evaluation Methods, Evaluators, Interrater Reliability

Comments on Methods of Setting Standards for Complex Performance Tasks.

Peer reviewed

Mills, Craig N. – Applied Measurement in Education, 1995

The articles of this special issue propose two methods of deriving an initial standard and one method for determining the extent to which the standard should include compensation. Much work remains to be done on further development of the methods and the larger issues of policy regarding performance assessment. (SLD)

Descriptors: Decision Making, Educational Policy, Evaluation Methods, Evaluators

Setting Performance Standards through Two-Stage Judgmental Policy Capturing.

Peer reviewed

Jaeger, Richard M. – Applied Measurement in Education, 1995

A performance-standard setting procedure termed judgmental policy capturing (JPC) and its application are described. A study involving 12 panelists demonstrated the feasibility of the JPC method for setting performance standards for classroom teachers seeking certification from the National Board for Professional Teaching Standards. (SLD)

Descriptors: Decision Making, Educational Assessment, Evaluation Methods, Evaluators

An Integration and Reprise: What We Think We Have Learned.

Peer reviewed

Plake, Barbara S. – Applied Measurement in Education, 1995

The three standard-setting approaches described in this special issue are summarized and contrasted: (1) judgmental policy capturing; (2) the extended Angoff method; and (3) the dominant profile method. An integrative summary of findings is followed by recommendations for modifying the methods. (SLD)

Descriptors: Decision Making, Elementary Secondary Education, Evaluation Methods, Evaluators

Do National and State Assessments Converge for Educational Accountability? A Meta-Analytic Synthesis of Multiple Measures in Maine and Kentucky

Peer reviewed

Lee, Jaekyung – Applied Measurement in Education, 2007

Given the policy imperative of using multiple measures for state education accountability under the No Child Left Behind Act (NCLB), this study examines similarities and discrepancies between the National Assessment of Educational Progress (NAEP) and the states' own math assessment results in Kentucky and Maine, with a focus on 3 major academic…

Descriptors: Academic Achievement, Federal Legislation, Achievement Gains, National Competency Tests

A Multi-Stage Dominant Profile Method for Setting Standards on Complex Performance Assessments.

Peer reviewed

Putnam, Sarah E.; And Others – Applied Measurement in Education, 1995

Development of a multistage dominant profile method for setting standards on complex performance assessments is detailed. The method grew from experiences with a judgmental policy-capturing procedure and an extended Angoff method. The design of an early adolescence English language arts assessment illustrates the complexity of decisions panelists…

Descriptors: Adolescents, Decision Making, Elementary Secondary Education, Evaluation Methods

The Performance Domain and the Structure of the Decision Space.

Peer reviewed

Plake, Barbara S. – Applied Measurement in Education, 1995

This article provides a framework for the rest of the articles in this special issue comparing the utility of three standard-setting methods with complex performance assessments. The context of the standard setting study is described, and the methods are outlined. (SLD)

Descriptors: Comparative Analysis, Criteria, Decision Making, Educational Assessment

Vertically Moderated Standards: Background, Assumptions, and Practices

Peer reviewed

Huynh, Huynh; Schneider, Christina – Applied Measurement in Education, 2005

Developmental (vertical) scales are often constructed for subject areas such as reading and mathematics that are taught continuously in elementary schools. In other subjects such as science, and across a wider grade span, such scales are hard to justify. For tracking student progress and school accountability (including the No Child Left Behind…

Descriptors: Federal Legislation, National Competency Tests, Accountability, Academic Achievement

Student Test Score Reports and Interpretive Guides: Review of Current Practices and Suggestions for Future Research

Peer reviewed

Goodman, Dean P.; Hambleton, Ronald K. – Applied Measurement in Education, 2004

A critical, but often neglected, component of any large-scale assessment program is the reporting of test results. In the past decade, a body of evidence has been compiled that raises concerns over the ways in which these results are reported to and understood by their intended audiences. In this study, current approaches for reporting…

Descriptors: Test Results, Student Evaluation, Scores, Testing Programs

Assessing Content Representativeness of Performance Assessment Exercises.

Peer reviewed

Crocker, Linda – Applied Measurement in Education, 1997

The experience of the National Board for Professional Teaching Standards illustrates how issues of assessing the content representativeness of performance assessment can be addressed to ensure validity for certification procedures. Explores the challenges of collecting validation evidence when expert judgments of content are used. (SLD)

Descriptors: Content Validity, Credentials, Data Collection, Evaluation Methods
