ERIC - Search Results

Publication Date

In 2026	0
Since 2025	1
Since 2022 (last 5 years)	1
Since 2017 (last 10 years)	3
Since 2007 (last 20 years)	9

Descriptor

Evaluation Methods	35
Testing Problems	35
Scoring	25
Elementary Secondary Education	10
Test Reliability	8
Standard Setting (Scoring)	7
Student Evaluation	7
Achievement Tests	6
Educational Testing	6
Higher Education	6
Standardized Tests	6
Test Interpretation	6
Educational Assessment	5
Interrater Reliability	5
Multiple Choice Tests	5
Testing Programs	5
Writing Evaluation	5
Comparative Analysis	4
Evaluation Criteria	4
Performance Based Assessment	4
Response Style (Tests)	4
Scoring Formulas	4
Test Construction	4
Test Items	4
Test Use	4

Source

College Teaching	2
Computers & Education	2
Educational Measurement:…	2
Journal of Educational…	2
Canadian Journal of Program…	1
Change: The Magazine of…	1
Educational Policy Analysis…	1
Educational Psychology Review	1
English Teaching Forum	1
Evaluation in Education:…	1
Instructional Science	1
MEXTESOL Journal	1
Psychology in the Schools	1

Publication Type

Reports - Research	18
Journal Articles	17
Reports - Evaluative	10
Speeches/Meeting Papers	8
Opinion Papers	7
Guides - Non-Classroom	3
Reports - Descriptive	3
Tests/Questionnaires	2
Books	1
Information Analyses	1
Reference Materials -…	1

Education Level

Higher Education	2
Postsecondary Education	2
Elementary Secondary Education	1
Secondary Education	1

Audience

Practitioners	2
Teachers	2
Researchers	1

Location

California (Los Angeles)	1
Canada	1
Mexico	1
United Kingdom (Scotland)	1

Laws, Policies, & Programs

Elementary and Secondary…	2
Education for All Handicapped…	1

Assessments and Surveys

Advanced Placement…	3
National Assessment of…	1
Wechsler Adult Intelligence…	1
Woodcock Johnson Tests of…	1

Showing 1 to 15 of 35 results

Four Sobering Realities about Teaching Evaluation and Strategies to Address Them

Peer reviewed

Tobiason, Glory; Lavine, Adrienne – Change: The Magazine of Higher Learning, 2025

Current methods for evaluating faculty teaching fall short, and one way to address this is through campus-wide initiatives that focus on change at the level of academic units. The complex context of higher education makes meaningful teaching evaluation difficult; in particular, four sobering realities of this context must be taken into account in…

Descriptors: Teacher Evaluation, Evaluation Methods, Testing Problems, Educational Change

It's Not Just Angoff: Misperceptions of Hard and Easy Items in Bookmark-Type Ratings

Peer reviewed

Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020

A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, a limited amount of research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…

Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items

Reconsidering the Assessment Policy: Practical Use of Liberal Multiple-Choice Tests (SAC Method)

Peer reviewed

Cesur, Kursat – Educational Policy Analysis and Strategic Research, 2019

Examinees' performances are assessed using a wide variety of techniques. Multiple-choice (MC) tests are among the most frequently used. Nearly all standardized achievement tests make use of MC test items, and there is a variety of ways to score these tests. The study compares number right and liberal scoring (SAC) methods. Mixed…

Descriptors: Multiple Choice Tests, Scoring, Evaluation Methods, Guessing (Tests)

Assessment of Critical-Analytic Thinking

Peer reviewed

Brown, Nathaniel J.; Afflerbach, Peter P.; Croninger, Robert G. – Educational Psychology Review, 2014

National policy and standards documents, including the National Assessment of Educational Progress frameworks, the "Common Core State Standards" and the "Next Generation Science Standards," assert the need to assess critical-analytic thinking (CAT) across subject areas. However, assessment of CAT poses several challenges for…

Descriptors: Critical Thinking, Thinking Skills, National Standards, National Competency Tests

Comparison of Oral Examination and Electronic Examination Using Paired Multiple-Choice Questions

Peer reviewed

Ventouras, Errikos; Triantis, Dimos; Tsiakas, Panagiotis; Stergiopoulos, Charalampos – Computers & Education, 2011

The aim of the present research was to compare the use of multiple-choice questions (MCQs) as an examination method against the oral examination (OE) method. MCQs are widely used and their importance seems likely to grow, due to their inherent suitability for electronic assessment. However, MCQs are influenced by the tendency of examinees to guess…

Descriptors: Grades (Scholastic), Scoring, Multiple Choice Tests, Test Format

Twenty Common Testing Mistakes for EFL Teachers to Avoid

Henning, Grant – English Teaching Forum, 2012

To some extent, good testing procedure, like good language use, can be achieved through avoidance of errors. Almost any language-instruction program requires the preparation and administration of tests, and it is only to the extent that certain common testing mistakes have been avoided that such tests can be said to be worthwhile selection,…

Descriptors: Testing, English (Second Language), Testing Problems, Student Evaluation

Judges' Use of Examinee Performance Data in an Angoff Standard-Setting Exercise for a Medical Licensing Examination: An Experimental Study

Peer reviewed

Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009

Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…

Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel

Comparison of Examination Methods Based on Multiple-Choice Questions and Constructed-Response Questions Using Personal Computers

Peer reviewed

Ventouras, Errikos; Triantis, Dimos; Tsiakas, Panagiotis; Stergiopoulos, Charalampos – Computers & Education, 2010

The aim of the present research was to compare the use of multiple-choice questions (MCQs) as an examination method with examination based on constructed-response questions (CRQs). Although MCQs have an advantage concerning objectivity in the grading process and speed in production of results, they also introduce an error in the final…

Descriptors: Computer Assisted Instruction, Scoring, Grading, Comparative Analysis

Monitoring Rater Performance over Time: A Framework for Detecting Differential Accuracy and Differential Scale Category Use

Peer reviewed

Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009

In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…

Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)

Selection of Judges for Standard-Setting.

Peer reviewed

Jaeger, Richard M. – Educational Measurement: Issues and Practice, 1991

Issues concerning the selection of judges for standard setting are discussed. Determining the consistency of judges' recommendations, or their congruity with other expert recommendations, would help in selection. Enough judges must be chosen to allow estimation of recommendations by an entire population of judges. (SLD)

Descriptors: Cutting Scores, Evaluation Methods, Evaluators, Examiners

The Use and Effect of Caution Indices in Detecting Aberrant Patterns of Standard-Setting Recommendations.

Jaeger, Richard M.; Busch, John Christian – 1986

This study explores the use of the modified caution index (MCI) for identifying judges whose patterns of recommendations suggest that their judgments might be based on incomplete information, flawed reasoning, or inattention to their standard-setting tasks. It also examines the effect on test standards and passing rates when the test standards of…

Descriptors: Criterion Referenced Tests, Error of Measurement, Evaluation Methods, High Schools

The Visual Motor Integration Test: High Interjudge Reliability, High Potential For Diagnostic Error.

Peer reviewed

Snyder, Peggy P.; And Others – Psychology in the Schools, 1981

Investigated scoring agreement among three different training levels of Visual Motor Integration Test (VMI) diagnosticians. Correlational data demonstrated high interexaminer reliabilities; however, there were gross errors in precision after raw scores had been converted into VMI age equivalent scores. (Author/RC)

Descriptors: Educational Diagnosis, Evaluation Methods, Grade Equivalent Scores, Motor Development

Guidelines for the Management of Performance Assessments in Large-Scale Assessment Programs.

Roeber, Edward D. – 1996

This paper is based on guidelines developed in 1989 for training workshops for state and local educators to demonstrate the processes by which performance assessments could be created, validated, and used in statewide assessment programs. These guidelines are based on work with the National Assessment of Educational Progress and several statewide…

Descriptors: Evaluation Methods, Performance Based Assessment, Sampling, Scoring

Issues in Standard Setting: Some Comments, Some Suggestions, and Maybe Even a Few Answers.

Livingston, Samuel A. – 1983

Discussed are nine questions regarding standard setting issues in educational testing: (1) Should normative or content-referenced standards be used? (2) Different standard setting methods yield different results. Does this finding present a problem? (3) Assess the adequacy of the grounding of various methods of standard setting in psychological…

Descriptors: Educational Testing, Evaluation, Evaluation Methods, Measurement Objectives

Essentials of WJ III[TM] Tests of Achievement Assessment. Essentials of Psychological Assessment Series.

Mather, Nancy; Wendling, Barbara J.; Woodcock, Richard W. – 2001

The widely used Woodcock Johnson (WJ) Test of Achievement has been separated into two distinct tests, Achievement and Cognitive Abilities. This book is designed to help busy mental health professionals acquire the knowledge and skills they need to use the third revision of the WJ Tests of Achievement (WJ III ACH), including administration,…

Descriptors: Academic Achievement, Achievement Tests, Adults, Children

Author

Jaeger, Richard M.	2
Stergiopoulos, Charalampos	2
Triantis, Dimos	2
Tsiakas, Panagiotis	2
Ventouras, Errikos	2
Adrienne Lavine	1
Afflerbach, Peter P.	1
Allison, Howard K., II	1
Babcock, Ben	1
Baldwin, Su G.	1
Bhaskar, R.	1
Brown, Nathaniel J.	1
Busch, John Christian	1
Cesur, Kursat	1
Clauser, Brian E.	1
Cole, Nancy S.	1
Crehan, Kevin	1
Crews, William E., Jr.	1
Croninger, Robert G.	1
Dillard, Jesse F.	1
Dillon, Gerard F.	1
Fagan, Barbara M.	1
Ferrara, Steven	1
Glory Tobiason	1