Tobiason, Glory; Lavine, Adrienne – Change: The Magazine of Higher Learning, 2025
Current methods for evaluating faculty teaching fall short, and one way to address this is through campus-wide initiatives that focus on change at the level of academic units. The complex context of higher education makes meaningful teaching evaluation difficult; in particular, four sobering realities of this context must be taken into account in…
Descriptors: Teacher Evaluation, Evaluation Methods, Testing Problems, Educational Change
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020
A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, a limited amount of research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…
Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items
Cesur, Kursat – Educational Policy Analysis and Strategic Research, 2019
Examinees' performances are assessed using a wide variety of techniques. Multiple-choice (MC) tests are among the most frequently used. Nearly all standardized achievement tests make use of MC test items, and there is a variety of ways to score these tests. The study compares number right and liberal scoring (SAC) methods. Mixed…
Descriptors: Multiple Choice Tests, Scoring, Evaluation Methods, Guessing (Tests)
Brown, Nathaniel J.; Afflerbach, Peter P.; Croninger, Robert G. – Educational Psychology Review, 2014
National policy and standards documents, including the National Assessment of Educational Progress frameworks, the "Common Core State Standards" and the "Next Generation Science Standards," assert the need to assess critical-analytic thinking (CAT) across subject areas. However, assessment of CAT poses several challenges for…
Descriptors: Critical Thinking, Thinking Skills, National Standards, National Competency Tests
Ventouras, Errikos; Triantis, Dimos; Tsiakas, Panagiotis; Stergiopoulos, Charalampos – Computers & Education, 2011
The aim of the present research was to compare the use of multiple-choice questions (MCQs) as an examination method against the oral examination (OE) method. MCQs are widely used and their importance seems likely to grow, due to their inherent suitability for electronic assessment. However, MCQs are influenced by the tendency of examinees to guess…
Descriptors: Grades (Scholastic), Scoring, Multiple Choice Tests, Test Format
Henning, Grant – English Teaching Forum, 2012
To some extent, good testing procedure, like good language use, can be achieved through avoidance of errors. Almost any language-instruction program requires the preparation and administration of tests, and it is only to the extent that certain common testing mistakes have been avoided that such tests can be said to be worthwhile selection,…
Descriptors: Testing, English (Second Language), Testing Problems, Student Evaluation
Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009
Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…
Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel
Ventouras, Errikos; Triantis, Dimos; Tsiakas, Panagiotis; Stergiopoulos, Charalampos – Computers & Education, 2010
The aim of the present research was to compare the use of multiple-choice questions (MCQs) as an examination method with examination based on constructed-response questions (CRQs). Although MCQs have an advantage concerning objectivity in the grading process and speed in production of results, they also introduce an error in the final…
Descriptors: Computer Assisted Instruction, Scoring, Grading, Comparative Analysis
Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009
In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…
Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)
Peer reviewed: Jaeger, Richard M. – Educational Measurement: Issues and Practice, 1991
Issues concerning the selection of judges for standard setting are discussed. Determining the consistency of judges' recommendations, or their congruity with other expert recommendations, would help in selection. Enough judges must be chosen to allow estimation of recommendations by an entire population of judges. (SLD)
Descriptors: Cutting Scores, Evaluation Methods, Evaluators, Examiners
Jaeger, Richard M.; Busch, John Christian – 1986
This study explores the use of the modified caution index (MCI) for identifying judges whose patterns of recommendations suggest that their judgments might be based on incomplete information, flawed reasoning, or inattention to their standard-setting tasks. It also examines the effect on test standards and passing rates when the test standards of…
Descriptors: Criterion Referenced Tests, Error of Measurement, Evaluation Methods, High Schools
The Visual Motor Integration Test: High Interjudge Reliability, High Potential For Diagnostic Error.
Peer reviewed: Snyder, Peggy P.; And Others – Psychology in the Schools, 1981
Investigated scoring agreement among three different training levels of Visual Motor Integration Test (VMI) diagnosticians. Correlational data demonstrated high interexaminer reliabilities; however, there were gross errors in precision after raw scores had been converted into VMI age equivalent scores. (Author/RC)
Descriptors: Educational Diagnosis, Evaluation Methods, Grade Equivalent Scores, Motor Development
Roeber, Edward D. – 1996
This paper is based on guidelines developed in 1989 for training workshops for state and local educators to demonstrate the processes by which performance assessments could be created, validated, and used in statewide assessment programs. These guidelines are based on work with the National Assessment of Educational Progress and several statewide…
Descriptors: Evaluation Methods, Performance Based Assessment, Sampling, Scoring
Livingston, Samuel A. – 1983
Discussed are nine questions regarding standard setting issues in educational testing: (1) Should normative or content-referenced standards be used? (2) Different standard setting methods yield different results. Does this finding present a problem? (3) Assess the adequacy of the grounding of various methods of standard setting in psychological…
Descriptors: Educational Testing, Evaluation, Evaluation Methods, Measurement Objectives
Mather, Nancy; Wendling, Barbara J.; Woodcock, Richard W. – 2001
The widely used Woodcock Johnson (WJ) Test of Achievement has been separated into two distinct tests, Achievement and Cognitive Abilities. This book is designed to help busy mental health professionals acquire the knowledge and skills they need to use the third revision of the WJ Tests of Achievement (WJ III ACH), including administration,…
Descriptors: Academic Achievement, Achievement Tests, Adults, Children

