Yuichiro Yokouchi – Language Testing in Asia, 2025
The performance decision tree (PDT; Fulcher et al., 2011) is a rubric style that is applicable to performance assessment, with origins in Upshur and Turner's (1995) empirically derived binary-choice, boundary-definition (EBB) scale. It is easier for raters to assess performance by evaluating multiple binary-choice descriptors. Additionally,…
Descriptors: Scoring Rubrics, Second Language Learning, Second Language Instruction, Language Teachers
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Wind, Stefanie A. – Language Testing, 2023
Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting…
Descriptors: Evaluators, Decision Making, Student Characteristics, Performance Based Assessment
Peabody, Michael R.; Wind, Stefanie A. – Journal of Educational Measurement, 2019
Setting performance standards is a judgmental process involving human opinions and values as well as technical and empirical considerations. Although all cut score decisions are by nature somewhat arbitrary, they should not be capricious. Judges selected for standard-setting panels should have the proper qualifications to make the judgments asked…
Descriptors: Standard Setting, Decision Making, Performance Based Assessment, Evaluators
Eskin, Daniel – Studies in Applied Linguistics & TESOL, 2022
For agencies that deliver high-stakes Second Language (L2) proficiency exams, a research agenda has been undertaken for years to examine the role of rater, task, and rubric as sources of variability into their performance assessments (Lee, 2006; Sawaki & Sinharay, 2013; Xi, 2007; Xi & Mollaun, 2006). However, these challenges are more…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Student Placement
Glennie, Miriam; O'Donnell, Michael; Brown, Michelle; Benson, John – Research Evaluation, 2019
Evaluators play a central role in assessments of researchers' performance for reward, but the nature of their role and influence is not well understood. Ongoing reliance on evaluator judgement is typically justified as a need for referees in contests for reward, because quantitative performance measures alone can be subject to distortion. Yet, if…
Descriptors: Research Universities, Role, Evaluators, Focus Groups
Tan, Chin Pei; Howes, Dora; Tan, Rendell K. W.; Dancza, Karina M. – Assessment & Evaluation in Higher Education, 2022
Interactive oral assessments demonstrate potential to develop graduate attributes such as critical thinking, professional communication and collaborative skills in students through authentic simulation of workplace scenarios. This study captured the design, delivery and evaluation of interactive oral assessments across three programmes --…
Descriptors: Oral Language, Interaction, Critical Thinking, Communication Skills
Yeates, Peter; O'Neill, Paul; Mann, Karen; Eva, Kevin – Advances in Health Sciences Education, 2013
Assessors' scores in performance assessments are known to be highly variable. Attempted improvements through training or rating format have achieved minimal gains. The mechanisms that contribute to variability in assessors' scoring remain unclear. This study investigated these mechanisms. We used a qualitative approach to study…
Descriptors: Performance Based Assessment, Scores, Evaluators, Scoring
Kozaki, Yoko – Language Assessment Quarterly, 2010
This article describes an alternative approach to setting standards for performance assessments. The procedure was designed for use in low-budget, relatively low-stakes contexts where it is not possible to bring expert judges together. The procedure that allowed participant judges to work individually throughout the process was an effort to…
Descriptors: Performance Based Assessment, Standard Setting, Decision Making, Certification
Hsieh, Ching-Ni – ProQuest LLC, 2011
Second language (L2) oral performance assessment always involves raters' subjective judgments and is thus subject to rater variability. The variability due to rater characteristics has important consequential impacts on decision-making processes, particularly in high-stakes testing situations (Bachman, Lynch, & Mason, 1995; A. Brown, 1995;…
Descriptors: Undergraduate Students, Phonology, Teaching Assistants, Foreign Students

Lunz, Mary E.; And Others – Educational and Psychological Measurement, 1994
In a study involving eight judges, analysis with the FACETS model provides evidence that judges grade differently, whether or not scores correlate well. This outcome suggests that adjustments for differences among judges should be made before student measures are estimated to produce reproducible decisions. (SLD)
Descriptors: Correlation, Decision Making, Evaluation Methods, Evaluators

Berk, Ronald A. – Applied Measurement in Education, 1995
A brief summary of standard setting knowledge is presented, derived from about 20 methods that utilize a judgmental review process, the approach most relevant to the standard-setting strategies proposed in this special issue. Criteria for judging effectiveness and critiques of the methods discussed in the issue are offered. (SLD)
Descriptors: Criteria, Decision Making, Educational History, Evaluation Methods

Mills, Craig N. – Applied Measurement in Education, 1995
The articles of this special issue propose two methods of deriving an initial standard and one method for determining the extent to which the standard should include compensation. Much work remains to be done on further development of the methods and the larger issues of policy regarding performance assessment. (SLD)
Descriptors: Decision Making, Educational Policy, Evaluation Methods, Evaluators

Jaeger, Richard M. – Applied Measurement in Education, 1995
A performance-standard setting procedure termed judgmental policy capturing (JPC) and its application are described. A study involving 12 panelists demonstrated the feasibility of the JPC method for setting performance standards for classroom teachers seeking certification from the National Board for Professional Teaching Standards. (SLD)
Descriptors: Decision Making, Educational Assessment, Evaluation Methods, Evaluators

Plake, Barbara S. – Applied Measurement in Education, 1995
The three standard-setting approaches described in this special issue are summarized and contrasted: (1) judgmental policy capturing; (2) the extended Angoff method; and (3) the dominant profile method. An integrative summary of findings is followed by recommendations for modifying the methods. (SLD)
Descriptors: Decision Making, Elementary Secondary Education, Evaluation Methods, Evaluators