Jessica Thomas – ProQuest LLC, 2024
The purpose of this quantitative correlational study was to determine if, and to what extent, a relationship existed between teachers' perceptions of their evaluator's instructional leadership and the teachers' evaluation scores at the middle and high school levels in one Florida school district. Ozcan's theory of teacher motivation states that…
Descriptors: Secondary School Teachers, Teacher Attitudes, Teacher Evaluation, Evaluators
Mascadri, Julia; Spina, Nerida; Spooner-Lane, Rebecca; Briant, Elizabeth – Assessment & Evaluation in Higher Education, 2023
Australia has recently implemented Teaching Performance Assessments (TPAs) as a national accreditation requirement to assess final year preservice teachers' classroom readiness. In 2019, an Australian university developed a TPA to meet this requirement, comprising three written components and one oral component. This exploratory study investigated…
Descriptors: Foreign Countries, Evaluators, Oral Language, Performance Based Assessment
Alexander Rushforth; Sarah De Rijcke – Research Evaluation, 2024
Recent times have seen the growth in the number and scope of interacting professional reform movements in science, centered on themes such as open research, research integrity, responsible research assessment, and responsible metrics. The responsible metrics movement identifies the growing influence of quantitative performance indicators as a…
Descriptors: College Faculty, Teacher Selection, Faculty Promotion, Tenure
Jin, Kuan-Yu; Eckes, Thomas – Measurement: Interdisciplinary Research and Perspectives, 2022
Recent research on rater effects in performance assessments has increasingly focused on rater centrality, the tendency to assign scores clustering around the rating scale's middle categories. In the present paper, we adopted Jin and Wang's (2018) extended facets modeling approach and constructed a centrality continuum, ranging from raters…
Descriptors: Performance Based Assessment, Evaluators, Scoring, Sample Size
Wind, Stefanie A. – Measurement: Interdisciplinary Research and Perspectives, 2022
In many performance assessments, one or two raters from the complete rater pool score each performance, resulting in a sparse rating design, where there are limited observations of each rater relative to the complete sample of students. Although sparse rating designs can be constructed to facilitate estimation of student achievement, the…
Descriptors: Evaluators, Bias, Identification, Performance Based Assessment
Wind, Stefanie A. – Educational Measurement: Issues and Practice, 2020
Researchers have documented the impact of rater effects, or raters' tendencies to give different ratings than would be expected given examinee achievement levels, in performance assessments. However, the degree to which rater effects influence person fit, or the reasonableness of test-takers' achievement estimates given their response patterns,…
Descriptors: Performance Based Assessment, Evaluators, Achievement, Influences
Wong, Wai Yee Amy; Thistlethwaite, Jill; Moni, Karen; Roberts, Chris – Advances in Health Sciences Education, 2023
Examiners' judgements play a critical role in competency-based assessments such as objective structured clinical examinations (OSCEs). The standardised nature of OSCEs and their alignment with regulatory accountability assure their wide use as high-stakes assessment in medical education. Research into examiner behaviours has predominantly explored…
Descriptors: Sociocultural Patterns, Evaluators, Performance Based Assessment, Accountability
Klusmann, Dietrich; Knorr, Mirjana; Hampe, Wolfgang – Advances in Health Sciences Education, 2023
The phenomenon of first impression is well researched in social psychology, but less so in the study of OSCEs and the multiple mini interview (MMI). To explore its bearing on the MMI method, we included a rating of first impression in the MMI for student selection executed in 2012 at the University Medical Center Hamburg-Eppendorf, Germany (196…
Descriptors: Foreign Countries, Medical Students, Admission Criteria, Interviews
Huang, Jing; Chen, Gaowei – AERA Online Paper Repository, 2019
This research investigates the effects of rater experience on performance ratings in language testing using a systematic review of studies published from 1985 to 2017. Based on a comprehensive literature search of 14 databases, we identified sixteen relevant papers. With these we conducted a narrative review to conceptualize a theoretical…
Descriptors: Language Tests, Experience, Evaluators, Performance Based Assessment
Tanaka, Mitsuko; Ross, Steven J. – Assessment in Education: Principles, Policy & Practice, 2023
Raters vary from each other in their severity and leniency in rating performance. This study examined the factors affecting rater severity in peer assessments of oral presentations in English as a Foreign Language (EFL), focusing on peer raters' self-construal and presentation abilities. Japanese university students enrolled in EFL classes…
Descriptors: Evaluators, Interrater Reliability, Item Response Theory, Peer Evaluation
Yuichiro Yokouchi – Language Testing in Asia, 2025
The performance decision tree (PDT; Fulcher et al., 2011) is a rubric style that is applicable to performance assessment, with origins in Upshur and Turner's (1995) empirically derived binary-choice, boundary-definition (EBB) scale. It is easier for raters to assess performance by evaluating multiple binary-choice descriptors. Additionally,…
Descriptors: Scoring Rubrics, Second Language Learning, Second Language Instruction, Language Teachers
Wind, Stefanie A.; Guo, Wenjing – Educational and Psychological Measurement, 2019
Rater effects, or raters' tendencies to assign ratings to performances that are different from the ratings that the performances warranted, are well documented in rater-mediated assessments across a variety of disciplines. In many real-data studies of rater effects, researchers have reported that raters exhibit more than one effect, such as a…
Descriptors: Evaluators, Bias, Scoring, Data Collection
Edwards, Chad; Edwards, Autumn; Albrehi, Fatima; Spence, Patric – Communication Education, 2021
Extending previous research on the Computers Are Social Actors paradigm and the human-to-human interaction script, this study examines the interpersonal impressions of a social robot evaluator versus a human evaluator in a performance evaluation context. A between-subjects experiment was conducted to measure participants' impressions of the…
Descriptors: Robotics, Man Machine Systems, Performance Based Assessment, Task Analysis
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Kevin Ward – ProQuest LLC, 2022
The study established the validity and reliability of a weighted individual performance-based assessment tool within the utility scope of middle school orchestral strings. The following research questions guided this study: 1. What specific string-playing behaviors and corresponding criteria validate a weighted individual performance-based…
Descriptors: Music Education, Musical Instruments, Psychometrics, Music