ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	2
Since 2006 (last 20 years)	2

Descriptor

Evaluators	3
Performance Based Assessment	3
Item Response Theory	2
Scoring	2
Accuracy	1
Bias	1
Correlation	1
Data Collection	1
Decision Making	1
Evaluation Methods	1
Interrater Reliability	1
Judges	1
Maximum Likelihood Statistics	1
Models	1
Rating Scales	1
Writing Evaluation	1
Writing Tests	1
More ▼

Source

Educational and Psychological…

Author

Engelhard, George, Jr.	1
Guo, Wenjing	1
Lunz, Mary E.	1
Wang, Jue	1
Wind, Stefanie A.	1
Wolfe, Edward W.	1

Publication Type

Journal Articles	3
Reports - Research	2
Reports - Evaluative	1

Education Level

Audience

Location

Laws, Policies, & Programs

Assessments and Surveys

What Works Clearinghouse Rating

Showing all 3 results Save | Export

Exploring the Combined Effects of Rater Misfit and Differential Rater Functioning in Performance Assessments

Peer reviewed

Direct link

Wind, Stefanie A.; Guo, Wenjing – Educational and Psychological Measurement, 2019

Rater effects, or raters' tendencies to assign ratings to performances that are different from the ratings that the performances warranted, are well documented in rater-mediated assessments across a variety of disciplines. In many real-data studies of rater effects, researchers have reported that raters exhibit more than one effect, such as a…

Descriptors: Evaluators, Bias, Scoring, Data Collection

Evaluating Rater Accuracy in Rater-Mediated Assessments Using an Unfolding Model

Peer reviewed

Direct link

Wang, Jue; Engelhard, George, Jr.; Wolfe, Edward W. – Educational and Psychological Measurement, 2016

The number of performance assessments continues to increase around the world, and it is important to explore new methods for evaluating the quality of ratings obtained from raters. This study describes an unfolding model for examining rater accuracy. Accuracy is defined as the difference between observed and expert ratings. Dichotomous accuracy…

Descriptors: Evaluators, Accuracy, Performance Based Assessment, Models

Interjudge Reliability and Decision Reproducibility.

Peer reviewed

Lunz, Mary E.; And Others – Educational and Psychological Measurement, 1994

In a study involving eight judges, analysis with the FACETS model provides evidence that judges grade differently, whether or not scores correlate well. This outcome suggests that adjustments for differences among judges should be made before student measures are estimated to produce reproducible decisions. (SLD)

Descriptors: Correlation, Decision Making, Evaluation Methods, Evaluators