Showing all 6 results
Peer reviewed
Morgan, Grant B.; Zhu, Min; Johnson, Robert L.; Hodge, Kari J. – Language Assessment Quarterly, 2014
Common estimators of interrater reliability include Pearson product-moment correlation coefficients, Spearman rank-order correlations, and the generalizability coefficient. The purpose of this study was to examine the accuracy of estimators of interrater reliability when varying the true reliability, number of scale categories, and number of…
Descriptors: Interrater Reliability, Correlation, Generalization, Scoring
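As a hedged illustration of two of the estimators named in this abstract (not the study's own code), the Python sketch below computes the Pearson and Spearman coefficients for a pair of raters; the score vectors are invented.

```python
# Minimal sketch of two common interrater reliability estimators.
# The rater scores below are invented for illustration only.
from scipy.stats import pearsonr, spearmanr

rater_a = [3, 4, 2, 5, 4, 3, 1, 4]
rater_b = [3, 5, 2, 4, 4, 3, 2, 4]

r, _ = pearsonr(rater_a, rater_b)     # product-moment correlation
rho, _ = spearmanr(rater_a, rater_b)  # rank-order correlation
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```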
Peer reviewed
Penny, Jim; Johnson, Robert L.; Gordon, Belita – Journal of Experimental Education, 2000
Used an analytic rubric to score 120 writing samples from Georgia's 11th-grade writing assessment. Raters augmented scores by appending a "+" or "-". Results indicate that this method of augmentation tends to improve most indices of interrater reliability, although the percentage of exact and adjacent agreement…
Descriptors: High School Students, High Schools, Interrater Reliability, Scoring Rubrics
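To make the augmentation scheme concrete, here is a minimal sketch assuming a ±1/3 numeric offset for "+" and "-" (the snippet does not give the study's exact coding) and invented scores.

```python
# Sketch of score augmentation and exact/adjacent agreement rates.
# The +/- offset of 1/3 and the sample scores are illustrative assumptions.

def augmented_to_numeric(score: str) -> float:
    """Convert an augmented rubric score such as '3+' or '4-' to a number."""
    if score.endswith("+"):
        return int(score[:-1]) + 1 / 3
    if score.endswith("-"):
        return int(score[:-1]) - 1 / 3
    return float(score)

def agreement_rates(a, b, adjacent_within=1.0):
    """Proportion of exact agreement and of exact-plus-adjacent agreement."""
    exact = sum(x == y for x, y in zip(a, b)) / len(a)
    adjacent = sum(abs(x - y) <= adjacent_within for x, y in zip(a, b)) / len(a)
    return exact, adjacent

rater_a = [augmented_to_numeric(s) for s in ["3+", "4", "2-", "4+"]]
rater_b = [augmented_to_numeric(s) for s in ["4-", "4", "2", "4"]]
print(agreement_rates(rater_a, rater_b))  # -> (0.25, 1.0)
```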
Peer reviewed
Johnson, Robert L.; Penny, James; Gordon, Belita – Applied Measurement in Education, 2000
Studied four forms of score resolution used by testing agencies and investigated the effect that each has on the interrater reliability associated with the resulting operational scores. Results, based on 120 essays from the Georgia High School Writing Test, show some forms of resolution to be associated with higher reliability and some associated…
Descriptors: Essay Tests, High School Students, High Schools, Interrater Reliability
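The snippet does not name the four resolution forms studied, so the sketch below illustrates common forms from this literature (rater mean, parity with a third rating, expert replacement) under assumed definitions, not the study's operational ones.

```python
# Illustrative score-resolution methods for a discrepant rater pair.
# Method names and rules are assumed for illustration; they are not
# taken from the study itself.

def resolve_mean(r1, r2, third=None, expert=None):
    """Rater-mean resolution: average the two operational ratings."""
    return (r1 + r2) / 2

def resolve_parity(r1, r2, third, expert=None):
    """Parity resolution: a third rating is averaged with the nearer
    of the two original ratings."""
    nearer = r1 if abs(third - r1) <= abs(third - r2) else r2
    return (third + nearer) / 2

def resolve_expert(r1, r2, third=None, expert=None):
    """Expert resolution: an expert's score replaces both ratings."""
    return expert

r1, r2, third, expert = 2, 4, 3, 4
for method in (resolve_mean, resolve_parity, resolve_expert):
    print(method.__name__, method(r1, r2, third=third, expert=expert))
```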
Peer reviewed
Johnson, Robert L.; McDaniel, Fred, II; Willeke, Marjorie J. – American Journal of Evaluation, 2000
Studied the interrater reliability of a portfolio assessment used in a small-scale program evaluation. Investigated analytic, combined analytic, and holistic family literacy portfolios from an Even Start program. Results show that at least three raters are needed to obtain acceptable levels of reliability for holistic and individual analytic…
Descriptors: Family Literacy, Holistic Approach, Interrater Reliability, Portfolio Assessment
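The rater-count finding can be read against the Spearman-Brown prophecy formula, rho_k = k*rho_1 / (1 + (k - 1)*rho_1), which projects the reliability of an average across k raters; the single-rater reliability in the sketch below is an invented value, not one reported by the study.

```python
# Spearman-Brown projection of score reliability as raters are added.
# The single-rater reliability of 0.55 is an invented illustrative value.

def spearman_brown(single_rater_rel: float, n_raters: int) -> float:
    """Projected reliability of the average of n_raters parallel ratings."""
    return n_raters * single_rater_rel / (1 + (n_raters - 1) * single_rater_rel)

rho1 = 0.55
for k in range(1, 5):
    print(f"{k} rater(s): projected reliability = {spearman_brown(rho1, k):.3f}")
```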
Peer reviewed
Johnson, Robert L.; Penny, Jim; Fisher, Steve; Kuhs, Therese – Applied Measurement in Education, 2003
When raters assign different scores to a performance task, a method for resolving rating differences is required to report a single score to the examinee. Recent studies indicate that decisions about examinees, such as pass/fail decisions, differ across resolution methods. Previous studies also investigated the interrater reliability of…
Descriptors: Test Reliability, Test Validity, Scores, Interrater Reliability
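A minimal sketch of the abstract's point that decisions can differ across resolution methods: with an invented cut score and ratings, rater-mean resolution fails an examinee whom expert resolution passes.

```python
# Sketch: the same discrepant ratings can yield different pass/fail
# decisions under different resolution methods. All values are invented.

CUT_SCORE = 3.0

def decide(resolved_score: float) -> str:
    return "pass" if resolved_score >= CUT_SCORE else "fail"

r1, r2, expert = 2, 3, 3       # discrepant pair plus an expert reading

mean_score = (r1 + r2) / 2     # rater-mean resolution -> 2.5
expert_score = expert          # expert resolution     -> 3

print("rater mean:", mean_score, decide(mean_score))      # fail
print("expert:    ", expert_score, decide(expert_score))  # pass
```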
Peer reviewed
Johnson, Robert L.; Penny, James; Gordon, Belita; Shumate, Steven R.; Fisher, Steven P. – Language Assessment Quarterly, 2005
Many studies have indicated that at least 2 raters should score writing assessments to improve interrater reliability. However, even for assessments that characteristically demonstrate high levels of rater agreement, 2 raters of the same essay can occasionally report different, or discrepant, scores. If a single score, typically referred to as an…
Descriptors: Interrater Reliability, Scores, Evaluation, Reliability
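The discrepancy screening implied here can be sketched as a step that flags essays for adjudication; the threshold of more than 1 score point is a common convention assumed for illustration, and the score pairs are invented.

```python
# Sketch: flag essays whose two ratings are discrepant (differ by more
# than an adjacency threshold) so a single reported score can be resolved.
# The 1-point threshold and the score pairs are illustrative assumptions.

def flag_discrepant(score_pairs, threshold=1):
    """Return indices of essays whose ratings differ by more than threshold."""
    return [i for i, (a, b) in enumerate(score_pairs) if abs(a - b) > threshold]

pairs = [(4, 4), (3, 5), (2, 3), (1, 4)]
print(flag_discrepant(pairs))  # -> [1, 3]: essays needing resolution
```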