Showing all 11 results
Peer reviewed
Brogan L. Barr; Virginia V. W. McIntosh; Eileen F. Britt; Jennifer Jordan; Janet D. Carter – Measurement: Interdisciplinary Research and Perspectives, 2024
Even when raters demonstrate agreement in the use of a measure, limited score variability or violation of often-ignored statistical assumptions can result in lower reliability estimates than intuitively expected. This article uses data drawn from two randomized controlled trials of schema therapy and cognitive behavioral therapy for the treatment…
Descriptors: Evaluators, Interrater Reliability, Reliability, Measurement Techniques
Peer reviewed
Zhang, Xiuyuan – AERA Online Paper Repository, 2019
The main purpose of the study is to evaluate the qualities of human essay ratings for a large-scale assessment using Rasch measurement theory. Specifically, Many-Facet Rasch Measurement (MFRM) was utilized to examine the rating scale category structure and provide important information about interpretations of ratings in the large-scale…
Descriptors: Essays, Evaluators, Writing Evaluation, Reliability
Peer reviewed
Cohen, Matthew L.; Tulsky, David S.; Boulton, Aaron J.; Kisala, Pamela A.; Bertisch, Hilary; Yeates, Keith Owen; Zonfrillo, Mark R.; Durbin, Dennis R.; Jaffe, Kenneth M.; Temkin, Nancy; Wang, Jin; Rivara, Frederick P. – Journal of Speech, Language, and Hearing Research, 2019
Purpose: The purpose of this study was to evaluate the internal consistency and construct validity of the Traumatic Brain Injury Quality of Life Communication Item Bank (TBI-QOL COM) short form as a parent-proxy report measure. The TBI-QOL COM is a patient-reported outcome measure of functional communication originally developed as a self-report…
Descriptors: Brain, Head Injuries, Quality of Life, Pediatrics
Peer reviewed
Hushman, Glenn; Hushman, Carolyn; Carbonneau, Kira – Physical Educator, 2015
The current educational reform movement in the United States is focused on measuring the effectiveness of teachers. One component of teacher effectiveness is student achievement. The effectiveness of using PE Metrics as a measure of student achievement in a physical activity setting with a low socioeconomic, culturally diverse population was…
Descriptors: Educational Change, Physical Education, Teacher Effectiveness, Physical Activities
Peer reviewed
Riordan, Julie; Shakman, Karen; Chang, Quincy; Lacireno-Paquet, Natalie; Bocala, Candice – Regional Educational Laboratory Northeast & Islands, 2015
This "Stated Briefly" report is a companion piece that summarizes the results of another report of the same name. REL Northeast and Islands, in collaboration with the Northeast Educator Effectiveness Research Alliance and the New Hampshire Department of Education, conducted a study of the implementation of new teacher evaluation systems…
Descriptors: Teacher Evaluation, Evaluation Methods, Standards, School Districts
Peer reviewed
Riordan, Julie; Lacireno-Paquet, Natalie; Shakman, Karen; Bocala, Candice; Chang, Quincy – Regional Educational Laboratory Northeast & Islands, 2015
REL Northeast and Islands, in collaboration with the Northeast Educator Effectiveness Research Alliance and the New Hampshire Department of Education, conducted a study of the implementation of new teacher evaluation systems in New Hampshire's School Improvement Grant (SIG) schools. While the basic system features are similar across district…
Descriptors: Teacher Evaluation, Evaluation Methods, Standards, School Districts
Linacre, John M. – 1993
Generalizability theory (G-theory) and many-facet Rasch measurement (Rasch) manage the variability inherent when raters rate examinees on test items. The purpose of G-theory is to estimate test reliability in a raw score metric. Unadjusted examinee raw scores are reported as measures. A variance component is estimated for the examinee…
Descriptors: Comparative Analysis, Equations (Mathematics), Estimation (Mathematics), Evaluators
Kaplan, Bruce A.; Johnson, Eugene G. – 1992
Across the field of educational assessment the case has been made for alternatives to the multiple-choice item type. Most of the alternative types of items require a subjective evaluation by a rater. The reliability of this subjective rating is a key component of these types of alternative items. In this paper, measures of reliability are…
Descriptors: Educational Assessment, Elementary Secondary Education, Estimation (Mathematics), Evaluators
General Accounting Office, Washington, DC. Program Evaluation and Methodology Div. – 1993
In September 1991, the National Assessment Governing Board (NAGB) announced standards for basic, proficient, and advanced achievement in mathematics and reported that few American students had reached these standards. Expert reviewers noted technical problems with the NAGB approach and questioned its results. In this report, the NAGB…
Descriptors: Academic Achievement, Academic Standards, Educational Policy, Elementary Secondary Education
University of South Florida, Tampa. – 1986
The Teacher Evaluation and Assessment Center (TEAC) was established by the Department of Education at the University of South Florida in 1984 to serve the state in the certification of trainers and observers of the Florida Performance Measurement System (FPMS) and to score and report performance evaluations for special programs. This report…
Descriptors: Beginning Teachers, Certification, Classroom Observation Techniques, Elementary Secondary Education
Shavelson, Richard J.; And Others – 1993
In this paper, performance assessments are cast within a sampling framework. A performance assessment score is viewed as a sample of student performance drawn from a complex universe defined by a combination of all possible tasks, occasions, raters, and measurement methods. Using generalizability theory, the authors present evidence bearing on the…
Descriptors: Academic Achievement, Educational Assessment, Error of Measurement, Evaluators