McLaughlin, Tara W.; Snyder, Patricia A.; Algina, James – Grantee Submission, 2017
The Learning Target Rating Scale (LTRS) is a measure designed to evaluate the quality of teacher-developed learning targets for embedded instruction for early learning. In the present study, we examined the measurement dependability of LTRS scores by conducting a generalizability study (G-study). We used a partially nested, three-facet model to…
Descriptors: Generalizability Theory, Scores, Rating Scales, Evaluation Methods
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
Zhang, Bo; Xiao, Yunnan; Luo, Juan – Language Testing in Asia, 2015
Previous studies comparing holistic scoring to analytic scoring of second language writing have given mixed results. Some of them suffer from methodological drawbacks, such as limited writing sample size, limited number of raters, and lack of direct comparison of the two methods. Based on 300 writing samples graded by 14 raters, this research…
Descriptors: Evaluators, Reliability, Scores, Holistic Approach
Attali, Yigal – Educational and Psychological Measurement, 2014
This article presents a comparative judgment approach for holistically scored constructed response tasks. In this approach, the grader rank orders (rather than rates) the quality of a small set of responses. A prior automated evaluation of responses guides both set formation and scaling of rankings. Sets are formed to have similar prior scores and…
Descriptors: Responses, Item Response Theory, Scores, Rating Scales
Hathcoat, John D.; Penn, Jeremy D. – Research & Practice in Assessment, 2012
Critics of standardized testing have recommended replacing standardized tests with more authentic assessment measures, such as classroom assignments, projects, or portfolios rated by a panel of raters using common rubrics. Little research has examined the consistency of scores across multiple authentic assignments or the implications of this…
Descriptors: Generalizability Theory, Performance Based Assessment, Writing Across the Curriculum, Standardized Tests
Volpe, Robert J.; McConaughy, Stephanie H.; Hintze, John M. – School Psychology Review, 2009
The present study used generalizability theory to investigate the dependability of systematic observations of students' problem behavior and on-task behavior in classrooms. The Direct Observation Form (McConaughy & Achenbach, 2009) was used with a sample of 24 children, aged 6 to 11 years, attending 18 different elementary schools. The participants…
Descriptors: Generalizability Theory, Behavior Problems, Student Behavior, Evaluation Methods
Bergeron, Renee; Floyd, Randy G.; McCormack, Allison C.; Farmer, William L. – School Psychology Review, 2008
The dependability of externalizing behavior composites and subscale scores from the Behavior Assessment System for Children, Second Edition, Teacher Rating Scale-Child (Reynolds & Kamphaus, 2004) and the Achenbach System of Empirically Based Assessment, Teacher's Report Form for Ages 6-18 (Achenbach & Rescorla, 2001) was investigated.…
Descriptors: Generalizability Theory, Scores, Rating Scales, Error of Measurement
Lee, Yong-Won; Kantor, Robert – International Journal of Testing, 2007
Possible integrated and independent tasks were pilot tested for the writing section of a new generation of the TOEFL® (Test of English as a Foreign Language™). This study examines the impact of various rating designs and of the number of tasks and raters on the reliability of writing scores based on integrated and independent tasks from the…
Descriptors: Generalizability Theory, Writing Tests, English (Second Language), Second Language Learning
Kretchmar, Jennifer – Journal of College Admission, 2006
Many colleges and universities receive thousands of applications for freshman admission every year. To facilitate the process of evaluating each and every applicant in a relatively short amount of time, schools often devise quantitative rating scales to summarize student characteristics. The ratings give readers a shorthand way to communicate the…
Descriptors: Generalizability Theory, Reliability, College Admission, College Applicants

Salter, Daniel W.; Forney, Deanna S.; Evans, Nancy J. – Measurement and Evaluation in Counseling and Development, 2005
In this study, two approaches are used to assess the stability of Myers-Briggs Type Indicator scores across 3 administrations (N = 231): longitudinal configural frequency analysis with categorical scores and generalizability theory with the Preference Clarity Indices and continuous scores. The results are generally positive. Evaluation of…
Descriptors: Psychology, Cognitive Style, Generalizability Theory, Personality Traits

Abedi, Jamal – Multivariate Behavioral Research, 1996
The Interrater/Test Reliability System (ITRS) is described. The ITRS is a comprehensive computer tool used to address questions of interrater reliability that computes several different indices of interrater reliability and the generalizability coefficient over raters and topics. The system is available in IBM compatible or Macintosh format. (SLD)
Descriptors: Computer Software, Computer Software Evaluation, Evaluation Methods, Evaluators