Comparison of the Results of the Generalizability Theory with the Inter-Rater Agreement Coefficients
Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022
The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…
Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory
Uzun, N. Bilge; Alici, Devrim; Aktas, Mehtap – European Journal of Educational Research, 2019
The purpose of this study is to examine the reliability of analytical rubrics and checklists developed for the assessment of story writing skills by means of generalizability theory. The study group consisted of 52 students attending the 5th grade at primary school and 20 raters at Mersin University. The G study was carried out with the fully crossed…
Descriptors: Foreign Countries, Scoring Rubrics, Check Lists, Writing Tests
Moser, Gary P.; Sudweeks, Richard R.; Morrison, Timothy G.; Wilcox, Brad – Reading Psychology, 2014
This study examined ratings of fourth graders' oral reading expression. Randomly assigned participants (n = 36) practiced repeated readings using narrative or informational passages for 7 weeks. After this period, raters used the "Multidimensional Fluency Scale" (MFS) on two separate occasions to rate students' expressive…
Descriptors: Elementary School Students, Oral Reading, Reading Skills, Suprasegmentals
Badjadi, Nour El Imane – Online Submission, 2013
The current paper on writing assessment surveys the literature on the reliability and validity of essay tests. The paper aims to examine the two concepts in relationship with essay testing as well as to provide a snapshot of the current understandings of the reliability and validity of essay tests as drawn in recent research studies. Bearing in…
Descriptors: Essay Tests, Writing Evaluation, Test Validity, Test Reliability
Gebril, Atta – Assessing Writing, 2010
Integrated tasks are currently employed in a number of L2 exams since they are perceived as an addition to the writing-only task type. Given this trend, the current study investigates composite score generalizability of both reading-to-write and writing-only tasks. For this purpose, a multivariate generalizability analysis is used to investigate…
Descriptors: Scoring, Scores, Second Language Instruction, Writing Evaluation
Crehan, Kevin D. – 1997
Writing fits well within the realm of outcomes suitable for observation by performance assessments. Studies of the reliability of performance assessments have suggested that interrater reliability can be consistently high. Scoring consistency, however, is only one aspect of quality in decisions based on assessment results. Another is…
Descriptors: Evaluation Methods, Feedback, Generalizability Theory, Interrater Reliability
Shale, Doug – 1986
This study is an attempt at a cohesive characterization of the concept of essay reliability. As such, it takes as a basic premise that previous and current practices in reporting reliability estimates for essay tests have certain shortcomings. The study provides an analysis of these shortcomings--partly to encourage a fuller understanding of the…
Descriptors: Analysis of Variance, Correlation, Error of Measurement, Essay Tests
Cronin, Linda L.; Capie, William – 1985
The purpose of this study was to compare the scoring of Teacher Performance Assessment Instruments (TPAI) indicators using discrete descriptors when some are considered "essential" with the scoring of these same indicators when no descriptors are considered essential. The two questions addressed in this study were: (1) To what…
Descriptors: Analysis of Variance, Behavior Rating Scales, Classroom Observation Techniques, Data Collection
Wolfe, Edward W. – 1996
Although portfolio assessment is becoming increasingly popular, it may not survive unless portfolio scoring can meet the demands of large-scale assessment standards. The results of studies of interrater reliability with large-scale portfolio assessments have been mixed. This paper reports the scoring results of a nationwide portfolio pilot in…
Descriptors: Decision Making, Generalizability Theory, Interrater Reliability, Language Arts
Shavelson, Richard J.; And Others – 1993
In this paper, performance assessments are cast within a sampling framework. A performance assessment score is viewed as a sample of student performance drawn from a complex universe defined by a combination of all possible tasks, occasions, raters, and measurement methods. Using generalizability theory, the authors present evidence bearing on the…
Descriptors: Academic Achievement, Educational Assessment, Error of Measurement, Evaluators
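Several of the abstracts above describe generalizability (G) studies in which every rater scores every examinee. As a rough illustration only, and not the procedure used in any of the listed studies, the sketch below estimates variance components for a one-facet, fully crossed persons x raters design and derives the relative (G) and absolute (Phi) coefficients. The function names, the sample score matrix, and the choice of a single rater facet are assumptions made for this example; the studies themselves may use additional facets such as tasks or occasions.

```python
import numpy as np


def g_study_p_x_r(scores):
    """Estimate variance components for a fully crossed persons x raters design.

    `scores` is a 2-D array-like: rows are persons, columns are raters.
    Hypothetical helper written for illustration only.
    """
    x = np.asarray(scores, dtype=float)
    n_p, n_r = x.shape
    grand = x.mean()

    # Sums of squares for persons, raters, and the residual (p x r interaction + error).
    ss_p = n_r * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_r = n_p * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_res = np.sum((x - grand) ** 2) - ss_p - ss_r

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Expected-mean-square solutions; negative estimates are truncated at zero.
    var_res = ms_res
    var_p = max((ms_p - ms_res) / n_r, 0.0)
    var_r = max((ms_r - ms_res) / n_p, 0.0)
    return var_p, var_r, var_res


def g_coefficients(var_p, var_r, var_res, n_raters):
    """Relative (G) and absolute (Phi) coefficients for a mean over n_raters."""
    rel_error = var_res / n_raters
    abs_error = (var_r + var_res) / n_raters
    g = var_p / (var_p + rel_error)
    phi = var_p / (var_p + abs_error)
    return g, phi


if __name__ == "__main__":
    # Made-up ratings: 3 persons scored by 2 raters.
    sample = [[1, 2], [3, 4], [5, 6]]
    components = g_study_p_x_r(sample)
    print(components, g_coefficients(*components, n_raters=2))
```

In this sketch the relative coefficient ignores the rater main effect, so a consistently harsher rater lowers Phi but not G. That distinction is one reason a G-theory result can differ from a simple inter-rater agreement index computed on the same ratings.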