Showing 1 to 15 of 120 results
Peer reviewed
Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022
The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…
Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory
Peer reviewed
Anthony, Christopher J.; Styck, Kara M.; Volpe, Robert J.; Robert, Christopher R. – School Psychology, 2023
Although originally conceived of as a marriage of direct behavioral observation and indirect behavior rating scales, recent research has indicated that Direct Behavior Ratings (DBRs) are affected by rater idiosyncrasies (rater effects) similar to other indirect forms of behavioral assessment. Most of this research has been conducted using…
Descriptors: Item Response Theory, Generalizability Theory, Interrater Reliability, Behavior Rating Scales
Peer reviewed
Johnson, Evelyn S.; Zheng, Yuzhu; Crawford, Angela R.; Moylan, Laura A. – Journal of Experimental Education, 2022
In this study, we examined the scoring and generalizability assumptions of an explicit instruction (EI) special education teacher observation protocol using many-faceted Rasch measurement (MFRM). Video observations of classroom instruction from 48 special education teachers across four states were collected. External raters (n = 20) were trained…
Descriptors: Direct Instruction, Teacher Education, Classroom Observation Techniques, Validity
Peer reviewed
Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020
Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…
Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries
Peer reviewed
Weston, Timothy J.; Hayward, Charles N.; Laursen, Sandra L. – American Journal of Evaluation, 2021
Observations are widely used in research and evaluation to characterize teaching and learning activities. Because conducting observations is typically resource intensive, it is important that inferences from observation data are made confidently. While attention focuses on interrater reliability, the reliability of a single-class measure over the…
Descriptors: Generalizability Theory, Observation, Inferences, Social Science Research
Peer reviewed
Wind, Stefanie A.; Jones, Eli – Educational Researcher, 2019
Teacher evaluation systems often include classroom observations in which raters use rating scales to evaluate teachers' effectiveness. Recently, researchers have promoted the use of multifaceted approaches to investigating reliability using Generalizability theory, instead of rater reliability statistics. Generalizability theory allows analysts to…
Descriptors: Teacher Evaluation, Observation, Generalizability Theory, Item Response Theory
Peer reviewed
Atilgan, Hakan; Demir, Elif Kübra; Ogretmen, Tuncay; Basokcu, Tahsin Oguz – International Journal of Progressive Education, 2020
It has become a critical question what the reliability level would be when open-ended questions are used in large-scale selection tests. One of the aims of the present study is to determine what the reliability would be in the event that the answers given by test-takers are scored by experts when open-ended short answer questions are used in…
Descriptors: Foreign Countries, Secondary School Students, Test Items, Test Reliability
Peer reviewed
Clain, Alex E.; Alkhuwaiter, Munirah; Davidson, Kate; Martin-Harris, Bonnie – Journal of Speech, Language, and Hearing Research, 2022
Purpose: The purpose of this study was to extend the assessment of the psychometric properties of the Modified Barium Swallow Impairment Profile (MBSImP). Here, we re-examined structural validity and internal consistency using a large clinical-registry data set and formally examined rater reliability in a smaller data set. Method: This study…
Descriptors: Diagnostic Tests, Disability Identification, Physical Disabilities, Eating Disorders
Peer reviewed
Atilgan, Hakan – Eurasian Journal of Educational Research, 2019
Purpose: This study intended to examine the generalizability and reliability of essay ratings within the scope of the generalizability (G) theory. Specifically, the effect of raters on the generalizability and reliability of students' essay ratings was examined. Furthermore, variations of the generalizability and reliability coefficients with…
Descriptors: Foreign Countries, Essay Tests, Test Reliability, Interrater Reliability
Peer reviewed
D'Agostino, Jerome V.; Rodgers, Emily; Winkler, Christa; Johnson, Tracy; Berenbon, Rebecca – Reading Psychology, 2021
Running Records provide a standardized method for recording and assessing students' oral reading behaviors and are excellent formative assessment tools to guide instructional decision-making. This study expands on prior Running Record reliability work by evaluating the extent to which external raters and teachers consistently assessed students'…
Descriptors: Accuracy, Oral Reading, Generalizability Theory, Error Correction
Johnson, Evelyn S.; Zheng, Yuzhu; Crawford, Angela R.; Moylan, Laura A. – Grantee Submission, 2020
In this study, we examined the scoring and generalizability assumptions of an Explicit Instruction (EI) special education teacher observation protocol using many-faceted Rasch measurement (MFRM). Video observations of classroom instruction from 48 special education teachers across four states were collected. External raters (n = 20) were trained…
Descriptors: Direct Instruction, Teacher Evaluation, Classroom Observation Techniques, Validity
Peer reviewed
Roduta Roberts, Mary; Alves, Cecilia Brito; Werther, Karin; Bahry, Louise M. – Journal of Psychoeducational Assessment, 2019
The purpose of this study was to examine the reliability and sources of score variation from a performance assessment of practice competencies within an occupational therapy program. Data from 99 students who participated in a practical exam were examined. A generalizability analysis of analytic, total, and overall holistic scores was completed…
Descriptors: Performance Based Assessment, Test Reliability, Scores, Occupational Therapy
Mantzicopoulos, Panayota; French, Brian F.; Patrick, Helen – Grantee Submission, 2018
Research Findings: We evaluated the score stability of the Mathematical Quality of Instruction (MQI), an observational measure of mathematics instruction. Three raters each scored, independently, 100 video-recorded lessons taught by 20 kindergarten teachers in the spring. Using generalizability theory analyses, we decomposed the MQI's score…
Descriptors: Kindergarten, Mathematics Instruction, Educational Quality, Classroom Observation Techniques
Peer reviewed
Uzun, N. Bilge; Alici, Devrim; Aktas, Mehtap – European Journal of Educational Research, 2019
The purpose of study is to examine the reliability of analytical rubrics and checklists developed for the assessment of story writing skills by means of generalizability theory. The study group consisted of 52 students attending the 5th grade at primary school and 20 raters in Mersin University. The G study was carried out with the fully crossed…
Descriptors: Foreign Countries, Scoring Rubrics, Check Lists, Writing Tests
Peer reviewed
Mantzicopoulos, Panayota; French, Brian F.; Patrick, Helen – Early Education and Development, 2018
Research Findings: We evaluated the score stability of the Mathematical Quality of Instruction (MQI), an observational measure of mathematics instruction. Three raters each scored, independently, 100 video-recorded lessons taught by 20 kindergarten teachers in the spring. Using generalizability theory analyses, we decomposed the MQI's score…
Descriptors: Kindergarten, Mathematics Instruction, Educational Quality, Classroom Observation Techniques