Showing 1 to 15 of 228 results
Peer reviewed
Matthew K. Burns; Heba Z. Abdelnaby; Jonie B. Welland; Katherine A. Graves; Kari Kurto – Assessment for Effective Intervention, 2024
The current study examined the reliability of The Reading League Curriculum-Evaluation Guidelines (CEGs), which were developed to help school-based teams rate the presence of red flags when considering adopting specific literacy curricula. Coders (n = 30) independently used the CEGs to evaluate a free online English language arts curriculum. The…
Descriptors: English Curriculum, English Instruction, Language Arts, Curriculum Evaluation
Peer reviewed
Jonas Flodén – British Educational Research Journal, 2025
This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…
Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring
Peer reviewed
Hulteen, Ryan M.; True, Larissa; Kroc, Edward – Measurement in Physical Education and Exercise Science, 2023
The typical process for assessing inter-rater reliability is facilitated by training raters within a research team. Lacking is an understanding if inter-rater reliability scores "between" research teams demonstrate adequate reliability. This study examined inter-rater reliability between 16 researchers who assessed fundamental motor…
Descriptors: Psychomotor Skills, Scores, Reliability, Interrater Reliability
Peer reviewed
John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring constructed response items (i.e. rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
Peer reviewed
PDF on ERIC
Wahyu Nanda Eka Saputra; Trikinasih Handayani; Prima Suci Rohmadheny; Rohmatus Naini; Dody Hartanto; Hardi Santosa; Dewi Afra Khairunnisa; Risma Risansyah; Hanan Riati; Faturrahman – Journal of Education and Learning (EduLearn), 2025
The students are urged to do something without expecting anything in return and only in the name of God. Every Islamic student becomes something ideal if they can internalize and implement sincerity. Many people are willing to do something because of an ulterior motive. The importance of sincerity in humans is the background for developing a…
Descriptors: Islam, Interrater Reliability, Prosocial Behavior, Muslims
Peer reviewed
Schmidt, Ellyn M.; Rothenberg, W. Andrew; Davidson, Bridget C.; Barnett, Miya; Jent, Jason; Cadenas, Heleny; Fernandez, Corina; Davis, Eileen – Journal of Behavioral Education, 2023
Measuring classroom behavior among young children is important to guide assessment and intervention decisions, yet there is limited literature on appropriate direct observation tools for this purpose. This article describes the psychometric properties of the Behavior Assessment System for Children, Student Observation System (BASC-3 SOS) with 135…
Descriptors: Young Children, Special Education, Child Behavior, Psychometrics
Peer reviewed
Toma, Radu Bogdan – Technology, Knowledge and Learning, 2023
The development of computational thinking skills is attracting attention worldwide. The use of visual or block-based coding in primary schools has gained momentum. Yet, students' acceptance of such coding environments has been neglected in the literature. This study presents a measurement instrument that will allow pursuing such an endeavor. The…
Descriptors: Computation, Thinking Skills, Coding, Measurement
Peer reviewed
Kaila L. Stipancic; Mojgan Golzy; Yunxin Zhao; Louise Pinkerton; Andrea Rohl; Mili Kuruvilla-Dugdale – Journal of Speech, Language, and Hearing Research, 2023
Purpose: Auditory training has been shown to reduce rater variability in perceptual voice assessment. Because rater variability is also a central issue in the auditory-perceptual assessment of dysarthria, this study sought to determine if training produces a meaningful change in rater reliability, criterion validity, and scaling magnitude of four…
Descriptors: Auditory Training, Auditory Perception, Program Effectiveness, Speech Impairments
Peer reviewed
PDF on ERIC
Shasha Chen; Shaohui Chi; Zuhao Wang – Journal of Baltic Science Education, 2025
Interdisciplinary thinking is critical for equipping students to apply scientific knowledge and tackle societal challenges across various disciplines, which has been recognized as a key objective of twenty-first century science education. However, research on effective interdisciplinary assessment in secondary school science education is still…
Descriptors: Thinking Skills, Interdisciplinary Approach, Science Instruction, Grade 7
Peer reviewed
Pin, Tamis W.; So, Vincent K. K.; Siu, Cynthia S. H.; Yip, Sheila S. N.; Cheung, Stella See-wing; Kan, Jenny Yim-mui – Journal of Autism and Developmental Disorders, 2021
To examine reliability and validity of the new Social Motor Function Classification System for Children with Autism Spectrum Disorders (SMFCS-ASD). The SMFCS-ASD reliability was examined on 25 children (62.4 months, SD 7.8) with ASD among six physical therapists. The validity study involved 1001 children (57.0 months, SD 9.9) with ASD using the…
Descriptors: Autism, Pervasive Developmental Disorders, Children, Classification
Peer reviewed
Sas, Marlies; Snaphaan, Thom; Pauwels, Lieven J. R.; Ponnet, Koen; Hardyns, Wim – Field Methods, 2023
This study focuses on the use of systematic social observations (SSO) to measure crime prevention through environmental design (CPTED) and disorder. To improve knowledge about measurement issues in small area research, SSO is conducted by means of three different methods: in-situ, photographs, and Google Street View (GSV) imagery. By evaluating…
Descriptors: Crime Prevention, Measurement Techniques, Photography, Observation
Peer reviewed
Leighton, Jacqueline P.; Lehman, Blair – Educational Measurement: Issues and Practice, 2020
In this digital ITEMS module, Dr. Jacqueline Leighton and Dr. Blair Lehman review differences between think-aloud interviews to measure problem-solving processes and cognitive labs to measure comprehension processes. Learners are introduced to historical, theoretical, and procedural differences between these methods and how to use and analyze…
Descriptors: Protocol Analysis, Interviews, Problem Solving, Cognitive Processes
Peer reviewed
Davidow, Jason H.; Ye, Jun; Edge, Robin L. – International Journal of Language & Communication Disorders, 2023
Background: Speech-language pathologists often multitask in order to be efficient with their commonly large caseloads. In stuttering assessment, multitasking often involves collecting multiple measures simultaneously. Aims: The present study sought to determine reliability when collecting multiple measures simultaneously versus individually.…
Descriptors: Graduate Students, Measurement, Reliability, Group Activities
Peer reviewed
PDF on ERIC
Zepeda, Sally J.; Jimenez, Albert M. – Journal of Educational Supervision, 2019
Using a newly created teacher evaluation instrument, Inter-rater Reliability (IRR) analyses were conducted on four teacher videos as a means to establish instrument reliability. Raters included 42 principals and assistant principals in a southern US school district. The videos used spanned the teacher quality spectrum and the IRR findings across…
Descriptors: Teacher Evaluation, Interrater Reliability, Classroom Observation Techniques, Validity
Peer reviewed
Brogan L. Barr; Virginia V. W. McIntosh; Eileen F. Britt; Jennifer Jordan; Janet D. Carter – Measurement: Interdisciplinary Research and Perspectives, 2024
Even when raters demonstrate agreement in the use of a measure, limited score variability or violation of often-ignored statistical assumptions can result in lower reliability estimates than intuitively expected. This article uses data drawn from two randomized controlled trials of schema therapy and cognitive behavioral therapy for the treatment…
Descriptors: Evaluators, Interrater Reliability, Reliability, Measurement Techniques