ERIC - Search Results

Publication Date

In 2026	0
Since 2025	5
Since 2022 (last 5 years)	23
Since 2017 (last 10 years)	57
Since 2007 (last 20 years)	187

Descriptor

Interrater Reliability	230
Reliability	230
Validity	90
Foreign Countries	53
Scores	48
Correlation	47
Measures (Individuals)	38
Statistical Analysis	34
Evaluation Methods	33
Comparative Analysis	27
Observation	27
Psychometrics	26
Rating Scales	23
Student Evaluation	23
Scoring Rubrics	22
Children	21
Evaluators	21
Measurement Techniques	20
Teaching Methods	19
Factor Analysis	18
Intervention	18
Academic Achievement	17
Scoring	17
College Students	16
Construct Validity	16

Publication Type

Journal Articles	194
Reports - Research	166
Reports - Evaluative	34
Speeches/Meeting Papers	13
Dissertations/Theses -…	12
Information Analyses	11
Tests/Questionnaires	9
Reports - Descriptive	8
Opinion Papers	6
Guides - Non-Classroom	2
Books	1
Non-Print Media	1
Numerical/Quantitative Data	1

Education Level

Higher Education	51
Postsecondary Education	40
Elementary Education	24
Secondary Education	15
Early Childhood Education	13
High Schools	8
Elementary Secondary Education	7
Kindergarten	7
Middle Schools	7
Primary Education	7
Junior High Schools	6
Preschool Education	6
Grade 2	4
Grade 1	3
Grade 3	3
Grade 4	3
Grade 6	3
Intermediate Grades	3
Grade 5	2
Grade 7	1
Grade 8	1
Two Year Colleges	1

Audience

Researchers	13
Practitioners	2
Administrators	1
Counselors	1
Policymakers	1

Location

Canada	7
Turkey	6
Australia	5
United States	5
Netherlands	4
Taiwan	4
California	3
China	3
Italy	3
Norway	3
Spain	3
Belgium	2
Finland	2
Indonesia	2
New York	2
North Carolina	2
Singapore	2
Thailand	2
United Kingdom	2
United Kingdom (England)	2
Argentina	1
Bahrain	1
Brazil	1
California (Berkeley)	1
China (Beijing)	1

Showing 1 to 15 of 230 results

How Consistent Are Humans When Grading Programming Assignments?

Peer reviewed

Marcus Messer; Neil C. C. Brown; Michael Kölling; Miaojing Shi – ACM Transactions on Computing Education, 2025

Providing consistent summative assessment to students is important, as the grades they are awarded affect their progression through university and future career prospects. While small cohorts are typically assessed by a single assessor, such as the module/class leader, larger cohorts are often assessed by multiple assessors, typically teaching…

Descriptors: Foreign Countries, Grading, Interrater Reliability, Teaching Assistants

Grading Exams Using Large Language Models: A Comparison between Human and AI Grading of Exams in Higher Education Using ChatGPT

Peer reviewed

Jonas Flodén – British Educational Research Journal, 2025

This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…

Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring

Reliability of Ratings of an English Language Arts Curriculum with the Curriculum Evaluation Guidelines

Peer reviewed

Matthew K. Burns; Heba Z. Abdelnaby; Jonie B. Welland; Katherine A. Graves; Kari Kurto – Assessment for Effective Intervention, 2024

The current study examined the reliability of The Reading League Curriculum-Evaluation Guidelines (CEGs), which were developed to help school-based teams rate the presence of red flags when considering adopting specific literacy curricula. Coders (n = 30) independently used the CEGs to evaluate a free online English language arts curriculum. The…

Descriptors: English Curriculum, English Instruction, Language Arts, Curriculum Evaluation

New Tests of Rater Drift in Trend Scoring

Peer reviewed

John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024

Trend scoring constructed response items (i.e. rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…

Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics

Trust the "Process"? When Fundamental Motor Skill Scores Are Reliably Unreliable

Peer reviewed

Hulteen, Ryan M.; True, Larissa; Kroc, Edward – Measurement in Physical Education and Exercise Science, 2023

The typical process for assessing inter-rater reliability is facilitated by training raters within a research team. Lacking is an understanding if inter-rater reliability scores "between" research teams demonstrate adequate reliability. This study examined inter-rater reliability between 16 researchers who assessed fundamental motor…

Descriptors: Psychomotor Skills, Scores, Reliability, Interrater Reliability

The Scale of Sincerity Based on Kyai Haji Ahmad Dahlan's Version for Islamic Students: The Rasch Analysis

Peer reviewed
PDF on ERIC

Wahyu Nanda Eka Saputra; Trikinasih Handayani; Prima Suci Rohmadheny; Rohmatus Naini; Dody Hartanto; Hardi Santosa; Dewi Afra Khairunnisa; Risma Risansyah; Hanan Riati; Faturrahman – Journal of Education and Learning (EduLearn), 2025

The students are urged to do something without expecting anything in return and only in the name of God. Every Islamic student becomes something ideal if they can internalize and implement sincerity. Many people are willing to do something because of an ulterior motive. The importance of sincerity in humans is the background for developing a…

Descriptors: Islam, Interrater Reliability, Prosocial Behavior, Muslims

Interdisciplinary Thinking among Seventh-Grade Students in Lower-Secondary Science Education

Peer reviewed
PDF on ERIC

Shasha Chen; Shaohui Chi; Zuhao Wang – Journal of Baltic Science Education, 2025

Interdisciplinary thinking is critical for equipping students to apply scientific knowledge and tackle societal challenges across various disciplines, which has been recognized as a key objective of twenty-first century science education. However, research on effective interdisciplinary assessment in secondary school science education is still…

Descriptors: Thinking Skills, Interdisciplinary Approach, Science Instruction, Grade 7

Validity and Reliability of Speech Data in the Norwegian Registry of Cleft Lip and Palate

Peer reviewed

Øydis Hide; Dagrun Slettebø Daltveit; Åse Sivertsen; Anne Katherine Hvistendahl; Randi Lovise Kjerstad; Marit Berntsen Kvinnsland; Nina Helen Pedersen; Christina Sørensen – International Journal of Language & Communication Disorders, 2025

Background: Cleft lip and palate (CLP) treatment in Norway is centralized and multidisciplinary, with long-term follow-up from birth to adulthood. The Norwegian Registry of Cleft Lip and Palate was established to ensure high-quality care and enable systematic data collection. Speech data are a key component, assessed by speech-language therapists…

Descriptors: Foreign Countries, Validity, Reliability, Data Collection

Psychometric Properties of the Behavior Assessment System for Children Student Observation System (BASC-3 SOS) with Young Children in Special Education

Peer reviewed

Schmidt, Ellyn M.; Rothenberg, W. Andrew; Davidson, Bridget C.; Barnett, Miya; Jent, Jason; Cadenas, Heleny; Fernandez, Corina; Davis, Eileen – Journal of Behavioral Education, 2023

Measuring classroom behavior among young children is important to guide assessment and intervention decisions, yet there is limited literature on appropriate direct observation tools for this purpose. This article describes the psychometric properties of the Behavior Assessment System for Children, Student Observation System (BASC-3 SOS) with 135…

Descriptors: Young Children, Special Education, Child Behavior, Psychometrics

Measuring Acceptance of Block-Based Coding Environments

Peer reviewed

Toma, Radu Bogdan – Technology, Knowledge and Learning, 2023

The development of computational thinking skills is attracting attention worldwide. The use of visual or block-based coding in primary schools has gained momentum. Yet, students' acceptance of such coding environments has been neglected in the literature. This study presents a measurement instrument that will allow pursuing such an endeavor. The…

Descriptors: Computation, Thinking Skills, Coding, Measurement

Improving Perceptual Speech Ratings: The Effects of Auditory Training on Judgments of Dysarthric Speech

Peer reviewed

Kaila L. Stipancic; Mojgan Golzy; Yunxin Zhao; Louise Pinkerton; Andrea Rohl; Mili Kuruvilla-Dugdale – Journal of Speech, Language, and Hearing Research, 2023

Purpose: Auditory training has been shown to reduce rater variability in perceptual voice assessment. Because rater variability is also a central issue in the auditory-perceptual assessment of dysarthria, this study sought to determine if training produces a meaningful change in rater reliability, criterion validity, and scaling magnitude of four…

Descriptors: Auditory Training, Auditory Perception, Program Effectiveness, Speech Impairments

Using Systematic Social Observations to Measure Crime Prevention through Environmental Design and Disorder: In-situ Observations, Photographs, and Google Street View Imagery

Peer reviewed

Sas, Marlies; Snaphaan, Thom; Pauwels, Lieven J. R.; Ponnet, Koen; Hardyns, Wim – Field Methods, 2023

This study focuses on the use of systematic social observations (SSO) to measure crime prevention through environmental design (CPTED) and disorder. To improve knowledge about measurement issues in small area research, SSO is conducted by means of three different methods: in-situ, photographs, and Google Street View (GSV) imagery. By evaluating…

Descriptors: Crime Prevention, Measurement Techniques, Photography, Observation

Visualizing Agreement: Bland-Altman Plots as a Supplement to Inter-Rater Reliability Indices

Peer reviewed

Brogan L. Barr; Virginia V. W. McIntosh; Eileen F. Britt; Jennifer Jordan; Janet D. Carter – Measurement: Interdisciplinary Research and Perspectives, 2024

Even when raters demonstrate agreement in the use of a measure, limited score variability or violation of often-ignored statistical assumptions can result in lower reliability estimates than intuitively expected. This article uses data drawn from two randomized controlled trials of schema therapy and cognitive behavioral therapy for the treatment…

Descriptors: Evaluators, Interrater Reliability, Reliability, Measurement Techniques

The Reliability of Simultaneous versus Individual Data Collection during Stuttering Assessment

Peer reviewed

Davidow, Jason H.; Ye, Jun; Edge, Robin L. – International Journal of Language & Communication Disorders, 2023

Background: Speech-language pathologists often multitask in order to be efficient with their commonly large caseloads. In stuttering assessment, multitasking often involves collecting multiple measures simultaneously. Aims: The present study sought to determine reliability when collecting multiple measures simultaneously versus individually.…

Descriptors: Graduate Students, Measurement, Reliability, Group Activities

Development of the Social Motor Function Classification System for Children with Autism Spectrum Disorders: A Psychometric Study

Peer reviewed

Pin, Tamis W.; So, Vincent K. K.; Siu, Cynthia S. H.; Yip, Sheila S. N.; Cheung, Stella See-wing; Kan, Jenny Yim-mui – Journal of Autism and Developmental Disorders, 2021

To examine reliability and validity of the new Social Motor Function Classification System for Children with Autism Spectrum Disorders (SMFCS-ASD). The SMFCS-ASD reliability was examined on 25 children (62.4 months SD 7.8) with ASD among six physical therapists. The validity study involved 1001 children (57.0 months, SD 9.9) with ASD using the…

Descriptors: Autism, Pervasive Developmental Disorders, Children, Classification

Source

ProQuest LLC	12
International Journal of…	9
Journal of Speech, Language,…	9
Assessment & Evaluation in…	6
Online Submission	5
Grantee Submission	4
Journal of Autism and…	4
Applied Measurement in…	3
Educational Assessment	3
Journal of Psychoeducational…	3
Language Assessment Quarterly	3
Research in Autism Spectrum…	3
Research in Developmental…	3
American Journal on Mental…	2
Assessment for Effective…	2
Behavioral Disorders	2
Child Development	2
Creativity Research Journal	2
Early Education and…	2
Education and Treatment of…	2
Educational Sciences: Theory…	2
Educational and Psychological…	2
International Journal of…	2
International Journal of…	2
Journal of Early Intervention	2

Author

Altszuler, Amy R.	2
Beretvas, S. Natasha	2
Cawthon, Stephanie W.	2
French, Brian F.	2
Ge, Jin Jin	2
Goe, Laura	2
Holdheide, Lynn	2
Katz, Larry	2
Mantzicopoulos, Panayota	2
Merrill, Brittany M.	2
Miller, Tricia	2
Morrow, Anne S.	2
Patrick, Helen	2
Reutzel, D. Ray	2
Shavelson, Richard J.	2
Sibley, Margaret H.	2
Wendel, Erica	2
Williams, Thomas O., Jr.	2
Zwaigenbaum, Lonnie	2
Abbott, Maree J.	1
Abd-Hamid, Nor Hashidah	1
Abdelhalim, Suzan M.	1
Abou-Khalil, Rima	1
Adamson, Katie Anne	1

Assessments and Surveys

Early Childhood Environment…	3
Draw a Person Test	2
Vineland Adaptive Behavior…	2
Autism Diagnostic Observation…	1
Center for Epidemiologic…	1
Childrens Depression Inventory	1
Clinical Evaluation of…	1
Dynamic Indicators of Basic…	1
Family Adaptability Cohesion…	1
Graduate Record Examinations	1
Iowa Tests of Basic Skills	1
Neale Analysis of Reading…	1
Oral and Written Language…	1
Parenting Stress Index	1
Peabody Developmental Motor…	1
Peabody Picture Vocabulary…	1
Pediatric Evaluation of…	1
Strengths and Difficulties…	1
Test of Gross Motor…	1
Test of Language Development	1
Wechsler Adult Intelligence…	1
Wechsler Individual…	1
Woodcock Johnson Psycho…	1