ERIC - Search Results

Showing 1 to 15 of 228 results

Grading Exams Using Large Language Models: A Comparison between Human and AI Grading of Exams in Higher Education Using ChatGPT

Peer reviewed

Jonas Flodén – British Educational Research Journal, 2025

This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…

Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring

Reliability of Ratings of an English Language Arts Curriculum with the Curriculum Evaluation Guidelines

Peer reviewed

Matthew K. Burns; Heba Z. Abdelnaby; Jonie B. Welland; Katherine A. Graves; Kari Kurto – Assessment for Effective Intervention, 2024

The current study examined the reliability of The Reading League Curriculum-Evaluation Guidelines (CEGs), which were developed to help school-based teams rate the presence of red flags when considering adopting specific literacy curricula. Coders (n = 30) independently used the CEGs to evaluate a free online English language arts curriculum. The…

Descriptors: English Curriculum, English Instruction, Language Arts, Curriculum Evaluation

New Tests of Rater Drift in Trend Scoring

Peer reviewed

John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024

Trend scoring constructed response items (i.e. rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…

Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics

Trust the "Process"? When Fundamental Motor Skill Scores Are Reliably Unreliable

Peer reviewed

Hulteen, Ryan M.; True, Larissa; Kroc, Edward – Measurement in Physical Education and Exercise Science, 2023

The typical process for assessing inter-rater reliability is facilitated by training raters within a research team. Lacking is an understanding if inter-rater reliability scores "between" research teams demonstrate adequate reliability. This study examined inter-rater reliability between 16 researchers who assessed fundamental motor…

Descriptors: Psychomotor Skills, Scores, Reliability, Interrater Reliability

The Scale of Sincerity Based on Kyai Haji Ahmad Dahlan's Version for Islamic Students: The Rasch Analysis

Peer reviewed

Wahyu Nanda Eka Saputra; Trikinasih Handayani; Prima Suci Rohmadheny; Rohmatus Naini; Dody Hartanto; Hardi Santosa; Dewi Afra Khairunnisa; Risma Risansyah; Hanan Riati; Faturrahman – Journal of Education and Learning (EduLearn), 2025

The students are urged to do something without expecting anything in return and only in the name of God. Every Islamic student becomes something ideal if they can internalize and implement sincerity. Many people are willing to do something because of an ulterior motive. The importance of sincerity in humans is the background for developing a…

Descriptors: Islam, Interrater Reliability, Prosocial Behavior, Muslims

Interdisciplinary Thinking among Seventh-Grade Students in Lower-Secondary Science Education

Peer reviewed

Shasha Chen; Shaohui Chi; Zuhao Wang – Journal of Baltic Science Education, 2025

Interdisciplinary thinking is critical for equipping students to apply scientific knowledge and tackle societal challenges across various disciplines, which has been recognized as a key objective of twenty-first century science education. However, research on effective interdisciplinary assessment in secondary school science education is still…

Descriptors: Thinking Skills, Interdisciplinary Approach, Science Instruction, Grade 7

Psychometric Properties of the Behavior Assessment System for Children Student Observation System (BASC-3 SOS) with Young Children in Special Education

Peer reviewed

Schmidt, Ellyn M.; Rothenberg, W. Andrew; Davidson, Bridget C.; Barnett, Miya; Jent, Jason; Cadenas, Heleny; Fernandez, Corina; Davis, Eileen – Journal of Behavioral Education, 2023

Measuring classroom behavior among young children is important to guide assessment and intervention decisions, yet there is limited literature on appropriate direct observation tools for this purpose. This article describes the psychometric properties of the Behavior Assessment System for Children, Student Observation System (BASC-3 SOS) with 135…

Descriptors: Young Children, Special Education, Child Behavior, Psychometrics

Measuring Acceptance of Block-Based Coding Environments

Peer reviewed

Toma, Radu Bogdan – Technology, Knowledge and Learning, 2023

The development of computational thinking skills is attracting attention worldwide. The use of visual or block-based coding in primary schools has gained momentum. Yet, students' acceptance of such coding environments has been neglected in the literature. This study presents a measurement instrument that will allow pursuing such an endeavor. The…

Descriptors: Computation, Thinking Skills, Coding, Measurement

Improving Perceptual Speech Ratings: The Effects of Auditory Training on Judgments of Dysarthric Speech

Peer reviewed

Kaila L. Stipancic; Mojgan Golzy; Yunxin Zhao; Louise Pinkerton; Andrea Rohl; Mili Kuruvilla-Dugdale – Journal of Speech, Language, and Hearing Research, 2023

Purpose: Auditory training has been shown to reduce rater variability in perceptual voice assessment. Because rater variability is also a central issue in the auditory-perceptual assessment of dysarthria, this study sought to determine if training produces a meaningful change in rater reliability, criterion validity, and scaling magnitude of four…

Descriptors: Auditory Training, Auditory Perception, Program Effectiveness, Speech Impairments

Using Systematic Social Observations to Measure Crime Prevention through Environmental Design and Disorder: In-situ Observations, Photographs, and Google Street View Imagery

Peer reviewed

Sas, Marlies; Snaphaan, Thom; Pauwels, Lieven J. R.; Ponnet, Koen; Hardyns, Wim – Field Methods, 2023

This study focuses on the use of systematic social observations (SSO) to measure crime prevention through environmental design (CPTED) and disorder. To improve knowledge about measurement issues in small area research, SSO is conducted by means of three different methods: in-situ, photographs, and Google Street View (GSV) imagery. By evaluating…

Descriptors: Crime Prevention, Measurement Techniques, Photography, Observation

Visualizing Agreement: Bland-Altman Plots as a Supplement to Inter-Rater Reliability Indices

Peer reviewed

Brogan L. Barr; Virginia V. W. McIntosh; Eileen F. Britt; Jennifer Jordan; Janet D. Carter – Measurement: Interdisciplinary Research and Perspectives, 2024

Even when raters demonstrate agreement in the use of a measure, limited score variability or violation of often-ignored statistical assumptions can result in lower reliability estimates than intuitively expected. This article uses data drawn from two randomized controlled trials of schema therapy and cognitive behavioral therapy for the treatment…

Descriptors: Evaluators, Interrater Reliability, Reliability, Measurement Techniques

The Reliability of Simultaneous versus Individual Data Collection during Stuttering Assessment

Peer reviewed

Davidow, Jason H.; Ye, Jun; Edge, Robin L. – International Journal of Language & Communication Disorders, 2023

Background: Speech-language pathologists often multitask in order to be efficient with their commonly large caseloads. In stuttering assessment, multitasking often involves collecting multiple measures simultaneously. Aims: The present study sought to determine reliability when collecting multiple measures simultaneously versus individually.…

Descriptors: Graduate Students, Measurement, Reliability, Group Activities

Development of the Social Motor Function Classification System for Children with Autism Spectrum Disorders: A Psychometric Study

Peer reviewed

Pin, Tamis W.; So, Vincent K. K.; Siu, Cynthia S. H.; Yip, Sheila S. N.; Cheung, Stella See-wing; Kan, Jenny Yim-mui – Journal of Autism and Developmental Disorders, 2021

To examine reliability and validity of the new Social Motor Function Classification System for Children with Autism Spectrum Disorders (SMFCS-ASD). The SMFCS-ASD reliability was examined on 25 children (62.4 months, SD 7.8) with ASD among six physical therapists. The validity study involved 1001 children (57.0 months, SD 9.9) with ASD using the…

Descriptors: Autism, Pervasive Developmental Disorders, Children, Classification

Intra- and Inter-Rater Reliability of the Behaviour Mapping Schedule: A Direct Observational Tool for Classifying Children's Play Behaviour

Peer reviewed

Dankiw, Kylie A.; Baldock, Katherine L.; Kumar, Saravana; Tsiros, Margarita D. – Australasian Journal of Early Childhood, 2021

Identifying and describing children's play behaviours is an important component of evaluating child development. The Behaviour Mapping Schedule is a direct observational tool which aims to describe and quantify children's play behaviours but is yet to undergo reliability testing. This study aimed to determine the intra- and inter-rater reliability…

Descriptors: Interrater Reliability, Classification, Child Behavior, Play

Developing a Tool for Measuring Student Orientations with Respect to Understanding in Mathematical Learning

Peer reviewed

Siqi Huang – North American Chapter of the International Group for the Psychology of Mathematics Education, 2023

The goal of this paper is twofold. First, the paper clarifies and elaborates on an important theoretical construct called orientation with respect to understanding in mathematics, which denotes the degree to which students exhibit an inclination towards and demonstrate an earnest concern for understanding in mathematical learning. Second, the…

Descriptors: Mathematics Instruction, Teaching Methods, Problem Solving, Reliability

