ERIC - Search Results

Publication Date

In 2025	27
Since 2024	95
Since 2021 (last 5 years)	356
Since 2016 (last 10 years)	878
Since 2006 (last 20 years)	2091

Descriptor

Interrater Reliability	3093
Foreign Countries	642
Evaluation Methods	501
Test Reliability	498
Test Validity	406
Correlation	401
Scoring	336
Comparative Analysis	327
Scores	321
Validity	309
Student Evaluation	301
Measures (Individuals)	298
Evaluators	291
Rating Scales	282
Statistical Analysis	268
Higher Education	263
Psychometrics	238
Observation	228
Reliability	228
Scoring Rubrics	214
Test Construction	212
Teaching Methods	208
English (Second Language)	203
Writing Evaluation	202
Intervention	200

Education Level

Higher Education	562
Postsecondary Education	408
Elementary Education	280
Secondary Education	177
Early Childhood Education	142
Elementary Secondary Education	119
Middle Schools	108
High Schools	84
Preschool Education	72
Junior High Schools	64
Adult Education	58
Primary Education	55
Kindergarten	45
Grade 4	41
Grade 5	40
Intermediate Grades	40
Grade 1	36
Grade 6	35
Grade 8	32
Grade 3	30
Grade 7	27
Grade 2	25
Grade 10	13
Grade 9	11
Two Year Colleges	8

Audience

Researchers	130
Practitioners	42
Teachers	22
Administrators	11
Counselors	3
Policymakers	2

Location

Australia	56
Turkey	52
United Kingdom	46
Canada	45
Netherlands	40
California	37
China	37
United States	30
United Kingdom (England)	24
Taiwan	23
Japan	22
Pennsylvania	22
Florida	21
Germany	21
Sweden	21
Iran	19
North Carolina	19
Hong Kong	17
Texas	17
Georgia	16
South Korea	16
Israel	15
New Zealand	14
Washington	14
South Africa	13

Laws, Policies, & Programs

No Child Left Behind Act 2001	13
Individuals with Disabilities…	7
Race to the Top	3
Elementary and Secondary…	2
American Recovery and…	1
Americans with Disabilities…	1
Education Consolidation…	1
Education for All Handicapped…	1
Elementary and Secondary…	1
Improving Americas Schools…	1
Individuals with Disabilities…	1
Individuals with Disabilities…	1
Pell Grant Program	1
Rehabilitation Act 1973…	1
Stewart B McKinney Homeless…	1
Temporary Assistance for…	1

What Works Clearinghouse Rating

Meets WWC Standards without Reservations	3
Meets WWC Standards with or without Reservations	3
Does not meet standards	3

Showing 106 to 120 of 3,093 results

Automated Classification of Elementary Instructional Activities: Analyzing the Consistency of Human Annotations

Peer reviewed
PDF on ERIC

Jonathan K. Foster; Peter Youngs; Rachel van Aswegen; Samarth Singh; Ginger S. Watson; Scott T. Acton – Journal of Learning Analytics, 2024

Despite a tremendous increase in the use of video for conducting research in classrooms as well as preparing and evaluating teachers, there remain notable challenges to using classroom videos at scale, including time and financial costs. Recent advances in artificial intelligence could make the process of analyzing, scoring, and cataloguing videos…

Descriptors: Learning Analytics, Automation, Classification, Artificial Intelligence

Agreement between Visual Inspection and Objective Analysis Methods: A Replication and Extension

Peer reviewed

Taylor, Tessa; Lanovaz, Marc J. – Journal of Applied Behavior Analysis, 2022

Behavior analysts typically rely on visual inspection of single-case experimental designs to make treatment decisions. However, visual inspection is subjective, which has led to the development of supplemental objective methods such as the conservative dual-criteria method. To replicate and extend a study conducted by Wolfe et al. (2018) on the…

Descriptors: Visual Perception, Artificial Intelligence, Decision Making, Evaluators

Do Peers Share the Same Criteria for Assessing Grant Applications?

Peer reviewed

Hug, Sven E.; Ochsner, Michael – Research Evaluation, 2022

This study examines a basic assumption of peer review, namely, the idea that there is a consensus on evaluation criteria among peers, which is a necessary condition for the reliability of peer judgements. Empirical evidence indicating that there is no consensus or more than one consensus would offer an explanation for the "disagreement…

Descriptors: Peer Evaluation, Grants, Evaluation Criteria, Interrater Reliability

Investigating Constructed-Response Scoring over Time: The Effects of Study Design on Trend Rescore Statistics. Research Report. ETS RR-22-15

Peer reviewed
PDF on ERIC

Donoghue, John R.; McClellan, Catherine A.; Hess, Melinda R. – ETS Research Report Series, 2022

When constructed-response items are administered for a second time, it is necessary to evaluate whether the current Time B administration's raters have drifted from the scoring of the original administration at Time A. To study this, Time A papers are sampled and rescored by Time B scorers. Commonly the scores are compared using the proportion of…

Descriptors: Item Response Theory, Test Construction, Scoring, Testing

Evidence-Based Evaluation of Student and Marker Performances in Assessment and Examination

Peer reviewed

Ole J. Kemi – Advances in Physiology Education, 2025

Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…

Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards

The Scale of Sincerity Based on Kyai Haji Ahmad Dahlan's Version for Islamic Students: The Rasch Analysis

Peer reviewed
PDF on ERIC

Wahyu Nanda Eka Saputra; Trikinasih Handayani; Prima Suci Rohmadheny; Rohmatus Naini; Dody Hartanto; Hardi Santosa; Dewi Afra Khairunnisa; Risma Risansyah; Hanan Riati; Faturrahman – Journal of Education and Learning (EduLearn), 2025

The students are urged to do something without expecting anything in return and only in the name of God. Every Islamic student becomes something ideal if they can internalize and implement sincerity. Many people are willing to do something because of an ulterior motive. The importance of sincerity in humans is the background for developing a…

Descriptors: Islam, Interrater Reliability, Prosocial Behavior, Muslims

Qualitative Coding with GPT-4: Where It Works Better

Peer reviewed
PDF on ERIC

Xiner Liu; Andres Felipe Zambrano; Ryan S. Baker; Amanda Barany; Jaclyn Ocumpaugh; Jiayi Zhang; Maciej Pankiewicz; Nidhi Nasiar; Zhanlan Wei – Journal of Learning Analytics, 2025

This study explores the potential of the large language model GPT-4 as an automated tool for qualitative data analysis by educational researchers, exploring which techniques are most successful for different types of constructs. Specifically, we assess three different prompt engineering strategies -- Zero-shot, Few-shot, and Few-shot with…

Descriptors: Coding, Artificial Intelligence, Automation, Data Analysis

An Approach for Being Able to Use the Options of Calculating Inter-Coder Reliability Manually and through Software in Qualitative Research of Education and Training in Sports

Peer reviewed
PDF on ERIC

Sevilmis, Ali; Yildiz, Özer – International Journal of Progressive Education, 2021

Reliability, which can be demonstrated by numeric indicators in quantitative studies, has become a much-debated issue. The reason is the view that, in qualitative research, reliability is not based on a positivist perspective and that forming reliability criteria is difficult. However, for testing the reliability of a qualitative study or…

Descriptors: Interrater Reliability, Qualitative Research, Educational Research, Physical Education

Skin Color Matters in the Latinx Community: A Call for Action in Research, Training, and Practice

Peer reviewed

Fuentes, Milton A.; Reyes-Portillo, Jazmin A.; Tineo, Petty; Gonzalez, Kenny; Butt, Mamona – Hispanic Journal of Behavioral Sciences, 2021

While skin color is relevant and important in the Latinx community, as it is associated with colorism, little is known about how often it is measured or the best way to measure it. This article presents results from two studies examining these key concerns in three prominent journals, where Latinx research is typically published (i.e., the…

Descriptors: Hispanic Americans, Measures (Individuals), Undergraduate Students, Social Bias

Interrater Reliability in Second Language Meta-Analyses: The Case of Categorical Moderators

Peer reviewed

Norouzian, Reza – Studies in Second Language Acquisition, 2021

There has recently been a surge of interest in improving the replicability of second language (L2) research. However, less attention is paid to replicability in the context of L2 meta-analyses. I argue that conducting interrater reliability (IRR) analyses is a key step toward improving the replicability of L2 meta-analyses. To that end, I first…

Descriptors: Interrater Reliability, Second Languages, Language Research, Meta Analysis

A Comparison of Manual versus Automated Quantitative Production Analysis of Connected Speech

Peer reviewed

Fromm, Davida; Katta, Saketh; Paccione, Mason; Hecht, Sophia; Greenhouse, Joel; MacWhinney, Brian; Schnur, Tatiana T. – Journal of Speech, Language, and Hearing Research, 2021

Purpose: Analysis of connected speech in the field of adult neurogenic communication disorders is essential for research and clinical purposes, yet time and expertise are often cited as limiting factors. The purpose of this project was to create and evaluate an automated program to score and compute the measures from the Quantitative Production…

Descriptors: Speech, Automation, Statistical Analysis, Adults

Using Many-Facet Rasch Measurement and Generalizability Theory to Explore Rater Effects for Direct Behavior Rating--Multi-Item Scales

Peer reviewed

Anthony, Christopher J.; Styck, Kara M.; Volpe, Robert J.; Robert, Christopher R. – School Psychology, 2023

Although originally conceived of as a marriage of direct behavioral observation and indirect behavior rating scales, recent research has indicated that Direct Behavior Ratings (DBRs) are affected by rater idiosyncrasies (rater effects) similar to other indirect forms of behavioral assessment. Most of this research has been conducted using…

Descriptors: Item Response Theory, Generalizability Theory, Interrater Reliability, Behavior Rating Scales

Assessing Handwriting in Preschool-Aged Children: Reliability and Internal Consistency of the "Just Write!" Tool

Peer reviewed

Bolton, Tiffany; Stevenson, Brittney; Janes, William – Journal of Occupational Therapy, Schools & Early Intervention, 2023

Researchers utilized a cross-sectional secondary analysis of data within an ongoing non-randomized controlled trial study design to establish the reliability and internal consistency of a novel handwriting assessment for preschoolers, the Just Write! (JW), written by the authors. Seventy-eight children from an area preschool participated in the…

Descriptors: Handwriting, Writing Skills, Writing Evaluation, Preschool Children

Examining Rating Quality in Rater-Mediated Activities for Standard-Item Alignment Research

Yvette Jackson – ProQuest LLC, 2023

Rater-mediated activities in educational research occur when an expert judge or rater utilizes an instrument to judge persons or items and generates scale scores. Scale scores are from a subjective judgment and must undergo a quality control measure called rating quality. Rating quality in this study is broadly defined as the extent to which…

Descriptors: Educational Research, Evaluators, Test Theory, Item Response Theory

Different Methods for Assessing Preservice Teachers' Instruction: Why Measures Matter

Peer reviewed

Arielle Boguslav; Julie Cohen – Journal of Teacher Education, 2024

Teacher preparation programs are increasingly expected to use data on preservice teacher (PST) skills to drive program improvement and provide targeted supports. Observational ratings are especially vital, but also prone to measurement issues. Scores may be influenced by factors unrelated to PSTs' instructional skills, including rater standards.…

Descriptors: Preservice Teachers, Measures (Individuals), Evaluation Problems, Teaching Skills

Source

ProQuest LLC	86
Educational and Psychological…	61
Journal of Speech, Language,…	61
Journal of Autism and…	56
Grantee Submission	40
Language Testing	37
Online Submission	35
Assessment & Evaluation in…	33
International Journal of…	33
Research in Developmental…	31
Applied Measurement in…	28
Assessment for Effective…	26
Advances in Health Sciences…	25
ETS Research Report Series	25
Journal of Educational…	24
Educational Measurement:…	22
Measurement in Physical…	20
Language Assessment Quarterly	19
Psychology in the Schools	19
Topics in Early Childhood…	19
Psychological Assessment	18
Educational Assessment	16
Autism: The International…	15
Journal of Consulting and…	15
Personnel Psychology	15

Author

Lunz, Mary E.	10
Wind, Stefanie A.	10
Engelhard, George, Jr.	8
Epstein, Michael H.	8
Ingham, Roger J.	8
Johnson, Evelyn S.	8
Matson, Johnny L.	7
McLeod, Bryce D.	7
Moylan, Laura A.	7
Cason, Carolyn L.	6
Cordes, Anne K.	6
Jaeger, Richard M.	6
Johnson, Robert L.	6
Lecavalier, Luc	6
Plake, Barbara S.	6
Tasse, Marc J.	6
Wyse, Adam E.	6
Zheng, Yuzhu	6
Aman, Michael G.	5
Barton, Erin E.	5
Cason, Gerald J.	5
Coniam, David	5
Conroy, Maureen A.	5
Crawford, Angela R.	5

Publication Type

Journal Articles	2526
Reports - Research	2212
Reports - Evaluative	515
Speeches/Meeting Papers	272
Reports - Descriptive	163
Tests/Questionnaires	162
Information Analyses	129
Dissertations/Theses -…	89
Opinion Papers	61
Numerical/Quantitative Data	31
Guides - Non-Classroom	11
Books	7
Collected Works - General	3
Guides - Classroom - Teacher	3
Non-Print Media	3
Book/Product Reviews	2
Collected Works - Serials	2
Dissertations/Theses	2
ERIC Digests in Full Text	2
ERIC Publications	2
Guides - General	2
Reports - General	2
Collected Works - Proceedings	1
Reference Materials -…	1
Reference Materials - General	1

Assessments and Surveys

Test of English as a Foreign…	29
Child Behavior Checklist	18
National Assessment of…	14
Vineland Adaptive Behavior…	14
Autism Diagnostic Observation…	13
Strengths and Difficulties…	10
Woodcock Johnson Tests of…	10
Peabody Picture Vocabulary…	9
Wechsler Intelligence Scale…	9
Behavior Assessment System…	8
Dynamic Indicators of Basic…	8
Early Childhood Environment…	8
Graduate Record Examinations	8
SAT (College Admission Test)	8
International English…	6
Teacher Performance…	6
Advanced Placement…	5
Behavioral and Emotional…	5
Childhood Autism Rating Scale	5
Conners Teacher Rating Scale	5
Draw a Person Test	5
Raven Progressive Matrices	5
ACT Assessment	4
ACTFL Oral Proficiency…	4
Battelle Developmental…	4