ERIC - Search Results

Publication Date

In 2025	27
Since 2024	95
Since 2021 (last 5 years)	356
Since 2016 (last 10 years)	878
Since 2006 (last 20 years)	2091

Descriptor

Interrater Reliability	3093
Foreign Countries	642
Evaluation Methods	501
Test Reliability	498
Test Validity	406
Correlation	401
Scoring	336
Comparative Analysis	327
Scores	321
Validity	309
Student Evaluation	301
Measures (Individuals)	298
Evaluators	291
Rating Scales	282
Statistical Analysis	268
Higher Education	263
Psychometrics	238
Observation	228
Reliability	228
Scoring Rubrics	214
Test Construction	212
Teaching Methods	208
English (Second Language)	203
Writing Evaluation	202
Intervention	200

Education Level

Higher Education	562
Postsecondary Education	408
Elementary Education	280
Secondary Education	177
Early Childhood Education	142
Elementary Secondary Education	119
Middle Schools	108
High Schools	84
Preschool Education	72
Junior High Schools	64
Adult Education	58
Primary Education	55
Kindergarten	45
Grade 4	41
Grade 5	40
Intermediate Grades	40
Grade 1	36
Grade 6	35
Grade 8	32
Grade 3	30
Grade 7	27
Grade 2	25
Grade 10	13
Grade 9	11
Two Year Colleges	8

Audience

Researchers	130
Practitioners	42
Teachers	22
Administrators	11
Counselors	3
Policymakers	2

Location

Australia	56
Turkey	52
United Kingdom	46
Canada	45
Netherlands	40
California	37
China	37
United States	30
United Kingdom (England)	24
Taiwan	23
Japan	22
Pennsylvania	22
Florida	21
Germany	21
Sweden	21
Iran	19
North Carolina	19
Hong Kong	17
Texas	17
Georgia	16
South Korea	16
Israel	15
New Zealand	14
Washington	14
South Africa	13

Laws, Policies, & Programs

No Child Left Behind Act 2001	13
Individuals with Disabilities…	7
Race to the Top	3
Elementary and Secondary…	2
American Recovery and…	1
Americans with Disabilities…	1
Education Consolidation…	1
Education for All Handicapped…	1
Elementary and Secondary…	1
Improving Americas Schools…	1
Individuals with Disabilities…	1
Individuals with Disabilities…	1
Pell Grant Program	1
Rehabilitation Act 1973…	1
Stewart B McKinney Homeless…	1
Temporary Assistance for…	1

What Works Clearinghouse Rating

Meets WWC Standards without Reservations	3
Meets WWC Standards with or without Reservations	3
Does not meet standards	3

Descriptor: Interrater Reliability

Showing 1 to 15 of 3,093 results

Evaluating the Correspondence between Expert Visual Analysis and Quantitative Methods

Peer reviewed

Alexandra M. Pierce; Lisa M. H. Sanetti; Melissa A. Collier-Meek; Austin H. Johnson – Grantee Submission, 2024

Visual analysis is the primary methodology used to determine treatment effects from graphed single-case design data. Previous studies have demonstrated mixed findings related to interrater agreement between both expert and novice visual analysts, which represents a critical limitation of visual analysis and supports calls for also presenting…

Descriptors: Graphs, Interrater Reliability, Statistical Analysis, Expertise

Chasing Rainbows? Ofsted's Quest for Inter-Inspector Reliability

Peer reviewed

Pearson, Terry – FORUM: for promoting 3-19 comprehensive education, 2023

Ofsted has frequently defended the judgements made during inspections by claiming that inspection ratings are reliable, as shown by the results from the collection of studies the inspectorate has conducted. I outline the inspectorate's view of reliability and problematise the studies that it has carried out, noting that these provide insufficient…

Descriptors: Inspection, Interrater Reliability, Decision Making, Value Judgment

A Systematic Review of Social Validation Procedures in Intervention Research with Transition-Age Autistic Youth

Peer reviewed

Kristen Bottema-Beutel; Shannon Crowley LaPoint; So Yoon Kim; Sarah Mohiuddin; Qun Yu; Rachael McKinnon – Exceptional Children, 2024

In this secondary analysis of a previously conducted systematic review, we analyze social validity assessments in intervention research for transition-age autistic youth. Social validity is concerned with the acceptability of the intervention goals, the acceptability and feasibility of the intervention procedures, and the perceived importance of…

Descriptors: Autism Spectrum Disorders, Intervention, Validity, Psychometrics

Technical Adequacy-Reliability

Peer reviewed

Susan K. Johnsen – Gifted Child Today, 2025

The author provides information about reliability and areas that educators should examine in determining if an assessment is consistent and trustworthy for use, and how it should be interpreted in making decisions about students. Reliability areas that are discussed in the column include internal consistency, test-retest or stability, inter-scorer…

Descriptors: Test Reliability, Academically Gifted, Student Evaluation, Error of Measurement

The Use of Annotations to Explain Labels: Comparing Results from a Human-Rater Approach to a Deep Learning Approach

Peer reviewed

Lottridge, Susan; Woolf, Sherri; Young, Mackenzie; Jafari, Amir; Ormerod, Chris – Journal of Computer Assisted Learning, 2023

Background: Deep learning methods, where models do not use explicit features and instead rely on implicit features estimated during model training, suffer from an explainability problem. In text classification, saliency maps that reflect the importance of words in prediction are one approach toward explainability. However, little is known about…

Descriptors: Documentation, Learning Strategies, Models, Prediction

Inconsistencies in Rater-Based Assessments Mainly Affect Borderline Candidates: But Using Simple Heuristics Might Improve Pass-Fail Decisions

Peer reviewed

Stefan K. Schauber; Anne O. Olsen; Erik L. Werner; Morten Magelssen – Advances in Health Sciences Education, 2024

Introduction: Research in various areas indicates that expert judgment can be highly inconsistent. However, expert judgment is indispensable in many contexts. In medical education, experts often function as examiners in rater-based assessments. Here, disagreement between examiners can have far-reaching consequences. The literature suggests that…

Descriptors: Medical Students, Performance Based Assessment, Expertise, Interrater Reliability

Using Bayesian Generalized Structural Equation Modeling to Analyze Latent Agreement

McCluskey, Sydne – ProQuest LLC, 2023

Rater comparison analysis is commonly necessary in the social sciences. Conventional approaches to the problem generally focus on calculation of agreement statistics, which provide useful but incomplete information about rater agreement. Importantly, one-number agreement statistics give no indication regarding the nature of disagreements, nor do…

Descriptors: Bayesian Statistics, Structural Equation Models, Interrater Reliability, Beliefs

An Experimental Study of Standard Setting Methods for Diagnostic Profiles

Feldberg, Zachary R. – ProQuest LLC, 2023

Cognitive diagnostic models (CDMs) provide pedagogically relevant information in the form of a student profile of multiple binary categorizations of students into mastery or nonmastery statuses on latent traits called attributes. Federal educational accountability requires accountability measures to designate students into one of at least three…

Descriptors: Accountability, Standards, Cutting Scores, Models

"Rater Training" Re-Imagined for Work-Based Assessment in Medical Education

Peer reviewed

Tavares, Walter; Kinnear, Benjamin; Schumacher, Daniel J.; Forte, Milena – Advances in Health Sciences Education, 2023

In this perspective, the authors critically examine "rater training" as it has been conceptualized and used in medical education. By "rater training," they mean the educational events intended to "improve" rater performance and contributions during assessment events. Historically, rater training programs have focused…

Descriptors: Medical Education, Interrater Reliability, Evaluation Methods, Training

Test-Retest and Inter-Rater Reliability for Selected Outcomes from a Wearable 3D Inertial Sensor over Different Stable and Unstable Postural Conditions: A Validation Study

Peer reviewed

Samuel D'Emanuele; Francesca Nardello; Fabrizio Garau; Diego Campaci; Federico Schena; Cantor Tarperi – Measurement in Physical Education and Exercise Science, 2025

The agreement between a wearable inertial sensor (GYKO, G) and the force platform (P) was assessed by evaluating "test-retest" and "inter-rater reliability." Thirty-eight subjects were enrolled; the selected indices of balance were investigated over foot positions and (un)stable conditions. Intraclass correlation coefficient…

Descriptors: Human Posture, Measurement Equipment, Interrater Reliability, Measurement Techniques

Examining the Psychometric Impact of Targeted and Random Double-Scoring in Mixed-Format Assessments

Peer reviewed

Yangmeng Xu; Stefanie A. Wind – Educational Measurement: Issues and Practice, 2025

Double-scoring constructed-response items is a common but costly practice in mixed-format assessments. This study explored the impacts of Targeted Double-Scoring (TDS) and random double-scoring procedures on the quality of psychometric outcomes, including student achievement estimates, person fit, and student classifications under various…

Descriptors: Academic Achievement, Psychometrics, Scoring, Evaluation Methods

The Living Codebook: Documenting the Process of Qualitative Data Analysis

Peer reviewed

Victoria Reyes; Elizabeth Bogumil; Levin Elias Welch – Sociological Methods & Research, 2024

Transparency is once again a central issue of debate across types of qualitative research. Work on how to conduct qualitative data analysis, on the other hand, walks us through the step-by-step process on how to code and understand the data we've collected. Although there are a few exceptions, less focus is on transparency regarding…

Descriptors: Qualitative Research, Data Analysis, Guides, Databases

Detecting Rater Bias in Mixed-Format Assessments

Peer reviewed

Stefanie A. Wind; Yuan Ge – Measurement: Interdisciplinary Research and Perspectives, 2024

Mixed-format assessments made up of multiple-choice (MC) items and constructed response (CR) items that are scored using rater judgments include unique psychometric considerations. When these item types are combined to estimate examinee achievement, information about the psychometric quality of each component can depend on that of the other. For…

Descriptors: Interrater Reliability, Test Bias, Multiple Choice Tests, Responses

Intercoder Reliability for Use in Qualitative Research and Evaluation

Peer reviewed

Monica L. Coleman; Moira Ragan; Tahani Dari – Measurement and Evaluation in Counseling and Development, 2024

Intercoder reliability can increase trustworthiness, accuracy, rigor, collaboration, and power sharing in qualitative research. Though not every qualitative design can utilize intercoder reliability, this article highlights how positivist qualitative research, community-based participatory research, and participatory evaluation all strengthen when…

Descriptors: Interrater Reliability, Qualitative Research, Counseling, Research

Procedural Fidelity Reporting in "The Analysis of Verbal Behavior" from 2007-2021

Peer reviewed

Elizabeth J. Preas; Mary E. Halbur; Regina A. Carroll – Analysis of Verbal Behavior, 2024

Procedural fidelity refers to the degree to which procedures for an assessment or intervention (i.e., independent variables) are implemented consistent with the prescribed protocols. Procedural fidelity is an important factor in demonstrating the internal validity of an experiment and clinical treatments. Previous reviews evaluating the inclusion…

Descriptors: Verbal Communication, Behavioral Science Research, Periodicals, Fidelity

Source

ProQuest LLC	86
Educational and Psychological…	61
Journal of Speech, Language,…	61
Journal of Autism and…	56
Grantee Submission	40
Language Testing	37
Online Submission	35
Assessment & Evaluation in…	33
International Journal of…	33
Research in Developmental…	31
Applied Measurement in…	28
Assessment for Effective…	26
Advances in Health Sciences…	25
ETS Research Report Series	25
Journal of Educational…	24
Educational Measurement:…	22
Measurement in Physical…	20
Language Assessment Quarterly	19
Psychology in the Schools	19
Topics in Early Childhood…	19
Psychological Assessment	18
Educational Assessment	16
Autism: The International…	15
Journal of Consulting and…	15
Personnel Psychology	15

Author

Lunz, Mary E.	10
Wind, Stefanie A.	10
Engelhard, George, Jr.	8
Epstein, Michael H.	8
Ingham, Roger J.	8
Johnson, Evelyn S.	8
Matson, Johnny L.	7
McLeod, Bryce D.	7
Moylan, Laura A.	7
Cason, Carolyn L.	6
Cordes, Anne K.	6
Jaeger, Richard M.	6
Johnson, Robert L.	6
Lecavalier, Luc	6
Plake, Barbara S.	6
Tasse, Marc J.	6
Wyse, Adam E.	6
Zheng, Yuzhu	6
Aman, Michael G.	5
Barton, Erin E.	5
Cason, Gerald J.	5
Coniam, David	5
Conroy, Maureen A.	5
Crawford, Angela R.	5

Publication Type

Journal Articles	2526
Reports - Research	2212
Reports - Evaluative	515
Speeches/Meeting Papers	272
Reports - Descriptive	163
Tests/Questionnaires	162
Information Analyses	129
Dissertations/Theses -…	89
Opinion Papers	61
Numerical/Quantitative Data	31
Guides - Non-Classroom	11
Books	7
Collected Works - General	3
Guides - Classroom - Teacher	3
Non-Print Media	3
Book/Product Reviews	2
Collected Works - Serials	2
Dissertations/Theses	2
ERIC Digests in Full Text	2
ERIC Publications	2
Guides - General	2
Reports - General	2
Collected Works - Proceedings	1
Reference Materials -…	1
Reference Materials - General	1

Assessments and Surveys

Test of English as a Foreign…	29
Child Behavior Checklist	18
National Assessment of…	14
Vineland Adaptive Behavior…	14
Autism Diagnostic Observation…	13
Strengths and Difficulties…	10
Woodcock Johnson Tests of…	10
Peabody Picture Vocabulary…	9
Wechsler Intelligence Scale…	9
Behavior Assessment System…	8
Dynamic Indicators of Basic…	8
Early Childhood Environment…	8
Graduate Record Examinations	8
SAT (College Admission Test)	8
International English…	6
Teacher Performance…	6
Advanced Placement…	5
Behavioral and Emotional…	5
Childhood Autism Rating Scale	5
Conners Teacher Rating Scale	5
Draw a Person Test	5
Raven Progressive Matrices	5
ACT Assessment	4
ACTFL Oral Proficiency…	4
Battelle Developmental…	4