Showing 1 to 15 of 163 results
Peer reviewed
Susan K. Johnsen – Gifted Child Today, 2025
The author provides information about reliability: the areas educators should examine to determine whether an assessment is consistent and trustworthy for use, and how it should be interpreted when making decisions about students. Reliability areas discussed in the column include internal consistency, test-retest (stability), inter-scorer…
Descriptors: Test Reliability, Academically Gifted, Student Evaluation, Error of Measurement
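The reliability areas named in this record lend themselves to simple worked examples. The sketch below is illustrative only, using plain NumPy and made-up scores (not drawn from the column): Cronbach's alpha for internal consistency, a Pearson correlation for test-retest stability, and Cohen's kappa for inter-scorer agreement.

```python
# Illustrative sketch of three common reliability estimators; data are toy values.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal consistency: items is an (examinees x items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def test_retest_r(time1: np.ndarray, time2: np.ndarray) -> float:
    """Stability: Pearson correlation between two administrations of the same test."""
    return float(np.corrcoef(time1, time2)[0, 1])

def cohen_kappa(rater_a: np.ndarray, rater_b: np.ndarray) -> float:
    """Inter-scorer agreement for two raters, corrected for chance agreement."""
    cats = np.union1d(rater_a, rater_b)
    po = np.mean(rater_a == rater_b)
    pe = sum(np.mean(rater_a == c) * np.mean(rater_b == c) for c in cats)
    return (po - pe) / (1 - pe)

items = np.array([[4, 5, 4],      # examinees x items, toy rating-scale scores
                  [2, 3, 2],
                  [5, 5, 4],
                  [3, 3, 3],
                  [1, 2, 2]])
t1 = np.array([98, 105, 110, 91, 120])   # same test, two occasions
t2 = np.array([101, 103, 112, 95, 118])
rater_a = np.array([1, 0, 1, 1, 0])      # two scorers, categorical codes
rater_b = np.array([1, 0, 0, 1, 0])

print(cronbach_alpha(items))
print(test_retest_r(t1, t2))
print(cohen_kappa(rater_a, rater_b))
```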
Peer reviewed
Victoria Reyes; Elizabeth Bogumil; Levin Elias Welch – Sociological Methods & Research, 2024
Transparency is once again a central issue of debate across types of qualitative research. Work on how to conduct qualitative data analysis, on the other hand, walks us through the step-by-step process of coding and understanding the data we've collected. Although there are a few exceptions, less focus is placed on transparency regarding…
Descriptors: Qualitative Research, Data Analysis, Guides, Databases
Peer reviewed
Rice, C. E.; Carpenter, L. A.; Morrier, M. J.; Lord, C.; DiRienzo, M.; Boan, A.; Skowyra, C.; Fusco, A.; Baio, J.; Esler, A.; Zahorodny, W.; Hobson, N.; Mars, A.; Thurm, A.; Bishop, S.; Wiggins, L. D. – Journal of Autism and Developmental Disorders, 2022
This paper describes a process to define a comprehensive list of exemplars for seven core Diagnostic and Statistical Manual (DSM) diagnostic criteria for autism spectrum disorder (ASD), and reports on interrater reliability in applying these exemplars to determine ASD case classification. Clinicians completed an iterative process to map specific…
Descriptors: Autism Spectrum Disorders, Clinical Diagnosis, Test Reliability, Interrater Reliability
Peer reviewed
Gwet, Kilem L. – Educational and Psychological Measurement, 2021
Cohen's kappa coefficient was originally proposed for two raters only, and it was later extended to an arbitrarily large number of raters to become what is known as Fleiss' generalized kappa. Fleiss' generalized kappa and its large-sample variance are still widely used by researchers and were implemented in several software packages, including, among…
Descriptors: Sample Size, Statistical Analysis, Interrater Reliability, Computation
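For readers unfamiliar with the statistic discussed in this record, here is a minimal sketch of Fleiss' generalized kappa, assuming a complete design in which every subject is rated by the same number of raters. The counts are illustrative, not taken from the article, and the large-sample variance Gwet examines is not computed here.

```python
# Minimal Fleiss' generalized kappa for m raters, complete design; toy counts.
import numpy as np

def fleiss_kappa(counts: np.ndarray) -> float:
    """counts[i, j] = number of raters who placed subject i in category j."""
    n_subjects = counts.shape[0]
    n_raters = counts[0].sum()                            # assumed constant per subject
    p_j = counts.sum(axis=0) / (n_subjects * n_raters)    # category marginals
    p_i = (np.square(counts).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()                                    # observed agreement
    p_e = np.square(p_j).sum()                            # chance agreement
    return (p_bar - p_e) / (1 - p_e)

# 4 subjects, 5 raters, 3 categories (rows sum to the number of raters)
counts = np.array([[5, 0, 0],
                   [3, 2, 0],
                   [1, 1, 3],
                   [0, 4, 1]])
print(fleiss_kappa(counts))
```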
Peer reviewed
Liz Jackson; Michael W. Apple; Fei Yan; Jason Cong Lin; Chenxi Jiang; Tongzhou Li; Edward Vickers – Educational Philosophy and Theory, 2024
In this collective essay the authors consider the nature and consequences of reading and researching across difference in an international and intergenerational team, whose core members are focused on understanding how curriculum operates and the nature of textbook representation of diversity in Mainland China, Hong Kong, Taiwan, and Macau.…
Descriptors: Foreign Countries, Textbooks, Reading Research, Educational Research
Peer reviewed
Full text available on ERIC (PDF)
McCaffrey, Daniel F.; Casabianca, Jodi M.; Ricker-Pedley, Kathryn L.; Lawless, René R.; Wendler, Cathy – ETS Research Report Series, 2022
This document describes a set of best practices for developing, implementing, and maintaining the critical process of scoring constructed-response tasks. These practices address both the use of human raters and automated scoring systems as part of the scoring process and cover the scoring of written, spoken, performance, or multimodal responses.…
Descriptors: Best Practices, Scoring, Test Format, Computer Assisted Testing
Peer reviewed
Janice Kinghorn; Katherine McGuire; Bethany L. Miller; Aaron Zimmerman – Assessment Update, 2024
In this article, the authors share their reflections on how different experiences and paradigms have broadened their understanding of the work of assessment in higher education. As they collaborated to create a panel for the 2024 International Conference on Assessing Quality in Higher Education, they recognized that they, as assessment…
Descriptors: Higher Education, Assessment Literacy, Evaluation Criteria, Evaluation Methods
Peer reviewed
Roessger, Kevin M. – Adult Learning, 2020
Practitioners often struggle to assess reflective learning in the workplace because of difficulties conceptualizing reflection and its effects in the workplace. This article addresses this problem by offering a pragmatic approach to assessment that asks practitioners to specify why they are using reflection, what they are hoping to gain from it,…
Descriptors: Workplace Learning, Evaluation Methods, Reflection, Adult Education
Peer reviewed
Seedhouse, Paul; Satar, Müge – Classroom Discourse, 2023
The same L2 speaking performance may be analysed and evaluated in very different ways by different teachers or raters. We present a new, technology-assisted research design which opens up to investigation the trajectories of convergence and divergence between raters. We tracked and recorded what different raters noticed when, whilst grading a…
Descriptors: Language Tests, English (Second Language), Second Language Learning, Oral Language
Peer reviewed
Wesolowski, Brian C.; Wind, Stefanie A. – Journal of Educational Measurement, 2019
Rater-mediated assessments are a common methodology for measuring persons, investigating rater behavior, and/or defining latent constructs. The purpose of this article is to provide a pedagogical framework for examining rater variability in the context of rater-mediated assessments using three distinct models. The first model is the observation…
Descriptors: Interrater Reliability, Models, Observation, Measurement
Peer reviewed
Lane, Suzanne – Journal of Educational Measurement, 2019
Rater-mediated assessments require the evaluation of the accuracy and consistency of the inferences made by the raters to ensure the validity of score interpretations and uses. Modeling rater response processes allows for a better understanding of how raters map their representations of the examinee performance to their representation of the…
Descriptors: Responses, Accuracy, Validity, Interrater Reliability
Peer reviewed
Leighton, Jacqueline P.; Lehman, Blair – Educational Measurement: Issues and Practice, 2020
In this digital ITEMS module, Dr. Jacqueline Leighton and Dr. Blair Lehman review differences between think-aloud interviews to measure problem-solving processes and cognitive labs to measure comprehension processes. Learners are introduced to historical, theoretical, and procedural differences between these methods and how to use and analyze…
Descriptors: Protocol Analysis, Interviews, Problem Solving, Cognitive Processes
Lichtenstein, Robert – Communique, 2020
Appropriate interpretation of assessment data requires an appreciation that tools are subject to measurement error. School psychologists recognize, at least on an intellectual level, that measures are imperfect--that test scores and other quantitative measures (e.g., rating scales, systematic behavioral observations) are best estimates of…
Descriptors: Error of Measurement, Test Reliability, Pretests Posttests, Standardized Tests
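As a hedged illustration of the measurement-error point in this record, the sketch below converts a test's reliability into a standard error of measurement and an approximate 95% confidence band around an observed score; the SD and reliability values are invented for the example.

```python
# Standard error of measurement and an approximate confidence band; toy values.
import math

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

def confidence_band(score: float, sd: float, reliability: float, z: float = 1.96):
    """Approximate 95% band around an observed standardized score."""
    margin = z * sem(sd, reliability)
    return score - margin, score + margin

# A standard score of 110 on a test with SD 15 and reliability .90
print(confidence_band(110, 15, 0.90))   # roughly (100.7, 119.3)
```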
Peer reviewed
Bieda, Kristen N.; Salloum, Serena J.; Hu, Sihua; Sweeny, Shannon; Lane, John; Torphy, Kaitlin – Journal of Classroom Interaction, 2020
This paper discusses the challenges and lessons learned from conducting observations to measure the quality of classroom practice for a large-scale study of elementary teachers' mathematics instruction. Specifically, this paper shares our process for obtaining valid data for quality of elementary mathematics instruction; what we learned can inform…
Descriptors: Mathematics Instruction, Classroom Observation Techniques, Elementary School Teachers, Interrater Reliability
Peer reviewed
Full text available on ERIC (PDF)
Regional Educational Laboratory Southeast, 2020
Teachers need to assess their students' current level of mathematical understanding to provide appropriate interventions for students who are struggling. Several school districts in Georgia currently use two assessments for this purpose--the Global Strategy Stage (GloSS) and the Individual Knowledge Assessment of Number (IKAN). The IKAN is…
Descriptors: Mathematics Tests, Diagnostic Tests, Test Reliability, Test Validity