ERIC - Search Results

Publication Date

In 2026	0
Since 2025	1
Since 2022 (last 5 years)	8
Since 2017 (last 10 years)	20
Since 2007 (last 20 years)	61

Descriptor

Comparative Analysis	65
Correlation	65
Interrater Reliability	65
Foreign Countries	26
Statistical Analysis	19
Scores	17
Evaluators	12
Scoring	12
Rating Scales	11
Second Language Learning	11
Student Evaluation	10
Validity	10
Evaluation Methods	9
Writing Evaluation	9
Accuracy	8
Measures (Individuals)	8
Second Language Instruction	8
Computer Software	7
English (Second Language)	7
Essays	7
Interpersonal Communication	7
Questionnaires	7
Autism	6
Computer Assisted Testing	6
Elementary School Students	6

Publication Type

Journal Articles	62
Reports - Research	54
Reports - Evaluative	8
Tests/Questionnaires	7
Information Analyses	2
Collected Works - Proceedings	1
Dissertations/Theses -…	1
Numerical/Quantitative Data	1

Education Level

Higher Education	18
Postsecondary Education	17
Secondary Education	7
Elementary Education	4
Elementary Secondary Education	4
High Schools	4
Early Childhood Education	2
Grade 11	2
Grade 4	2
Intermediate Grades	2
Middle Schools	2
Preschool Education	2
Grade 1	1
Grade 10	1
Grade 5	1
Grade 6	1
Grade 7	1
Grade 8	1
Junior High Schools	1

Location

Netherlands	5
Asia	2
Canada	2
China	2
Estonia	2
Florida	2
Greece	2
Hong Kong	2
Japan	2
Pennsylvania	2
Philippines	2
Singapore	2
Spain	2
Turkey	2
United Kingdom	2
United Kingdom (England)	2
Australia	1
Brazil	1
California	1
Canada (Toronto)	1
Connecticut	1
Denmark	1
Egypt	1
Georgia	1
Germany	1

Assessments and Surveys

Autism Diagnostic Observation…	2
Dynamic Indicators of Basic…	1
Georgia Criterion Referenced…	1
Kaufman Brief Intelligence…	1
Multifactor Leadership…	1
National Assessment of…	1
Praxis Series	1
SAT (College Admission Test)	1
Wechsler Intelligence Scale…	1

Showing 1 to 15 of 65 results

Graders of the Future: Comparing the Consistency and Accuracy of GPT4 and Pre-Service Teachers in Physics Essay Question Assessments

Peer reviewed

Yubin Xu; Lin Liu; Jianwen Xiong; Guangtian Zhu – Journal of Baltic Science Education, 2025

As the development and application of large language models (LLMs) in physics education progress, the well-known AI-based chatbot ChatGPT4 has presented numerous opportunities for educational assessment. Investigating the potential of AI tools in practical educational assessment carries profound significance. This study explored the comparative…

Descriptors: Physics, Artificial Intelligence, Computer Software, Accuracy

Rater Connections and the Detection of Bias in Performance Assessment

Peer reviewed

Wind, Stefanie A. – Measurement: Interdisciplinary Research and Perspectives, 2022

In many performance assessments, one or two raters from the complete rater pool scores each performance, resulting in a sparse rating design, where there are limited observations of each rater relative to the complete sample of students. Although sparse rating designs can be constructed to facilitate estimation of student achievement, the…

Descriptors: Evaluators, Bias, Identification, Performance Based Assessment

Meta-Analysis of Inter-Rater Agreement and Discrepancy Between Human and Automated English Essay Scoring

Peer reviewed

Jiyeo Yun – English Teaching, 2023

Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…

Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring

Automated Assessment of Second Language Comprehensibility: Review, Training, Validation, and Generalization Studies

Peer reviewed

Saito, Kazuya; Macmillan, Konstantinos; Kachlicka, Magdalena; Kunihara, Takuya; Minematsu, Nobuaki – Studies in Second Language Acquisition, 2023

Whereas many scholars have emphasized the relative importance of "comprehensibility" as an ecologically valid goal for L2 speech training, testing, and development, eliciting listeners' judgments is time-consuming. Following calls for research on more efficient L2 speech rating methods in applied linguistics, and growing attention toward…

Descriptors: Second Language Learning, Second Language Instruction, Interrater Reliability, Speech Communication

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

Examining the Interrater Reliability between Self- and Teacher Assessment of Students' Oral Performances

Peer reviewed

Manzano, Dexter L. – International Journal of Language Testing, 2022

The increasing popularity of self-assessment prompted several scholars to investigate its effectiveness and accuracy in relation to teacher assessment. However, most of these studies focused only on the consistency estimate perspective. Thus, the current study investigated the interrater reliability between self- and teacher assessment of…

Descriptors: Oral Language, Self Evaluation (Individuals), College Students, Interrater Reliability

Students' Use of Formalisations for Improved Logical Reasoning

Peer reviewed

Bronkhorst, Hugo; Roorda, Gerrit; Suhre, Cor; Goedhart, Martin – Research in Mathematics Education, 2022

Logical reasoning as part of critical thinking is becoming more and more important to prepare students for their future life in society, work, and study. This article presents the results of a quasi-experimental study with a pre-test-post-test control group design focusing on the effective use of formalisations to support logical reasoning. The…

Descriptors: Mathematics Instruction, Teaching Methods, Logical Thinking, Critical Thinking

Using Clinical Simulation to Assess MSW Students' Engagement Skills

Peer reviewed

Sacristan, Dolly; Martinez, Colleen D. – Journal of Teaching in Social Work, 2023

Social work educators are compelled to use reliable and valid methods to assess student learning outcomes. This study adapted a clinical simulation by integrating traditional role-play of case scenarios and elements of the Objective Structured Clinical Examination, which is often used to assess students' practice skills. Master of Social Work…

Descriptors: Graduate Students, Counselor Training, Masters Programs, Clinical Experience

Exploring Differences in Measurement and Reporting of Classroom Observation Inter-Rater Reliability

Peer reviewed

Wilhelm, Anne Garrison; Gillespie Rouse, Amy; Jones, Francesca – Practical Assessment, Research & Evaluation, 2018

Although inter-rater reliability is an important aspect of using observational instruments, it has received little theoretical attention. In this article, we offer some guidance for practitioners and consumers of classroom observations so that they can make decisions about inter-rater reliability, both for study design and in the reporting of data…

Descriptors: Interrater Reliability, Measurement, Observation, Educational Research

Assessing Language in Unstructured Conversation in People with Aphasia: Methods, Psychometric Integrity, Normative Data, and Comparison to a Structured Narrative Task

Peer reviewed

Leaman, Marion C.; Edmonds, Lisa A. – Journal of Speech, Language, and Hearing Research, 2021

Purpose: This study evaluated interrater reliability (IRR) and test-retest stability (TRTS) of seven linguistic measures (percent correct information units, relevance, subject-verb-[object], complete utterance, grammaticality, referential cohesion, global coherence), and communicative success in unstructured conversation and in a story narrative…

Descriptors: Aphasia, Psychometrics, Correlation, Speech Language Pathology

Estimating Hazard Ratios from Published Kaplan-Meier Survival Curves: A Methods Validation Study

Peer reviewed

Saluja, Ronak; Cheng, Sierra; delos Santos, Keemo Althea; Chan, Kelvin K. W. – Research Synthesis Methods, 2019

Objective: Various statistical methods have been developed to estimate hazard ratios (HRs) from published Kaplan-Meier (KM) curves for the purpose of performing meta-analyses. The objective of this study was to determine the reliability, accuracy, and precision of four commonly used methods by Guyot, Williamson, Parmar, and Hoyle and Henley.…

Descriptors: Meta Analysis, Reliability, Accuracy, Randomized Controlled Trials

Effect of Quality Characteristics of Peer Raters on Rating Errors in Peer Assessment

Peer reviewed

Guo, Xiuyan; Lei, Pui-Wa – International Journal of Testing, 2020

Little research has been done on the effects of peer raters' quality characteristics on peer rating qualities. This study aims to address this gap and investigate the effects of key variables related to peer raters' qualities, including content knowledge, previous rating experience, training on rating tasks, and rating motivation. In an experiment…

Descriptors: Peer Evaluation, Error Patterns, Correlation, Knowledge Level

Using Subjective and Objective Measures to Predict Level of Reading Fluency at the End of First Grade

Peer reviewed

Morris, Darrell; Pennell, Ashley M.; Perney, Jan; Trathen, Woodrow – Reading Psychology, 2018

This study compared reading rate to reading fluency (as measured by a rating scale). After listening to first graders read short passages, we assigned an overall fluency rating (low, average, or high) to each reading. We then used predictive discriminant analyses to determine which of five measures--accuracy, rate (objective); accuracy, phrasing,…

Descriptors: Reading Fluency, Prediction, Grade 1, Elementary School Students

The Impact of Rater Variability on Relationships among Different Effect-Size Indices for Inter-Rater Agreement between Human and Automated Essay Scoring

Yun, Jiyeo – ProQuest LLC, 2017

Since researchers investigated automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring, and then have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…

Descriptors: Interrater Reliability, Essays, Scoring, Evaluators

Validity and Reliability of Student Perceptions of Teaching Quality in Primary Education

Peer reviewed

van der Scheer, Emmelien A.; Bijlsma, Hannah J. E.; Glas, Cees A. W. – School Effectiveness and School Improvement, 2019

A Bayesian IRT-model approach was used to investigate the validity and reliability of student perceptions of teaching quality. Furthermore, the student perceptions were compared with ratings of teaching quality by external observers. Grade 4 students (n = 675) filled out a questionnaire that was used to measure their opinions about the lessons of…

Descriptors: Student Attitudes, Validity, Interrater Reliability, Correlation

Source

ETS Research Report Series	3
Advances in Health Sciences…	2
Creativity Research Journal	2
Journal of Speech, Language,…	2
Research Synthesis Methods	2
Active Learning in Higher…	1
Advances in Physiology…	1
American Journal of…	1
Applied Measurement in…	1
Autism: The International…	1
Canadian Journal of Learning…	1
Canadian Modern Language…	1
Child Care in Practice	1
College & Research Libraries	1
Contributions to Music…	1
Developmental Psychology	1
Early Child Development and…	1
Early Years: An International…	1
Educational Management…	1
Educational Research and…	1
Educational Research and…	1
Educational Review	1
Educational Sciences: Theory…	1
Educational and Psychological…	1
English Teaching	1

Author

Coniam, David	3
Abrami, Philip C.	1
Alsma, Jelmer	1
Amanda Huee-Ping Wong	1
Ames, Catherine	1
Arnold, Mariah	1
Ash, Ivan K.	1
Attali, Yigal	1
Baker-Henningham, Helen	1
Barclay, Alexandra	1
Beare, Paul	1
Bene, Edina	1
Bermúdez, María Olga Escandell	1
Bijlsma, Hannah J. E.	1
Bilginer, Hayriye	1
Bolaños, Daniel	1
Bolton, Patrick	1
Brady, Nancy C.	1
Breyer, F. Jay	1
Bronkhorst, Hugo	1
Browne, Dillon T.	1
Bruce L. Sherin	1
Bures, Eva Mary	1
Chan, Kelvin K. W.	1