ERIC - Search Results

Showing 1 to 15 of 57 results

Implicit versus Explicit First Impressions in Performance-Based Assessment: Will Raters Overcome Their First Impressions When Learner Performance Changes?

Peer reviewed

Timothy J. Wood; Vijay J. Daniels; Debra Pugh; Claire Touchie; Samantha Halman; Susan Humphrey-Murto – Advances in Health Sciences Education, 2024

First impressions can influence rater-based judgments but their contribution to rater bias is unclear. Research suggests raters can overcome first impressions in experimental exam contexts with explicit first impressions, but these findings may not generalize to a workplace context with implicit first impressions. The study had two aims. First, to…

Descriptors: Evaluators, Work Environment, Decision Making, Video Technology

Grading the Graders: Comparing Generative AI and Human Assessment in Essay Evaluation

Peer reviewed

Elizabeth L. Wetzler; Kenneth S. Cassidy; Margaret J. Jones; Chelsea R. Frazier; Nickalous A. Korbut; Chelsea M. Sims; Shari S. Bowen; Michael Wood – Teaching of Psychology, 2025

Background: Generative artificial intelligence (AI) represents a potentially powerful, time-saving tool for grading student essays. However, little is known about how AI-generated essay scores compare to human instructor scores. Objective: The purpose of this study was to compare the essay grading scores produced by AI with those of human…

Descriptors: Essays, Writing Evaluation, Scores, Evaluators

The Relationship between Kenexa Scores and the Texas Teacher Evaluation System

Christopher D. Daniel – ProQuest LLC, 2024

Districts spend thousands of dollars on computerized teacher screeners without knowing if they are identifying the most effective teacher. Hiring quality staff is one of the most important job functions of a principal, and many times a teacher screener score may eliminate an effective teacher. The current study examined the value of teacher…

Descriptors: Teacher Evaluation, Scores, Screening Tests, Teacher Effectiveness

Effects of Using Double Ratings as Item Scores on IRT Proficiency Estimation

Peer reviewed

Song, Yoon Ah; Lee, Won-Chan – Applied Measurement in Education, 2022

This article presents the performance of item response theory (IRT) models when double ratings are used as item scores over single ratings when rater effects are present. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation in the generalized partial credit model (GPCM). Study 2 compared the accuracy of…

Descriptors: Item Response Theory, Item Analysis, Scores, Accuracy

Graders of the Future: Comparing the Consistency and Accuracy of GPT4 and Pre-Service Teachers in Physics Essay Question Assessments

Peer reviewed

Yubin Xu; Lin Liu; Jianwen Xiong; Guangtian Zhu – Journal of Baltic Science Education, 2025

As the development and application of large language models (LLMs) in physics education progress, the well-known AI-based chatbot ChatGPT4 has presented numerous opportunities for educational assessment. Investigating the potential of AI tools in practical educational assessment carries profound significance. This study explored the comparative…

Descriptors: Physics, Artificial Intelligence, Computer Software, Accuracy

More Efficient Processes for Creating Automated Essay Scoring Frameworks: A Demonstration of Two Algorithms

Peer reviewed

Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021

Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…

Descriptors: Scoring, Essays, Writing Evaluation, Computer Software

Automated Feedback Generation for Student Project Reports: A Data-Driven Approach

Peer reviewed

Jia, Qinjin; Young, Mitchell; Xiao, Yunkai; Cui, Jialin; Liu, Chengyuan; Rashid, Parvez; Gehringer, Edward – Journal of Educational Data Mining, 2022

Instant feedback plays a vital role in promoting academic achievement and student success. In practice, however, delivering timely feedback to students can be challenging for instructors for a variety of reasons (e.g., limited teaching resources). In many cases, feedback arrives too late for learners to act on the advice and reinforce their…

Descriptors: Student Projects, Learning Analytics, Intelligent Tutoring Systems, Feedback (Response)

Crowdsourced Adaptive Comparative Judgment: A Community-Based Solution for Proficiency Rating

Peer reviewed

Paquot, Magali; Rubin, Rachel; Vandeweerd, Nathan – Language Learning, 2022

The main objective of this Methods Showcase Article is to show how the technique of adaptive comparative judgment, coupled with a crowdsourcing approach, can offer practical solutions to reliability issues as well as to address the time and cost difficulties associated with a text-based approach to proficiency assessment in L2 research. We…

Descriptors: Comparative Analysis, Decision Making, Language Proficiency, Reliability

Comprehensible to Whom? Examining Rater, Speaker, and Interlocutor Perspectives on Comprehensibility in an Interactive Context

Peer reviewed

Nagle, Charlie L.; Trofimovich, Pavel; O'Brien, Mary Grantham; Kennedy, Sara – Modern Language Journal, 2022

Comprehensibility has emerged as a useful and intuitive means of globally evaluating second language (L2) speakers in many research and instructional contexts. In most cases, L2 speakers' comprehensibility is assessed by external listeners who do not engage in extensive communication with the speakers, even though the degree to which a speaker is…

Descriptors: Evaluators, Intelligibility, Pronunciation, Task Analysis

A Model-Data-Fit-Informed Approach to Score Resolution in Performance Assessments

Peer reviewed

Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021

Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…

Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making

Mitigating Gender and L1 Biases in Automated English Speaking Assessment

Alexander James Kwako – ProQuest LLC, 2023

Automated assessment using Natural Language Processing (NLP) has the potential to make English speaking assessments more reliable, authentic, and accessible. Yet without careful examination, NLP may exacerbate social prejudices based on gender or native language (L1). Current NLP-based assessments are prone to such biases, yet research and…

Descriptors: Gender Bias, Natural Language Processing, Native Language, Computational Linguistics

Assessing Second-Language Academic Writing: AI vs. Human Raters

Peer reviewed

Vasfiye Geçkin; Ebru Kiziltas; Çagatay Çinar – Journal of Educational Technology and Online Learning, 2023

The quality of writing in a second language (L2) is one of the indicators of the level of proficiency for many college students to be eligible for departmental studies. Although certain software programs, such as Intelligent Essay Assessor or IntelliMetric, have been introduced to evaluate second-language writing quality, an overall assessment of…

Descriptors: Writing Evaluation, Second Language Learning, Second Language Instruction, Language Proficiency

Automated Speech Scoring of Dialogue Response by Japanese Learners of English as a Foreign Language

Peer reviewed

Yuko Hayashi; Yusuke Kondo; Yutaka Ishii – Innovation in Language Learning and Teaching, 2024

Purpose: This study builds a new system for automatically assessing learners' speech elicited from an oral discourse completion task (DCT), and evaluates the prediction capability of the system with a view to better understanding factors deemed influential in predicting speaking proficiency scores and the pedagogical implications of the system.…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Japanese

The Dual Personality of 'Topic' in the IELTS Speaking Test

Peer reviewed

Seedhouse, Paul – ELT Journal, 2019

This article investigates the central role of topic in the IELTS Speaking Test (IST). Topic has developed a dual personality in this interactional setting: topic-as-script is the scripted statement of topic on the examiner's cards prior to the interaction, whereas topic-as-action is how topic is developed by the candidate during the course of the…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Personality Traits

Comparatively Salient: Examining the Influence of Preceding Performances on Assessors' Focus and Interpretations in Written Assessment Comments

Peer reviewed

Gingerich, Andrea; Schokking, Edward; Yeates, Peter – Advances in Health Sciences Education, 2018

Recent literature places more emphasis on assessment comments rather than relying solely on scores. Both are variable, however, emanating from assessment judgements. One established source of variability is "contrast effects": scores are shifted away from the depicted level of competence in a preceding encounter. The shift could arise…

Descriptors: Evaluation Methods, Scores, Intervention, Schemata (Cognition)
