ERIC - Search Results

Publication Date

In 2026	0
Since 2025	1
Since 2022 (last 5 years)	9
Since 2017 (last 10 years)	15
Since 2007 (last 20 years)	22

Descriptor

Computer Software	22
Correlation	22
Evaluators	22
Second Language Learning	14
English (Second Language)	11
Scoring	10
Comparative Analysis	9
Writing Evaluation	9
Computational Linguistics	8
Foreign Countries	8
Language Tests	8
Artificial Intelligence	7
Second Language Instruction	7
Computer Assisted Testing	6
Essays	6
Interrater Reliability	6
Scores	6
Undergraduate Students	6
Speech Communication	4
Writing Tests	4
Accuracy	3
Audio Equipment	3
Evaluation Criteria	3
Evaluation Methods	3
Grading	3

Source

ETS Research Report Series	3
Grantee Submission	2
ProQuest LLC	2
Advances in Physiology…	1
Computer Assisted Language…	1
Creativity Research Journal	1
Education and Information…	1
International Educational…	1
Journal of Baltic Science…	1
Journal of Educational…	1
Journal of Speech, Language,…	1
Language Learning & Technology	1
Language Teaching Research…	1
Language Testing	1
Psychological Assessment	1
Research-publishing.net	1
Studies in Second Language…	1
TESOL Quarterly: A Journal…	1

Publication Type

Reports - Research	18
Journal Articles	16
Speeches/Meeting Papers	3
Tests/Questionnaires	3
Dissertations/Theses -…	2
Reports - Evaluative	2

Education Level

Higher Education	8
Postsecondary Education	7
Elementary Education	1
High Schools	1
Secondary Education	1

Location

China	2
Canada	1
Japan	1
Singapore	1
Turkey	1
United Kingdom (London)	1

Assessments and Surveys

Test of English as a Foreign…	3
Big Five Inventory	1
Gates MacGinitie Reading Tests	1
International English…	1
Torrance Tests of Creative…	1

Showing 1 to 15 of 22 results

Graders of the Future: Comparing the Consistency and Accuracy of GPT4 and Pre-Service Teachers in Physics Essay Question Assessments

Peer reviewed
PDF on ERIC

Yubin Xu; Lin Liu; Jianwen Xiong; Guangtian Zhu – Journal of Baltic Science Education, 2025

As the development and application of large language models (LLMs) in physics education progress, the well-known AI-based chatbot ChatGPT4 has presented numerous opportunities for educational assessment. Investigating the potential of AI tools in practical educational assessment carries profound significance. This study explored the comparative…

Descriptors: Physics, Artificial Intelligence, Computer Software, Accuracy

The Intersection of AI and Language Assessment: A Study on the Reliability of ChatGPT in Grading IELTS Writing Task 2

Peer reviewed
PDF on ERIC

Osama Koraishi – Language Teaching Research Quarterly, 2024

This study conducts a comprehensive quantitative evaluation of OpenAI's language model, ChatGPT 4, for grading Task 2 writing of the IELTS exam. The objective is to assess the alignment between ChatGPT's grading and that of official human raters. The analysis encompassed a multifaceted approach, including a comparison of means and reliability…

Descriptors: Second Language Learning, English (Second Language), Language Tests, Artificial Intelligence

Evaluating Creativity: How Idea Context and Rater Personality Affect Considerations of Novelty and Usefulness

Peer reviewed

Lloyd-Cox, James; Pickering, Alan; Bhattacharya, Joydeep – Creativity Research Journal, 2022

According to the standard definition, creative ideas must be both novel and useful. While a handful of recent studies suggest that novelty is more important than usefulness to evaluations of creativity, little is known about the contextual and interpersonal factors that affect how people weigh these two components when making an overall creativity…

Descriptors: Creativity, Personality Traits, Decision Making, Evaluators

Measuring Original Thinking in Elementary School: Development and Validation of a Computational Psychometric Approach

Peer reviewed

Selcuk Acar; Denis Dumas; Peter Organisciak; Kelly Berthiaume – Grantee Submission, 2024

Creativity is highly valued in both education and the workforce, but assessing and developing creativity can be difficult without psychometrically robust and affordable tools. The open-ended nature of creativity assessments has made them difficult to score, expensive, often imprecise, and therefore impractical for school- or district-wide use. To…

Descriptors: Thinking Skills, Elementary School Students, Artificial Intelligence, Measurement Techniques

Assessing Second-Language Academic Writing: AI vs. Human Raters

Peer reviewed
PDF on ERIC

Vasfiye Geçkin; Ebru Kiziltas; Çagatay Çinar – Journal of Educational Technology and Online Learning, 2023

The quality of writing in a second language (L2) is one of the indicators of the level of proficiency for many college students to be eligible for departmental studies. Although certain software programs, such as Intelligent Essay Assessor or IntelliMetric, have been introduced to evaluate second-language writing quality, an overall assessment of…

Descriptors: Writing Evaluation, Second Language Learning, Second Language Instruction, Language Proficiency

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

Perceptual and Acoustic Assessment of Strain Using Synthetically Modified Voice Samples

Peer reviewed

Park, Yeonggwang; Cádiz, Manuel Díaz; Nagle, Kathleen F.; Stepp, Cara E. – Journal of Speech, Language, and Hearing Research, 2020

Purpose: Assessment of strained voice quality is difficult due to the weak reliability of auditory-perceptual evaluation and lack of strong acoustic correlates. This study evaluated the contributions of relative fundamental frequency (RFF) and mid-to-high frequency noise to the perception of strain. Method: Stimuli were created using recordings of…

Descriptors: Acoustics, Audio Equipment, Auditory Perception, Correlation

Interactional Features of L2 Pragmatic Interaction in Role-Play Speaking Assessment

Peer reviewed

Youn, Soo Jung – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2020

This study explicates the nature of second language (L2) pragmatic interaction focusing on the quantitative function of interactional features. A relationship between the fine-grained interactional features elicited from learners' role-play performances at varying levels and trained raters' scores was investigated. The corpus of 102 learners'…

Descriptors: Role Playing, Pragmatics, Speech Communication, Computational Linguistics

Using Google Voice Typing to Automatically Assess Pronunciation

Peer reviewed
PDF on ERIC

Johnson, Carol; Cardoso, Walcir; Zuercher, Beau; Brannen, Kathleen; Springer, Suzanne – Research-publishing.net, 2022

This study examined the use of a popular Automatic Speech Recognition (ASR), Google Voice Typing (GVT), to automatically assess English as second language pronunciation. It aimed to answer the following question: What is the relationship between GVT-rated scores and human-rated scores? To answer this question, we compared audio recordings of 56…

Descriptors: Teaching Methods, Computer Software, Pronunciation, Second Language Learning

Can Automated Machine Translation Evaluation Metrics Be Used to Assess Students' Interpretation in the Language Learning Classroom?

Peer reviewed

Han, Chao; Lu, Xiaolei – Computer Assisted Language Learning, 2023

The use of translation and interpreting (T&I) in the language learning classroom is commonplace, serving various pedagogical and assessment purposes. Previous utilization of T&I exercises is driven largely by their potential to enhance language learning, whereas the latest trend has begun to underscore T&I as a crucial skill to be…

Descriptors: Translation, Computational Linguistics, Correlation, Language Processing

The Use of Semantic Similarity Tools in Automated Content Scoring of Fact-Based Essays Written by EFL Learners

Peer reviewed

Wang, Qiao – Education and Information Technologies, 2022

This study searched for open-source semantic similarity tools and evaluated their effectiveness in automated content scoring of fact-based essays written by English-as-a-Foreign-Language (EFL) learners. Fifty writing samples under a fact-based writing task from an academic English course in a Japanese university were collected and a gold standard…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring

Grading Emails and Generating Feedback

Peer reviewed
PDF on ERIC

Unnam, Abhishek; Takhar, Rohit; Aggarwal, Varun – International Educational Data Mining Society, 2019

Email has become the most preferred form of business communication. Writing "good" email has become an essential skill required in the industry. "Good" email writing not only facilitates clear communication, but also makes a positive impression on the recipient, whether it be one's colleague or a customer. The aim of this paper…

Descriptors: Grading, Electronic Mail, Feedback (Response), Written Language

The Impact of Rater Variability on Relationships among Different Effect-Size Indices for Inter-Rater Agreement between Human and Automated Essay Scoring

Yun, Jiyeo – ProQuest LLC, 2017

Since researchers investigated automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring, and then have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…

Descriptors: Interrater Reliability, Essays, Scoring, Evaluators

Toward a Dynamic View of Second Language Comprehensibility

Peer reviewed

Nagle, Charles; Trofimovich, Pavel; Bergeron, Annie – Studies in Second Language Acquisition, 2019

This study took a dynamic approach to second language (L2) comprehensibility, examining how listeners construct comprehensibility profiles for L2 Spanish speakers during the listening task and what features enhance or diminish comprehensibility. Listeners were 24 native Spanish speakers who evaluated 2-5 minute audio clips recorded by three…

Descriptors: Second Language Learning, Profiles, Spanish, Listening Comprehension

Exploring Relationships between Automated and Human Evaluations of L2 Texts

Peer reviewed

Matthews, Joshua; Wijeyewardene, Ingrid – Language Learning & Technology, 2018

Despite the current potential to use computers to automatically generate a large range of text-based indices, many issues remain unresolved about how to apply these data in established language teaching and assessment contexts. One way to resolve these issues is to explore the degree to which automatically generated indices, which are reflective…

Descriptors: Correlation, Robotics, Second Language Learning, Second Language Instruction

Author

Bridgeman, Brent	2
Aggarwal, Varun	1
Amanda Huee-Ping Wong	1
Bantum, Erin O'Carroll	1
Bergeron, Annie	1
Bhattacharya, Joydeep	1
Brannen, Kathleen	1
Breyer, F. Jay	1
Cardoso, Walcir	1
Cox, Troy L.	1
Crossley, Scott A.	1
Cádiz, Manuel Díaz	1
Davey, Tim	1
Denis Dumas	1
Ebru Kiziltas	1
Gentile, Claudia	1
Guangtian Zhu	1
Han, Chao	1
Ivan Cherh Chiet Low	1
Jianwen Xiong	1
Johnson, Carol	1
Kantor, Robert	1
Kelly Berthiaume	1
Lee, Yong-Won	1
Lin Liu	1