Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025
This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics
Taichi Yamashita – Language Testing, 2025
With the rapid development of generative artificial intelligence (AI) frameworks (e.g., the generative pre-trained transformer [GPT]), a growing number of researchers have started to explore its potential as an automated essay scoring (AES) system. While previous studies have investigated the alignment between human ratings and GPT ratings, few…
Descriptors: Artificial Intelligence, English (Second Language), Second Language Learning, Second Language Instruction
Jiyeo Yun – English Teaching, 2023
Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…
Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring
Lian Li; Jiehui Hu; Yu Dai; Ping Zhou; Wanhong Zhang – Reading & Writing Quarterly, 2024
This paper proposes to use depth perception to represent raters' decisions in the holistic evaluation of ESL essays, as an alternative medium to the conventional form of numerical scores. The researchers verified the new method's accuracy and inter/intra-rater reliability by inviting 24 ESL teachers to perform different representations when rating 60…
Descriptors: Essays, Holistic Approach, Writing Evaluation, Accuracy
Ait Hammou, Brahim; Larouz, Mohammed; Fagroud, Mustapha; Akki, Fouad – Canadian Journal of Applied Linguistics / Revue canadienne de linguistique appliquée, 2023
This study aims to examine the relationship between the productive knowledge of some lexical and phraseological indices and the quality of English as a Foreign Language (EFL) learners' writing. A sample of 120 expository essays, written by semesters 1 and 5 university students in a less proficient EFL context, are rated by human evaluators and…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Writing Instruction
Ahmet Can Uyar; Dilek Büyükahiska – International Journal of Assessment Tools in Education, 2025
This study explores the effectiveness of using ChatGPT, an Artificial Intelligence (AI) language model, as an Automated Essay Scoring (AES) tool for grading English as a Foreign Language (EFL) learners' essays. The corpus consists of 50 essays representing various types including analysis, compare and contrast, descriptive, narrative, and opinion…
Descriptors: Artificial Intelligence, Computer Software, Technology Uses in Education, Teaching Methods
Li, Wentao – Reading and Writing: An Interdisciplinary Journal, 2022
Scoring rubrics are known to be effective for assessing writing for both testing and classroom teaching purposes. How raters interpret the descriptors in a rubric can significantly impact the subsequent final score, and further, the descriptors may also color a rater's judgment of a student's writing quality. Little is known, however, about how…
Descriptors: Scoring Rubrics, Interrater Reliability, Writing Evaluation, Teaching Methods
Vasfiye Geçkin; Ebru Kiziltas; Çagatay Çinar – Journal of Educational Technology and Online Learning, 2023
The quality of writing in a second language (L2) is one of the indicators of the level of proficiency for many college students to be eligible for departmental studies. Although certain software programs, such as Intelligent Essay Assessor or IntelliMetric, have been introduced to evaluate second-language writing quality, an overall assessment of…
Descriptors: Writing Evaluation, Second Language Learning, Second Language Instruction, Language Proficiency
Apichat Khamboonruang – PASAA: Journal of Language Teaching and Learning in Thailand, 2023
Differential rater severity (DRS), one prevalent case of differential rater functioning (aka rater bias or rater interaction) effects, manifests itself when a rater assigns unusually severe or lenient ratings, threatening the validity and fairness of rater-mediated assessment. Building on a many-facets Rasch measurement (MFRM) approach, this study…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring Rubrics
Heidari, Nasim; Ghanbari, Nasim; Abbasi, Abbas – Language Testing in Asia, 2022
It is widely believed that human rating performance is influenced by an array of different factors. Among these, rater-related variables such as experience, language background, perceptions, and attitudes have been mentioned. One of the important rater-related factors is the way the raters interact with the rating scales. In particular, how raters…
Descriptors: Evaluators, Rating Scales, Language Tests, English (Second Language)
Xiaoling Bai; Nur Rasyidah Mohd Nordin – Eurasian Journal of Applied Linguistics, 2025
A perfect writing skill has been deemed instrumental to achieving competence in EFL, yet it is considered one of the most impressive learning domains. This study investigates the impact of human-AI collaborative feedback on the writing proficiency of EFL students. It examines key teaching domains, including the teaching environment, teacher…
Descriptors: Artificial Intelligence, Feedback (Response), Evaluators, Writing Skills
Junifer Leal Bucol; Napattanissa Sangkawong – Innovations in Education and Teaching International, 2025
This research paper employs an exploratory framework to evaluate the potential of ChatGPT as an Automated Writing Evaluation (AWE) tool in teaching English as a Foreign Language (EFL) in Thailand. The main objective is to investigate how well ChatGPT can assess students' writing using prompts and pre-defined rubrics compared to human raters.…
Descriptors: Artificial Intelligence, Computer Software, Teaching Methods, English (Second Language)
Reading Matrix: An International Online Journal, 2025
This systematic review examines 22 studies (2024-2025) on the use of generative AI, primarily ChatGPT, for providing feedback in English writing instruction for language learners. It identifies the types of feedback AI offers, its effectiveness relative to teacher and peer feedback, and perceptions from students and teachers. Findings show AI…
Descriptors: Writing Instruction, Teaching Methods, English (Second Language), Second Language Learning
Ghanbari, Nasim; Barati, Hossein – Language Testing in Asia, 2020
The present study reports the process of development and validation of a rating scale in the Iranian EFL academic writing assessment context. To achieve this goal, the study was conducted in three distinct phases. Early in the study, the researcher interviewed a number of raters in different universities. Next, a questionnaire was developed based…
Descriptors: Rating Scales, Writing Evaluation, English for Academic Purposes, Second Language Learning
Wu, Xuefeng – English Language Teaching, 2022
Rating scales for writing assessment are critical in that they determine directly the quality and fairness of such performance tests. However, in many EFL contexts, rating scales are made, to a certain extent, based on the intuition of teachers who strongly need a feasible and scientific route to guide their construction of rating scales. This study…
Descriptors: Writing Evaluation, Rating Scales, Second Language Learning, Second Language Instruction