Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025
Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…
Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy
Ahmet Can Uyar; Dilek Büyükahiska – International Journal of Assessment Tools in Education, 2025
This study explores the effectiveness of using ChatGPT, an Artificial Intelligence (AI) language model, as an Automated Essay Scoring (AES) tool for grading English as a Foreign Language (EFL) learners' essays. The corpus consists of 50 essays representing various types including analysis, compare and contrast, descriptive, narrative, and opinion…
Descriptors: Artificial Intelligence, Computer Software, Technology Uses in Education, Teaching Methods
Qiao Wang; Ralph L. Rose; Ayaka Sugawara; Naho Orita – Vocabulary Learning and Instruction, 2025
VocQGen is an automated tool designed to generate multiple-choice cloze (MCC) questions for vocabulary assessment in second language learning contexts. It leverages several natural language processing (NLP) tools and OpenAI's GPT-4 model to produce MCC items quickly from user-specified word lists. To evaluate its effectiveness, we used the first…
Descriptors: Vocabulary Skills, Artificial Intelligence, Computer Software, Multiple Choice Tests