Publication Date
| In 2026 | 0 |
| Since 2025 | 7 |
| Since 2022 (last 5 years) | 40 |
| Since 2017 (last 10 years) | 78 |
| Since 2007 (last 20 years) | 125 |
Descriptor
| Computer Software | 172 |
| Scoring | 172 |
| Computer Assisted Testing | 75 |
| Foreign Countries | 43 |
| Essays | 39 |
| Writing Evaluation | 38 |
| Second Language Learning | 36 |
| Artificial Intelligence | 35 |
| Comparative Analysis | 33 |
| English (Second Language) | 32 |
| Evaluation Methods | 29 |
Author
| Attali, Yigal | 3 |
| Heffernan, Neil | 3 |
| McNamara, Danielle S. | 3 |
| Shermis, Mark D. | 3 |
| Dutulescu, Andreea | 2 |
| Baral, Sami | 2 |
| Breyer, F. Jay | 2 |
| Bridgeman, Brent | 2 |
| Burstein, Jill | 2 |
| Chung, Gregory K. W. K. | 2 |
| Crossley, Scott A. | 2 |
Audience
| Researchers | 5 |
| Practitioners | 3 |
| Teachers | 3 |
| Administrators | 2 |
Location
| Australia | 6 |
| Japan | 5 |
| Netherlands | 5 |
| United Kingdom | 4 |
| China | 3 |
| Germany | 3 |
| South Korea | 3 |
| Belgium | 2 |
| Brazil | 2 |
| Canada | 2 |
| Chile | 2 |
Selcuk Acar; Peter Organisciak; Denis Dumas – Journal of Creative Behavior, 2025
In this three-study investigation, we applied various approaches to score drawings created in response to both Form A and Form B of the Torrance Tests of Creative Thinking-Figural (broadly TTCT-F) as well as the Multi-Trial Creative Ideation task (MTCI). We focused on TTCT-F in Study 1, and utilizing a random forest classifier, we achieved 79% and…
Descriptors: Scoring, Computer Assisted Testing, Models, Correlation
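The random-forest approach described in this entry can be sketched in a few lines. This is a purely illustrative example, not the authors' pipeline: the feature vectors, the binary originality labels, and the split are all invented stand-ins for features extracted from scanned drawings.

```python
# Hypothetical sketch of random-forest scoring of drawings, in the spirit
# of Acar, Organisciak, and Dumas (2025). All data here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Assume each drawing has been reduced to a feature vector (e.g. stroke
# count, ink density, labels from a vision model) plus a human rating.
X = rng.random((200, 8))      # 200 drawings, 8 illustrative features
y = rng.integers(0, 2, 200)   # binary high/low originality rating

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
# Held-out agreement with the (synthetic) human ratings:
print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
```

With real features, the reported accuracy would correspond to the classifier's agreement with human raters on unseen drawings.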
Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025
Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…
Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy
Shermis, Mark D. – Journal of Educational Measurement, 2022
One of the challenges of discussing validity arguments for machine scoring of essays centers on the absence of a commonly held definition and theory of good writing. At best, the algorithms attempt to measure select attributes of writing and calibrate them against human ratings with the goal of accurate prediction of scores for new essays.…
Descriptors: Scoring, Essays, Validity, Writing Evaluation
Rebecka Weegar; Peter Idestam-Almquist – International Journal of Artificial Intelligence in Education, 2024
Machine learning methods can be used to reduce the manual workload in exam grading, making it possible for teachers to spend more time on other tasks. However, when it comes to grading exams, fully eliminating manual work is not yet possible even with very accurate automated grading, as any grading mistakes could have significant consequences for…
Descriptors: Grading, Computer Assisted Testing, Introductory Courses, Computer Science Education
Dadi Ramesh; Suresh Kumar Sanampudi – European Journal of Education, 2024
Automatic essay scoring (AES) is an essential educational application in natural language processing. Automating this process alleviates the grading burden while increasing the reliability and consistency of assessment. With the advances in text embedding libraries and neural network models, AES systems have achieved good results in terms of accuracy.…
Descriptors: Scoring, Essays, Writing Evaluation, Memory
Andreea Dutulescu; Stefan Ruseti; Mihai Dascalu; Danielle S. McNamara – Grantee Submission, 2025
The assessment of student responses to learning-strategy prompts, such as self-explanation, summarization, and paraphrasing, is essential for evaluating cognitive engagement and comprehension. However, manual scoring is resource-intensive, limiting its scalability in educational settings. This study investigates the use of Large Language Models…
Descriptors: Scoring, Computational Linguistics, Computer Software, Artificial Intelligence
Andreea Dutulescu; Stefan Ruseti; Mihai Dascalu; Danielle McNamara – International Educational Data Mining Society, 2025
The assessment of student responses to learning-strategy prompts, such as self-explanation, summarization, and paraphrasing, is essential for evaluating cognitive engagement and comprehension. However, manual scoring is resource-intensive, limiting its scalability in educational settings. This study investigates the use of Large Language Models…
Descriptors: Scoring, Computational Linguistics, Computer Software, Artificial Intelligence
Andrea Horbach; Joey Pehlke; Ronja Laarmann-Quante; Yuning Ding – International Journal of Artificial Intelligence in Education, 2024
This paper investigates crosslingual content scoring, a scenario where scoring models trained on learner data in one language are applied to data in a different language. We analyze data in five different languages (Chinese, English, French, German and Spanish) collected for three prompts of the established English ASAP content scoring dataset. We…
Descriptors: Contrastive Linguistics, Scoring, Learning Analytics, Chinese
Eran Hadas; Arnon Hershkovitz – Journal of Learning Analytics, 2025
Creativity is an essential skill for today's learners, one that has important contributions to issues of inclusion and equity in education. Therefore, assessing creativity is of major importance in educational contexts. However, scoring creativity based on traditional tools suffers from subjectivity and is heavily time- and labour-consuming. This…
Descriptors: Creativity, Evaluation Methods, Computer Assisted Testing, Artificial Intelligence
Zhang, Mengxue; Baral, Sami; Heffernan, Neil; Lan, Andrew – International Educational Data Mining Society, 2022
Automatic short answer grading is an important research direction in the exploration of how to use artificial intelligence (AI)-based tools to improve education. Current state-of-the-art approaches use neural language models to create vectorized representations of students' responses, followed by classifiers to predict the score. However, these…
Descriptors: Grading, Mathematics Instruction, Artificial Intelligence, Form Classes (Languages)
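The two-stage pipeline this abstract describes (vectorize responses, then classify into score levels) can be sketched as follows. This is a minimal stand-in, not the paper's system: a TF-IDF vectorizer substitutes for a neural language model, and the answers and rubric scores are invented.

```python
# Illustrative short-answer grading pipeline: text -> vector -> score class.
# TF-IDF stands in for a neural encoder; data is hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

answers = [
    "the slope is rise over run",
    "slope equals change in y over change in x",
    "i dont know",
    "it is the steepness of the line",
    "no idea",
    "divide the vertical change by the horizontal change",
]
scores = [2, 2, 0, 1, 0, 2]  # hypothetical rubric levels 0-2

grader = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
grader.fit(answers, scores)
prediction = grader.predict(["rise divided by run"])
```

Swapping the vectorizer for contextual embeddings from a pretrained language model, as the state-of-the-art systems do, changes only the first pipeline stage.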
Kunal Sareen – Innovations in Education and Teaching International, 2024
This study examines the proficiency of ChatGPT, an AI language model, in answering questions on the Situational Judgement Test (SJT), a widely used assessment tool for evaluating the fundamental competencies of medical graduates in the UK. A total of 252 SJT questions from the "Oxford Assess and Progress: Situational Judgement Test"…
Descriptors: Ethics, Decision Making, Artificial Intelligence, Computer Software
Kevin C. Haudek; Xiaoming Zhai – International Journal of Artificial Intelligence in Education, 2024
Argumentation, a key scientific practice presented in the "Framework for K-12 Science Education," requires students to construct and critique arguments, but timely evaluation of arguments in large-scale classrooms is challenging. Recent work has shown the potential of automated scoring systems for open response assessments, leveraging…
Descriptors: Accuracy, Persuasive Discourse, Artificial Intelligence, Learning Management Systems
Ecem Kopuz; Galip Kartal – PASAA: Journal of Language Teaching and Learning in Thailand, 2025
Developments in artificial intelligence (AI) have significantly transformed second language (L2) learning and assessment, and the role of AI technologies in L2 assessment has been investigated in recent research. This study presents a bibliosystematic analysis of AI-assisted L2 assessment. Using both systematic analysis and bibliometric…
Descriptors: Artificial Intelligence, Computer Software, Technology Integration, Feedback (Response)
Tahereh Firoozi; Okan Bulut; Mark J. Gierl – International Journal of Assessment Tools in Education, 2023
The proliferation of large language models represents a paradigm shift in the landscape of automated essay scoring (AES) systems, fundamentally elevating their accuracy and efficacy. This study presents an extensive examination of large language models, with a particular emphasis on the transformative influence of transformer-based models, such as…
Descriptors: Turkish, Writing Evaluation, Essays, Accuracy
Uto, Masaki; Okano, Masashi – IEEE Transactions on Learning Technologies, 2021
In automated essay scoring (AES), scores are automatically assigned to essays as an alternative to grading by humans. Traditional AES typically relies on handcrafted features, whereas recent studies have proposed AES models based on deep neural networks to obviate the need for feature engineering. Those AES models generally require training on a…
Descriptors: Essays, Scoring, Writing Evaluation, Item Response Theory
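The handcrafted-feature tradition this abstract contrasts with deep AES models can be illustrated briefly. The sketch below is hypothetical: the surface features, essays, and scores are invented, and real systems use far richer feature sets before regressing onto human ratings.

```python
# Illustrative handcrafted-feature AES baseline: a few classic surface
# features feeding a linear regressor. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

def surface_features(essay: str) -> list:
    words = essay.split()
    n = len(words)
    return [
        n,                                              # essay length
        len(set(words)) / n if n else 0.0,              # type-token ratio
        sum(len(w) for w in words) / n if n else 0.0,   # mean word length
    ]

essays = [
    "short essay here",
    "a considerably longer essay with varied vocabulary throughout",
    "word word word word",
]
human_scores = [1.0, 3.0, 0.5]  # hypothetical ratings

X = np.array([surface_features(e) for e in essays])
model = LinearRegression().fit(X, human_scores)
pred = model.predict([surface_features("another brief essay")])
```

Deep AES models replace the hand-built `surface_features` step with learned representations, which is the feature-engineering trade-off the abstract refers to.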