ERIC - Search Results

Publication Date

In 2026	0
Since 2025	10
Since 2022 (last 5 years)	94
Since 2017 (last 10 years)	215
Since 2007 (last 20 years)	348

Descriptor

Computer Assisted Testing	511
Scoring	511
Test Items	111
Test Construction	102
Automation	92
Essays	82
Foreign Countries	81
Scores	79
Adaptive Testing	78
Evaluation Methods	77
Computer Software	75
Writing Evaluation	75
Comparative Analysis	72
Language Tests	70
Student Evaluation	67
Test Validity	67
Second Language Learning	66
Correlation	65
English (Second Language)	62
Test Format	59
Test Reliability	55
Models	52
Item Response Theory	51
Educational Technology	48
Artificial Intelligence	47

Education Level

Higher Education	85
Postsecondary Education	67
Secondary Education	50
Elementary Education	45
Elementary Secondary Education	33
Middle Schools	31
Junior High Schools	25
High Schools	15
Grade 8	12
Intermediate Grades	11
Grade 4	10
Early Childhood Education	9
Grade 5	9
Grade 6	8
Grade 7	8
Grade 3	5
Primary Education	5
Adult Education	3
Grade 9	3
Preschool Education	3
Grade 2	2
Kindergarten	2
Grade 10	1
Grade 11	1
Grade 12	1

Audience

Administrators	8
Practitioners	8
Researchers	7
Teachers	4
Students	2
Counselors	1
Policymakers	1

Location

Australia	10
China	10
New York	9
Japan	7
Netherlands	6
Canada	5
Germany	5
Iran	4
Taiwan	4
United Kingdom	4
United Kingdom (England)	4
United States	4
Europe	3
Indonesia	3
Singapore	3
South Korea	3
Spain	3
California	2
Connecticut	2
Czech Republic	2
Denmark	2
France	2
Hong Kong	2
Israel	2
Malaysia	2

Laws, Policies, & Programs

No Child Left Behind Act 2001	3
Every Student Succeeds Act…	2
Elementary and Secondary…	1
Elementary and Secondary…	1
Family Educational Rights and…	1
Health Insurance Portability…	1
Individuals with Disabilities…	1

Showing 1 to 15 of 511 results

Using Linkage Sets to Improve Connectedness in Rater Response Model Estimation

Peer reviewed

Casabianca, Jodi M.; Donoghue, John R.; Shin, Hyo Jeong; Chao, Szu-Fu; Choi, Ikkyu – Journal of Educational Measurement, 2023

Using item-response theory to model rater effects provides an alternative solution for rater monitoring and diagnosis, compared to using standard performance metrics. In order to fit such models, the ratings data must be sufficiently connected in order to estimate rater effects. Due to popular rating designs used in large-scale testing scenarios,…

Descriptors: Item Response Theory, Alternative Assessment, Evaluators, Research Problems

Automated Short Answer Scoring Using an Ensemble of Neural Networks and Latent Semantic Analysis Classifiers

Peer reviewed

Ormerod, Christopher; Lottridge, Susan; Harris, Amy E.; Patel, Milan; van Wamelen, Paul; Kodeswaran, Balaji; Woolf, Sharon; Young, Mackenzie – International Journal of Artificial Intelligence in Education, 2023

We introduce a short answer scoring engine made up of an ensemble of deep neural networks and a Latent Semantic Analysis-based model to score short constructed responses for a large suite of questions from a national assessment program. We evaluate the performance of the engine and show that the engine achieves above-human-level performance on a…

Descriptors: Computer Assisted Testing, Scoring, Artificial Intelligence, Semantics

Automatic Essay Scoring for Discussion Forum in Online Learning Based on Semantic and Keyword Similarities

Peer reviewed

Dhini, Bachriah Fatwa; Girsang, Abba Suganda; Sufandi, Unggul Utan; Kurniawati, Heny – Asian Association of Open Universities Journal, 2023

Purpose: The authors constructed an automatic essay scoring (AES) model in a discussion forum where the result was compared with scores given by human evaluators. This research proposes essay scoring, which is conducted through two parameters, semantic and keyword similarities, using a SentenceTransformers pre-trained model that can construct the…

Descriptors: Computer Assisted Testing, Scoring, Writing Evaluation, Essays

On the Limitations of Human-Computer Agreement in Automated Essay Scoring

Peer reviewed
PDF on ERIC

Doewes, Afrizal; Pechenizkiy, Mykola – International Educational Data Mining Society, 2021

Scoring essays is generally an exhausting and time-consuming task for teachers. Automated Essay Scoring (AES) facilitates the scoring process to be faster and more consistent. The most logical way to assess the performance of an automated scorer is by measuring the score agreement with the human raters. However, we provide empirical evidence that…

Descriptors: Man Machine Systems, Automation, Computer Assisted Testing, Scoring

Application of an Automated Essay Scoring Engine to English Writing Assessment Using Many-Facet Rasch Measurement

Peer reviewed

Chan, Kinnie Kin Yee; Bond, Trevor; Yan, Zi – Language Testing, 2023

We investigated the relationship between the scores assigned by an Automated Essay Scoring (AES) system, the Intelligent Essay Assessor (IEA), and grades allocated by trained, professional human raters to English essay writing by instigating two procedures novel to written-language assessment: the logistic transformation of AES raw scores into…

Descriptors: Computer Assisted Testing, Essays, Scoring, Scores

Peer reviewed

Ramnarain-Seetohul, Vidasha; Bassoo, Vandana; Rosunally, Yasmine – Education and Information Technologies, 2022

In automated essay scoring (AES) systems, similarity techniques are used to compute the score for student answers. Several methods to compute similarity have emerged over the years. However, only a few of them have been widely used in the AES domain. This work shows the findings of a ten-year review on similarity techniques applied in AES systems…

Descriptors: Computer Assisted Testing, Essays, Scoring, Automation

Automated Scoring of Figural Tests of Creativity with Computer Vision

Peer reviewed

Selcuk Acar; Peter Organisciak; Denis Dumas – Journal of Creative Behavior, 2025

In this three-study investigation, we applied various approaches to score drawings created in response to both Form A and Form B of the Torrance Tests of Creative Thinking-Figural (broadly TTCT-F) as well as the Multi-Trial Creative Ideation task (MTCI). We focused on TTCT-F in Study 1, and utilizing a random forest classifier, we achieved 79% and…

Descriptors: Scoring, Computer Assisted Testing, Models, Correlation

The Vulnerability of AI-Based Scoring Systems to Gaming Strategies: A Case Study

Peer reviewed

Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025

Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…

Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy

Comparing the Effect of Contextualized versus Generic Automated Feedback on Students' Scientific Argumentation. Research Report. ETS RR-22-03

Peer reviewed
PDF on ERIC

Olivera-Aguilar, Margarita; Lee, Hee-Sun; Pallant, Amy; Belur, Vinetha; Mulholland, Matthew; Liu, Ou Lydia – ETS Research Report Series, 2022

This study uses a computerized formative assessment system that provides automated scoring and feedback to help students write scientific arguments in a climate change curriculum. We compared the effect of contextualized versus generic automated feedback on students' explanations of scientific claims and attributions of uncertainty to those…

Descriptors: Computer Assisted Testing, Formative Evaluation, Automation, Scoring

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

Automated Scoring of Constructed Response Items in Math Assessment Using Large Language Models

Peer reviewed

Wesley Morris; Langdon Holmes; Joon Suh Choi; Scott Crossley – International Journal of Artificial Intelligence in Education, 2025

Recent developments in the field of artificial intelligence allow for improved performance in the automated assessment of extended response items in mathematics, potentially allowing for the scoring of these items cheaply and at scale. This study details the grand prize-winning approach to developing large language models (LLMs) to automatically…

Descriptors: Automation, Computer Assisted Testing, Mathematics Tests, Scoring

Automated Topical Component Extraction Using Neural Network Attention Scores from Source-Based Essay Scoring

Peer reviewed
PDF on ERIC

Zhang, Haoran; Litman, Diane – Grantee Submission, 2020

While automated essay scoring (AES) can reliably grade essays at scale, automated writing evaluation (AWE) additionally provides formative feedback to guide essay revision. However, a neural AES typically does not provide useful feature representations for supporting AWE. This paper presents a method for linking AWE and neural AES, by extracting…

Descriptors: Computer Assisted Testing, Scoring, Essay Tests, Writing Evaluation

Grading Exams Using Large Language Models: A Comparison between Human and AI Grading of Exams in Higher Education Using ChatGPT

Peer reviewed

Jonas Flodén – British Educational Research Journal, 2025

This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…

Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring

Integration of Prediction Scores from Various Automated Essay Scoring Models Using Item Response Theory

Peer reviewed

Uto, Masaki; Aomi, Itsuki; Tsutsumi, Emiko; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2023

In automated essay scoring (AES), essays are automatically graded without human raters. Many AES models based on various manually designed features or various architectures of deep neural networks (DNNs) have been proposed over the past few decades. Each AES model has unique advantages and characteristics. Therefore, rather than using a single-AES…

Descriptors: Prediction, Scores, Computer Assisted Testing, Scoring

The Language of Creativity: Evidence from Humans and Large Language Models

Peer reviewed

William Orwig; Emma R. Edenbaum; Joshua D. Greene; Daniel L. Schacter – Journal of Creative Behavior, 2024

Recent developments in computerized scoring via semantic distance have provided automated assessments of verbal creativity. Here, we extend past work, applying computational linguistic approaches to characterize salient features of creative text. We hypothesize that, in addition to semantic diversity, the degree to which a story includes…

Descriptors: Computer Assisted Testing, Scoring, Creativity, Computational Linguistics

Source

ETS Research Report Series	31
Grantee Submission	25
ProQuest LLC	16
Journal of Educational…	14
Language Testing	11
Educational Measurement:…	10
Assessing Writing	9
International Educational…	9
Journal of Technology,…	9
Applied Measurement in…	8
Educational and Psychological…	8
International Journal of…	8
Journal of Applied Testing…	8
Language Assessment Quarterly	8
New York State Education…	7
Applied Psychological…	6
Computers & Education	6
Educational Technology &…	6
Education and Information…	5
Educational Testing Service	5
International Journal of…	5
Journal of Speech, Language,…	5
Educational Assessment	4
IEEE Transactions on Learning…	4
Journal of Creative Behavior	4

Author

Bennett, Randy Elliot	11
Attali, Yigal	9
Anderson, Paul S.	7
Williamson, David M.	6
Bejar, Isaac I.	5
Ramineni, Chaitanya	5
Stocking, Martha L.	5
Xi, Xiaoming	5
Zechner, Klaus	5
Bridgeman, Brent	4
Davey, Tim	4
Evanini, Keelan	4
Higgins, Derrick	4
Lee, Hee-Sun	4
Liu, Ou Lydia	4
McNamara, Danielle S.	4
Mulholland, Matthew	4
O'Neil, Harold F., Jr.	4
Pallant, Amy	4
Rupp, André A.	4
Weiss, David J.	4
Wilson, Joshua	4
Alonzo, Julie	3
Breyer, F. Jay	3

Publication Type

Journal Articles	330
Reports - Research	273
Reports - Evaluative	95
Reports - Descriptive	75
Speeches/Meeting Papers	57
Tests/Questionnaires	22
Dissertations/Theses -…	16
Information Analyses	16
Numerical/Quantitative Data	15
Books	12
Guides - Non-Classroom	11
Collected Works - General	10
Opinion Papers	7
Collected Works - Proceedings	5
Book/Product Reviews	4
Guides - General	4
Guides - Classroom - Teacher	2
Non-Print Media	2
Reports - General	2
Reference Materials -…	1

Assessments and Surveys

Test of English as a Foreign…	29
Graduate Record Examinations	16
National Assessment of…	10
Wechsler Intelligence Scale…	4
ACTFL Oral Proficiency…	2
Advanced Placement…	2
Armed Services Vocational…	2
Dynamic Indicators of Basic…	2
International English…	2
New York State Regents…	2
Praxis Series	2
Program for International…	2
Progress in International…	2
SAT (College Admission Test)	2
Torrance Tests of Creative…	2
Trends in International…	2
Wechsler Individual…	2
ACT Assessment	1
Behavior Assessment System…	1
Center for Epidemiologic…	1
Computer Attitude Scale	1
Conners Rating Scales	1
Expressive One Word Picture…	1
Flesch Kincaid Grade Level…	1
Graduate Management Admission…	1