ERIC Number: EJ1476001
Record Type: Journal
Publication Date: 2025-Jun
Pages: 34
Abstractor: As Provided
ISBN: N/A
ISSN: ISSN-0022-0655
EISSN: EISSN-1745-3984
Available Date: 2025-06-01
Evaluating the Consistency and Reliability of Attribution Methods in Automated Short Answer Grading (ASAG) Systems: Toward an Explainable Scoring System
Journal of Educational Measurement, v62 n2 p248-281 2025
In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates the use of attribution scores in ASAG systems, focusing on their consistency in reflecting model decisions. Specifically, we examined how attribution scores generated by different methods--namely Local Interpretable Model-agnostic Explanations (LIME), Integrated Gradients (IG), Hierarchical Explanation via Divisive Generation (HEDGE), and Leave-One-Out (LOO)--compare in their consistency and ability to illustrate the decision-making processes of transformer-based scoring systems trained on a publicly available response dataset. Additionally, we analyzed how attribution scores varied across different scoring categories in a polytomously scored response dataset and across two transformer-based scoring model architectures: Bidirectional Encoder Representations from Transformers (BERT) and Decoding-enhanced BERT with Disentangled Attention (DeBERTa-v2). Our findings highlight the challenges in evaluating explainability metrics, with important implications for both high-stakes and formative assessment contexts. This study contributes to the development of more reliable and transparent ASAG systems.
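Illustrative note (not part of the published record): of the four attribution methods the abstract names, Leave-One-Out (LOO) is the simplest to demonstrate. LOO scores each token by the drop in model output when that token is deleted from the response. The sketch below uses a hypothetical stand-in scoring function (`toy_score` and its key terms are invented for illustration, not the article's trained BERT/DeBERTa-v2 models):

```python
def toy_score(tokens):
    # Stand-in for a trained ASAG scoring model: rewards a few "key"
    # answer terms with fixed weights (purely illustrative values).
    key_terms = {"photosynthesis": 0.5, "sunlight": 0.3, "chlorophyll": 0.2}
    return sum(key_terms.get(t, 0.0) for t in tokens)

def loo_attributions(tokens, score_fn):
    # Leave-One-Out: attribution of token i = score(full response)
    # minus score(response with token i removed).
    base = score_fn(tokens)
    return [base - score_fn(tokens[:i] + tokens[i + 1:])
            for i in range(len(tokens))]

tokens = "plants use sunlight for photosynthesis".split()
attrs = loo_attributions(tokens, toy_score)
# "sunlight" and "photosynthesis" receive positive attribution;
# filler tokens receive zero.
```

In the study's setting, comparing such per-token attributions across methods (e.g., LOO vs. Integrated Gradients) and across scoring categories is what "consistency" refers to; rank correlations between attribution vectors are one common way to quantify it.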
Descriptors: Automation, Grading, Computer Assisted Testing, Scoring, Evaluation Methods, Student Evaluation, Attribution Theory, Scores, Decision Making, High Stakes Tests, Formative Evaluation, Test Reliability
Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://www.wiley.com/en-us
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A
Author Affiliations: (1) University of Florida; (2) Research and Evaluation Methodology, University of Florida