ERIC Number: EJ1478260
Record Type: Journal
Publication Date: 2025
Pages: 19
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: 2469-9896
Available Date: N/A
Performance of ChatGPT on Tasks Involving Physics Visual Representations: The Case of the Brief Electricity and Magnetism Assessment
Physical Review Physics Education Research, v21 n1 Article 010154 2025
[This paper is part of the Focused Collection in Artificial Intelligence Tools in Physics Teaching and Physics Education Research.] Artificial intelligence-based chatbots are increasingly influencing physics education because of their ability to interpret and respond to textual and visual inputs. This study evaluates the performance of two large multimodal model-based chatbots, ChatGPT-4 and ChatGPT-4o, on the Brief Electricity and Magnetism Assessment (BEMA), a conceptual physics inventory rich in visual representations such as vector fields, circuit diagrams, and graphs. Quantitative analysis shows that ChatGPT-4o outperforms both ChatGPT-4 and a large sample of university students, and demonstrates improvements in ChatGPT-4o's vision interpretation ability over its predecessor ChatGPT-4. However, qualitative analysis of ChatGPT-4o's responses reveals persistent challenges. We identified three types of difficulties in the chatbot's responses to tasks on BEMA: (i) difficulties with visual interpretation, (ii) difficulties in providing correct physics laws or rules, and (iii) difficulties with spatial coordination and application of physics representations. Spatial reasoning tasks, particularly those requiring the use of the right-hand rule, proved especially problematic. These findings highlight that the most broadly used large multimodal model-based chatbot, ChatGPT-4o, still exhibits significant difficulties in engaging with physics tasks involving visual representations. While the chatbot shows potential for educational applications, including personalized tutoring and accessibility support for students who are blind or have low vision, its limitations necessitate caution. On the other hand, our findings can also be leveraged to design assessments that are difficult for chatbots to solve.
Descriptors: Artificial Intelligence, Computer Software, Technology Integration, Physics, Science Instruction, Teaching Methods, Energy, Magnets, Scientific Concepts, Science Tests, College Students, Visual Aids, Graphs, Barriers, Task Analysis, Difficulty Level, Error Correction, Scientific Principles, Spatial Ability, Visual Impairments, Vision, Access to Education, Students with Disabilities
American Physical Society. One Physics Ellipse 4th Floor, College Park, MD 20740-3844. Tel: 301-209-3200; Fax: 301-209-0865; e-mail: assocpub@aps.org; Web site: https://journals.aps.org/prper/
Publication Type: Journal Articles; Reports - Research
Education Level: Higher Education; Postsecondary Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A
Author Affiliations: N/A