ERIC Number: EJ1467772
Record Type: Journal
Publication Date: 2025
Pages: 10
Abstractor: As Provided
ISBN: N/A
ISSN: ISSN-1043-4046
EISSN: EISSN-1522-1229
Available Date: N/A
Using Aggregated AI Detector Outcomes to Eliminate False Positives in STEM-Student Writing
Jon-Philippe K. Hyatt; Elisa Jayne Bienenstock; Carla M. Firetto; Elizabeth R. Woods; Robert C. Comus
Advances in Physiology Education, v49 n2 p486-495 2025
Generative artificial intelligence (AI) large language models have become sufficiently accessible and user-friendly to assist students with course work, studying tactics, and written communication. AI-generated writing is almost indistinguishable from human-derived work. Instructors must rely on intuition/experience and, recently, assistance from online AI detectors to help them distinguish between student- and AI-written material. Here, we tested the veracity of AI detectors for writing samples from a fact-heavy, lower-division undergraduate anatomy and physiology course. Student participants (n = 190) completed three parts: a hand-written essay answering a prompt on the structure/function of the plasma membrane; creating an AI-generated answer to the same prompt; and a survey seeking participants' views on the quality of each essay as well as general AI use. Randomly selected (n = 50) participant-written and AI-generated essays were blindly uploaded onto four AI detectors; a separate and unique group of randomly selected essays (n = 48) was provided to human raters (n = 9) for classification assessment. For the majority of essays, human raters and the best-performing AI detectors (n = 3) similarly identified their correct origin (84-95% and 93-98%, respectively) (P > 0.05). Approximately 1.3% and 5.0% of the essays were detected as false positives (human writing incorrectly labeled as AI) by AI detectors and human raters, respectively. Surveys generally indicated that students viewed the AI-generated work as better than their own (P < 0.01). Using AI detectors in aggregate reduced the likelihood of detecting a false positive to nearly 0%, and this strategy was validated against human rater-labeled false positives. Taken together, our findings show that AI detectors, when used together, become a powerful tool to inform instructors.
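Illustrative note (not part of the provided abstract): the abstract does not state how detector outcomes were aggregated, so the following is only a minimal sketch of one plausible rule, assuming an essay is labeled AI-written only when a required number of detectors agree (unanimity by default). The function names `aggregate_flags` and `joint_false_positive_rate` are hypothetical, and the product formula assumes detector errors are statistically independent, which real detectors may not satisfy.

```python
from typing import Optional, Sequence


def aggregate_flags(flags: Sequence[bool], min_votes: Optional[int] = None) -> bool:
    """Label an essay as AI-written only if enough detectors agree.

    flags: one boolean per detector (True = detector says "AI").
    min_votes: votes required; defaults to unanimity (an assumed rule,
    not one confirmed by the source).
    """
    if not flags:
        raise ValueError("at least one detector outcome is required")
    needed = len(flags) if min_votes is None else min_votes
    return sum(flags) >= needed


def joint_false_positive_rate(rates: Sequence[float]) -> float:
    """Probability that every detector falsely flags a human-written essay.

    Assumes independent errors, so the joint rate is the product of the
    individual false-positive rates. Correlated errors would make the
    true joint rate higher than this estimate.
    """
    joint = 1.0
    for rate in rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must lie in [0, 1]")
        joint *= rate
    return joint
```

Under these assumptions, requiring agreement can only lower the false-positive rate relative to any single detector, which is consistent in spirit with the near-0% figure reported above. The trade-off is that a stricter vote threshold may also miss more genuinely AI-generated essays.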
Descriptors: Artificial Intelligence, Technology Uses in Education, STEM Education, Writing Skills, Investigations, Identification, Human Factors Engineering, Undergraduate Students, Anatomy, Physiology
American Physiological Society. 9650 Rockville Pike, Bethesda, MD 20814-3991. Tel: 301-634-7164; Fax: 301-634-7241; e-mail: webmaster@the-aps.org; Web site: https://www.physiology.org/journal/advances
Publication Type: Journal Articles; Reports - Research
Education Level: Higher Education; Postsecondary Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A
Author Affiliations: N/A