ERIC - Search Results

Showing all 6 results

Generating Multiple Choice Questions with a Multi-Angle Question Answering Model

Peer reviewed

Olney, Andrew M. – Grantee Submission, 2022

Multi-angle question answering models have recently been proposed that promise to perform related tasks like question generation. However, performance on related tasks has not been thoroughly studied. We investigate a leading model called Macaw on the task of multiple choice question generation and evaluate its performance on three angles that…

Descriptors: Test Construction, Multiple Choice Tests, Test Items, Models

Generating Multiple Choice Questions from a Textbook: LLMs Match Human Performance on Most Metrics

Peer reviewed

Andrew M. Olney – Grantee Submission, 2023

Multiple choice questions are traditionally expensive to produce. Recent advances in large language models (LLMs) have led to fine-tuned LLMs that generate questions competitive with human-authored questions. However, the relative capabilities of ChatGPT-family models have not yet been established for this task. We present a carefully-controlled…

Descriptors: Test Construction, Multiple Choice Tests, Test Items, Algorithms

Beyond Semantic Distance: Automated Scoring of Divergent Thinking Greatly Improves with Large Language Models

Peer reviewed

Peter Organisciak; Selcuk Acar; Denis Dumas; Kelly Berthiaume – Grantee Submission, 2023

Automated scoring for divergent thinking (DT) seeks to overcome a key obstacle to creativity measurement: the effort, cost, and reliability of scoring open-ended tests. For a common test of DT, the Alternate Uses Task (AUT), the primary automated approach casts the problem as a semantic distance between a prompt and the resulting idea in a text…

Descriptors: Automation, Computer Assisted Testing, Scoring, Creative Thinking

DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images

Peer reviewed

Sami Baral; Li Lucy; Ryan Knight; Alice Ng; Luca Soldaini; Neil T. Heffernan; Kyle Lo – Grantee Submission, 2024

In real-world settings, vision language models (VLMs) should robustly handle naturalistic, noisy visual content as well as domain-specific language and concepts. For example, K-12 educators using digital learning platforms may need to examine and provide feedback across many images of students' math work. To assess the potential of VLMs to support…

Descriptors: Visual Learning, Visual Perception, Natural Language Processing, Freehand Drawing

Keystrokes, Edit Distance, and Grading Rules: Psychometric Properties of Short Answer Items

Peer reviewed

Lang, David; Stenhaug, Ben; Kizilcec, Rene – Grantee Submission, 2019

This research evaluates the psychometric properties of short-answer response items under a variety of grading rules in the context of a mobile learning platform in Africa. This work has three main findings. First, we introduce the concept of a differential device function (DDF), a type of differential item function that stems from the device a…

Descriptors: Foreign Countries, Psychometrics, Test Items, Test Format

Improving Reading Comprehension with Automatically Generated Cloze Item Practice

Peer reviewed

Olney, Andrew M.; Pavlik, Philip I., Jr.; Maass, Jaclyn K. – Grantee Submission, 2017

This study investigated the effect of cloze item practice on reading comprehension, where cloze items were either created by humans, by machine using natural language processing techniques, or randomly. Participants from Amazon Mechanical Turk (N = 302) took a pre-test, read a text, and took part in one of five conditions, Do-Nothing, Re-Read,…

Descriptors: Reading Improvement, Reading Comprehension, Prior Learning, Cloze Procedure
