ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	2
Since 2006 (last 20 years)	2

Descriptor

Decision Making	2
Models	2
Task Analysis	2
Test Construction	2
Artificial Intelligence	1
Bayesian Statistics	1
Best Practices	1
Clinical Diagnosis	1
Comparative Analysis	1
Computer Simulation	1
Computer Software	1
Dialogs (Language)	1
Efficiency	1
Evaluation Methods	1
Evidence Based Practice	1
Flow Charts	1
Intelligent Tutoring Systems	1
Internal Medicine	1
Interrater Reliability	1
Item Analysis	1
Multiple Choice Tests	1
Reliability	1
Scores	1
Teacher Student Relationship	1
Teaching Methods	1

Source

AERA Online Paper Repository	1
International Educational…	1

Author

Cook, Robert J.	1
Durning, Steven J.	1
Piech, Chris	1
Tack, Anaïs	1

Publication Type

Reports - Research	2
Speeches/Meeting Papers	2

Showing all 2 results

The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues

Peer reviewed

Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022

How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…

Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making

Process Modeling: A Structured Approach to Assessing Complex Decision Making

Peer reviewed

Cook, Robert J.; Durning, Steven J. – AERA Online Paper Repository, 2016

In an effort to better align item development to goals of assessing higher-order tasks and decision making, complex decision trees were developed to follow clinical reasoning scripts and used as models on which multiple-choice questions could be built. This approach is compatible with best-practice assessment frameworks like Evidence Centered…

Descriptors: Multiple Choice Tests, Decision Making, Models, Task Analysis