Publication Date
  In 2025: 0
  Since 2024: 2
  Since 2021 (last 5 years): 8
  Since 2016 (last 10 years): 9
  Since 2006 (last 20 years): 11
Descriptor
  Models: 11
  Natural Language Processing: 11
  Test Items: 11
  Artificial Intelligence: 7
  Test Construction: 6
  Automation: 5
  Multiple Choice Tests: 4
  Prediction: 4
  Bayesian Statistics: 2
  Classification: 2
  Coding: 2
Author
  Andrew M. Olney: 1
  Bohm, Isabell: 1
  Calders, Toon: 1
  Conati, Cristina: 1
  Condor, Aubrey: 1
  Denis Dumas: 1
  Di Mitri, Daniele: 1
  Drachsler, Hendrik: 1
  Floriana Grasso: 1
  Futagi, Yoko: 1
  Gombert, Sebastian: 1
Publication Type
  Reports - Research: 9
  Journal Articles: 6
  Speeches/Meeting Papers: 3
  Collected Works - Proceedings: 1
  Information Analyses: 1
  Numerical/Quantitative Data: 1
Location
  Germany: 1
  Netherlands: 1
Assessments and Surveys
  Graduate Record Examinations: 1
A Method for Generating Course Test Questions Based on Natural Language Processing and Deep Learning
Hei-Chia Wang; Yu-Hung Chiang; I-Fan Chen – Education and Information Technologies, 2024
Assessment is viewed as an important means to understand learners' performance in the learning process. A good assessment method is based on high-quality examination questions. However, manually generating high-quality examination questions is time-consuming for teachers, and question banks are not easy for students to obtain. To solve…
Descriptors: Natural Language Processing, Test Construction, Test Items, Models
Samah AlKhuzaey; Floriana Grasso; Terry R. Payne; Valentina Tamma – International Journal of Artificial Intelligence in Education, 2024
Designing and constructing pedagogical tests whose items (i.e., questions) equitably measure various types of skills across different levels of students is a challenging task. Teachers and item writers alike need to ensure that the quality of assessment materials is consistent if student evaluations are to be objective and effective.…
Descriptors: Test Items, Test Construction, Difficulty Level, Prediction
Olney, Andrew M. – Grantee Submission, 2022
Multi-angle question answering models have recently been proposed that promise to perform related tasks like question generation. However, performance on related tasks has not been thoroughly studied. We investigate a leading model called Macaw on the task of multiple choice question generation and evaluate its performance on three angles that…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Models
Condor, Aubrey; Litster, Max; Pardos, Zachary – International Educational Data Mining Society, 2021
We explore how different components of an Automatic Short Answer Grading (ASAG) model affect the model's ability to generalize to questions outside of those used for training. For supervised automatic grading models, human ratings are primarily used as ground-truth labels. Producing such ratings can be resource-heavy, as subject matter experts…
Descriptors: Automation, Grading, Test Items, Generalization
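To make the generalization question in this abstract concrete, the sketch below trains a toy supervised grader and then holds out an entire question at test time, splitting on question ID rather than on individual answers. The TF-IDF-plus-logistic-regression pipeline and the four invented answers are illustrative assumptions, not the authors' actual ASAG model or data.

```python
# Minimal sketch: test whether a supervised short-answer grader generalizes
# to unseen questions by splitting train/test on question ID.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupShuffleSplit
from sklearn.pipeline import make_pipeline

answers = [
    "plants produce glucose from light energy",   # question 0, correct
    "plants eat soil to grow",                    # question 0, incorrect
    "acceleration is proportional to net force",  # question 1, correct
    "heavier objects always fall faster",         # question 1, incorrect
]
labels = [1, 0, 1, 0]        # hypothetical human correctness ratings
question_ids = [0, 0, 1, 1]  # which question each answer responds to

# Hold out a whole question so the grader never sees it during training.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.5, random_state=0)
train_idx, test_idx = next(splitter.split(answers, labels, groups=question_ids))

grader = make_pipeline(TfidfVectorizer(), LogisticRegression())
grader.fit([answers[i] for i in train_idx], [labels[i] for i in train_idx])
print(grader.score([answers[i] for i in test_idx], [labels[i] for i in test_idx]))
```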
Andrew M. Olney – Grantee Submission, 2023
Multiple choice questions are traditionally expensive to produce. Recent advances in large language models (LLMs) have led to fine-tuned LLMs that generate questions competitive with human-authored questions. However, the relative capabilities of ChatGPT-family models have not yet been established for this task. We present a carefully-controlled…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Algorithms
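For orientation, generating an MCQ with a ChatGPT-family model reduces, at its simplest, to a single prompted completion. In the sketch below the prompt wording and the model name are placeholders chosen here; the paper's actual prompts, controls, and comparison protocol are not reproduced.

```python
# Minimal sketch of prompt-based MCQ generation via the OpenAI chat API.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

passage = (
    "Photosynthesis converts light energy into chemical energy "
    "stored in glucose."
)
prompt = (
    "Write one multiple-choice question about the passage below, with four "
    "options labeled A-D and exactly one correct answer; mark the key.\n\n"
    f"Passage: {passage}"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name, not the paper's
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```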
Mead, Alan D.; Zhou, Chenxuan – Journal of Applied Testing Technology, 2022
This study fit a Naïve Bayesian classifier to the words of exam items to predict the Bloom's taxonomy level of the items. We addressed five research questions, showing that reasonably good prediction of Bloom's level was possible, but accuracy varied across levels. In our study, performance for Level 2 was poor (Level 2 items were misclassified…
Descriptors: Artificial Intelligence, Prediction, Taxonomy, Natural Language Processing
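The technique named in this abstract, a Naïve Bayes classifier over item words, can be sketched in a few lines with scikit-learn. The three training items and their Bloom's levels below are invented placeholders rather than the study's data; the real system would train on a much larger labeled item bank.

```python
# Minimal sketch: bag-of-words Naive Bayes predicting Bloom's level from item text.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical items labeled with Bloom's levels
# (1 = Remember, 2 = Understand, 6 = Create).
items = [
    "Define the term operating system.",
    "Explain why paging reduces external fragmentation.",
    "Design a schedule that minimizes average waiting time.",
]
levels = [1, 2, 6]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(items, levels)

print(model.predict(["Describe how a hash table resolves collisions."]))
```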
Peter Organisciak; Selcuk Acar; Denis Dumas; Kelly Berthiaume – Grantee Submission, 2023
Automated scoring for divergent thinking (DT) seeks to overcome a key obstacle to creativity measurement: the effort, cost, and reliability of scoring open-ended tests. For a common test of DT, the Alternate Uses Task (AUT), the primary automated approach casts the problem as a semantic distance between a prompt and the resulting idea in a text…
Descriptors: Automation, Computer Assisted Testing, Scoring, Creative Thinking
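In its simplest form, the semantic-distance approach mentioned here scores a response as one minus the cosine similarity between the prompt's vector and the response's vector, so that more distant (less obvious) ideas score as more original. In the sketch below, `embed` is a stand-in for whatever text model supplies those vectors; the dummy implementation exists only to keep the example self-contained.

```python
# Minimal sketch: originality as cosine distance between prompt and response.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for a real text model (e.g., averaged word embeddings).

    Returns a dummy 300-dimensional vector that is stable within a run.
    """
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(300)

def semantic_distance(prompt: str, response: str) -> float:
    # Originality proxy: 1 - cosine similarity of the two vectors.
    a, b = embed(prompt), embed(response)
    cosine_sim = float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return 1.0 - cosine_sim

print(semantic_distance("brick", "use it as a doorstop"))
```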
Gombert, Sebastian; Di Mitri, Daniele; Karademir, Onur; Kubsch, Marcus; Kolbe, Hannah; Tautz, Simon; Grimm, Adrian; Bohm, Isabell; Neumann, Knut; Drachsler, Hendrik – Journal of Computer Assisted Learning, 2023
Background: Formative assessments are needed to monitor how student knowledge develops throughout a unit. Constructed response items, which require learners to formulate their own free-text responses, are well suited for testing their active knowledge. However, assessing such constructed responses in an automated fashion is a complex task…
Descriptors: Coding, Energy, Scientific Concepts, Formative Evaluation
Rao, Dhawaleswar; Saha, Sujan Kumar – IEEE Transactions on Learning Technologies, 2020
Automatic multiple choice question (MCQ) generation from text is a popular research area. MCQs are widely accepted for large-scale assessment in various domains and applications. However, manual generation of MCQs is expensive and time-consuming. Therefore, researchers have been drawn to automatic MCQ generation since the late 1990s.…
Descriptors: Multiple Choice Tests, Test Construction, Automation, Computer Software
Sheehan, Kathleen M.; Kostin, Irene; Futagi, Yoko; Hemat, Ramin; Zuckerman, Daniel – ETS Research Report Series, 2006
This paper describes the development, implementation, and evaluation of an automated system for predicting the acceptability status of candidate reading-comprehension stimuli extracted from a database of journal and magazine articles. The system uses a combination of classification and regression techniques to predict the probability that a given…
Descriptors: Automation, Prediction, Reading Comprehension, Classification
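As a rough analogue of the pipeline this abstract describes, the sketch below trains a text classifier that outputs the probability a candidate passage is acceptable as a test stimulus. The features, data, and single-model setup are illustrative assumptions; the ETS system's actual combination of classification and regression techniques is not reproduced.

```python
# Minimal sketch: predict P(acceptable) for a candidate reading passage.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

passages = [
    "The committee reviewed the proposal and outlined three key objections.",
    "lol random blog rant!!! click here",
    "Recent surveys suggest migration patterns shift with ocean temperature.",
    "buy now limited offer best prices",
]
acceptable = [1, 0, 1, 0]  # hypothetical human acceptability judgments

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(passages, acceptable)

# Probability that a new candidate passage is acceptable.
print(model.predict_proba(["The study compared two methods of instruction."])[0, 1])
```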
Pechenizkiy, Mykola; Calders, Toon; Conati, Cristina; Ventura, Sebastian; Romero, Cristobal; Stamper, John – International Working Group on Educational Data Mining, 2011
The 4th International Conference on Educational Data Mining (EDM 2011) brings together researchers from computer science, education, psychology, psychometrics, and statistics to analyze large datasets to answer educational research questions. The conference, held in Eindhoven, The Netherlands, July 6-9, 2011, follows the three previous editions…
Descriptors: Academic Achievement, Logical Thinking, Profiles, Tutoring