ERIC - Search Results

Publication Date

In 2025	9
Since 2024	22

Source

Discover Education	2
International Journal of…	2
Journal of Educational…	2
ProQuest LLC	2
Educational and Psychological…	1
Grantee Submission	1
HAPS Educator	1
Innovations in Education and…	1
International Journal of…	1
International Journal of…	1
Journal of Education and…	1
Journal of Educational and…	1
Journal of Learning Analytics	1
Journal of Pedagogical…	1
Language Testing in Asia	1
Phi Delta Kappan	1
Teaching of Psychology	1
Vocabulary Learning and…	1

Publication Type

Journal Articles	19
Reports - Research	14
Reports - Descriptive	3
Dissertations/Theses -…	2
Information Analyses	2
Reports - Evaluative	1

Education Level

Higher Education	9
Postsecondary Education	9
Elementary Education	1
High Schools	1
Junior High Schools	1
Middle Schools	1
Secondary Education	1
Two Year Colleges	1

Location

South Africa	2
Africa	1
China	1
Ghana	1
Japan	1
Nigeria	1
Oman	1
United Kingdom	1

Assessments and Surveys

Torrance Tests of Creative…

Showing 1 to 15 of 22 results

Review on Neural Question Generation for Education Purposes

Peer reviewed

Said Al Faraby; Adiwijaya Adiwijaya; Ade Romadhony – International Journal of Artificial Intelligence in Education, 2024

Questioning plays a vital role in education, directing knowledge construction and assessing students' understanding. However, creating high-level questions requires significant creativity and effort. Automatic question generation is expected to facilitate the generation of not only fluent and relevant but also educationally valuable questions.…

Descriptors: Test Items, Automation, Computer Software, Input Output Analysis

A Generalized Objective Function for Computer Adaptive Item Selection

Peer reviewed

Harold Doran; Testsuhiro Yamada; Ted Diaz; Emre Gonulates; Vanessa Culver – Journal of Educational Measurement, 2025

Computer adaptive testing (CAT) is an increasingly common mode of test administration offering improved test security, better measurement precision, and the potential for shorter testing experiences. This article presents a new item selection algorithm based on a generalized objective function to support multiple types of testing conditions and…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Algorithms

Modeling Directional Testlet Effects on Multiple Open-Ended Questions

Peer reviewed

Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025

Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies caused between items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…

Descriptors: Models, Test Items, Educational Assessment, Scores

Cognitive Diagnosis Testlet Model for Multiple-Choice Items

Peer reviewed

Lei Guo; Wenjie Zhou; Xiao Li – Journal of Educational and Behavioral Statistics, 2024

The testlet design is very popular in educational and psychological assessments. This article proposes a new cognitive diagnosis model, the multiple-choice cognitive diagnostic testlet (MC-CDT) model for tests using testlets consisting of MC items. The MC-CDT model uses the original examinees' responses to MC items instead of dichotomously scored…

Descriptors: Multiple Choice Tests, Diagnostic Tests, Accuracy, Computer Software

Investigating Heterogeneity in Response Strategies: A Mixture Multidimensional IRTree Approach

Peer reviewed

Ö. Emre C. Alagöz; Thorsten Meiser – Educational and Psychological Measurement, 2024

To improve the validity of self-report measures, researchers should control for response style (RS) effects, which can be achieved with IRTree models. A traditional IRTree model considers a response as a combination of distinct decision-making processes, where the substantive trait affects the decision on response direction, while decisions about…

Descriptors: Item Response Theory, Validity, Self Evaluation (Individuals), Decision Making

Assessment of Large Language Models' Performances and Hallucinations for Chinese Postgraduate Medical Entrance Examination

Peer reviewed

Hongfei Ye; Jian Xu; Danqing Huang; Meng Xie; Jinming Guo; Junrui Yang; Haiwei Bao; Mingzhi Zhang; Ce Zheng – Discover Education, 2025

This study evaluates the performance of large language models (LLMs) on the Chinese Postgraduate Medical Entrance Examination (CPGMEE), as well as the hallucinations produced by LLMs, and investigates their implications for medical education. We curated 10 trials of mock CPGMEE to evaluate the performance of 4 LLMs (GPT-4.0, ChatGPT, QWen 2.1 and Ernie 4.0).…

Descriptors: College Entrance Examinations, Foreign Countries, Computational Linguistics, Graduate Medical Education

Collaborate with Generative AI to Improve Classroom Assessments

Bryan R. Drost; Char Shryock – Phi Delta Kappan, 2025

Creating assessment questions aligned to standards is a time-consuming task for teachers, but large language models such as ChatGPT can help. Bryan Drost & Char Shryock describe a three-step process for using ChatGPT to create assessments: 1) Ask ChatGPT to break standards into measurable targets. 2) Determine how much time to spend on each…

Descriptors: Artificial Intelligence, Computer Software, Technology Integration, Teaching Methods

Applying Combinatorics in the Design of Multiversion Exams

Peer reviewed

Jila Niknejad; Margaret Bayer – International Journal of Mathematical Education in Science and Technology, 2025

In Spring 2020, the need for redesigning online assessments to preserve integrity became a priority to many educators. Many of us found methods to proctor examinations using Zoom and proctoring software. Such examinations pose their own issues. To reduce the technical difficulties and cost, many Zoom proctored examination sessions were shortened;…

Descriptors: Mathematics Instruction, Mathematics Tests, Computer Assisted Testing, Computer Software

The Feasibility of Computerized Adaptive Testing of the National Benchmark Test: A Simulation Study

Peer reviewed
PDF on ERIC

Musa Adekunle Ayanwale; Mdutshekelwa Ndlovu – Journal of Pedagogical Research, 2024

The COVID-19 pandemic has had a significant impact on high-stakes testing, including the national benchmark tests in South Africa. Current linear testing formats have been criticized for their limitations, leading to a shift towards Computerized Adaptive Testing [CAT]. Assessments with CAT are more precise and take less time. Evaluation of CAT…

Descriptors: Adaptive Testing, Benchmarking, National Competency Tests, Computer Assisted Testing

Learning to Love LLMs for Answer Interpretation: Chain-of-Thought Prompting and the AMMORE Dataset

Peer reviewed
PDF on ERIC

Owen Henkel; Hannah Horne-Robinson; Maria Dyshel; Greg Thompson; Ralph Abboud; Nabil Al Nahin Ch; Baptiste Moreau-Pernet; Kirk Vanacore – Journal of Learning Analytics, 2025

This paper introduces AMMORE, a new dataset of 53,000 math open-response question-answer pairs from Rori, a mathematics learning platform used by middle and high school students in several African countries. Using this dataset, we conducted two experiments to evaluate the use of large language models (LLM) for grading particularly challenging…

Descriptors: Learning Analytics, Learning Management Systems, Mathematics Instruction, Middle School Students

Assessing the Ethical Capabilities of Chat GPT in Healthcare: A Study on Its Proficiency in Situational Judgement Test

Peer reviewed

Kunal Sareen – Innovations in Education and Teaching International, 2024

This study examines the proficiency of Chat GPT, an AI language model, in answering questions on the Situational Judgement Test (SJT), a widely used assessment tool for evaluating the fundamental competencies of medical graduates in the UK. A total of 252 SJT questions from the "Oxford Assess and Progress: Situational Judgement" Test…

Descriptors: Ethics, Decision Making, Artificial Intelligence, Computer Software

A Comparative Study of AI-Human-Made and Human-Made Test Forms for a University TESOL Theory Course

Peer reviewed

Kyung-Mi O. – Language Testing in Asia, 2024

This study examines the efficacy of artificial intelligence (AI) in creating parallel test items compared to human-made ones. Two test forms were developed: one consisting of 20 existing human-made items and another with 20 new items generated with ChatGPT assistance. Expert reviews confirmed the content parallelism of the two test forms.…

Descriptors: Comparative Analysis, Artificial Intelligence, Computer Software, Test Items

Content and Item Response Theory Analysis of ChatGPT-4-Generated Multiple-Choice Items

Peer reviewed

Roger Young; Emily Courtney; Alexander Kah; Mariah Wilkerson; Yi-Hsin Chen – Teaching of Psychology, 2025

Background: Multiple-choice item (MCI) assessments are burdensome for instructors to develop. Artificial intelligence (AI, e.g., ChatGPT) can streamline the process without sacrificing quality. The quality of AI-generated MCIs and human experts is comparable. However, whether the quality of AI-generated MCIs is equally good across various domain-…

Descriptors: Item Response Theory, Multiple Choice Tests, Psychology, Textbooks

Answer Changing Behaviors and Performance in a First-Year Medical Gross and Developmental Anatomy Course

Peer reviewed
PDF on ERIC

Marli Crabtree; Kenneth L. Thompson; Ellen M. Robertson – HAPS Educator, 2024

Research has suggested that changing one's answer on multiple-choice examinations is more likely to lead to positive academic outcomes. This study aimed to further understand the relationship between changing answer selections and item attributes, student performance, and time within a population of 158 first-year medical students enrolled in a…

Descriptors: Anatomy, Science Tests, Medical Students, Medical Education

Evaluating the Effectiveness of a Computerized Achievement Test Using Learn Smart for Psychometric Assessment under Item Response Theory

Peer reviewed
PDF on ERIC

Mimi Ismail; Ahmed Al - Badri; Said Al - Senaidi – Journal of Education and e-Learning Research, 2025

This study aimed to reveal the differences in individuals' abilities, their standard errors, and the psychometric properties of the test according to the two methods of applying the test (electronic and paper). The descriptive approach was used to achieve the study's objectives. The study sample consisted of 74 male and female students at the…

Descriptors: Achievement Tests, Computer Assisted Testing, Psychometrics, Item Response Theory

Computer Software	22
Test Items	22
Artificial Intelligence	13
Computer Assisted Testing	11
Computational Linguistics	10
Test Construction	10
Accuracy	8
Item Analysis	7
Multiple Choice Tests	7
Foreign Countries	6
Comparative Analysis	5
Item Response Theory	5
Scores	5
Technology Uses in Education	4
Decision Making	3
Difficulty Level	3
English (Second Language)	3
Learning Analytics	3
Mathematics Instruction	3
Mathematics Tests	3
Medical Education	3
Psychometrics	3
Second Language Instruction	3
Second Language Learning	3
Teaching Methods	3