ERIC - Search Results

Showing all 9 results

Generating Multiple Choice Questions from a Textbook: LLMs Match Human Performance on Most Metrics

Peer reviewed

Andrew M. Olney – Grantee Submission, 2023

Multiple choice questions are traditionally expensive to produce. Recent advances in large language models (LLMs) have led to fine-tuned LLMs that generate questions competitive with human-authored questions. However, the relative capabilities of ChatGPT-family models have not yet been established for this task. We present a carefully-controlled…

Descriptors: Test Construction, Multiple Choice Tests, Test Items, Algorithms

Parallel Test Construction Using Classical Item Parameters.

Peer reviewed

Sanders, Piet F.; Verschoor, Alfred J. – Applied Psychological Measurement, 1998

Presents minimization and maximization models for parallel test construction under constraints. The minimization model constructs weakly and strongly parallel tests of minimum length, while the maximization model constructs weakly and strongly parallel tests with maximum test reliability. (Author/SLD)

Descriptors: Algorithms, Models, Reliability, Test Construction

Models for Scoring Missing Responses to Multiple-Choice Items. Program Statistics Research Technical Report No. 94-1.

Longford, Nicholas T. – 1994

This study is a critical evaluation of the roles for coding and scoring of missing responses to multiple-choice items in educational tests. The focus is on tests in which the test-takers have little or no motivation; in such tests omitting and not reaching (as classified by the currently adopted operational rules) is quite frequent. Data from the…

Descriptors: Algorithms, Classification, Coding, Models

Adaptive Testing without IRT.

Yan, Duanli; Lewis, Charles; Stocking, Martha – 1998

It is unrealistic to suppose that standard item response theory (IRT) models will be appropriate for all new and currently considered computer-based tests. In addition to developing new models, researchers will need to give some attention to the possibility of constructing and analyzing new tests without the aid of strong models. Computerized…

Descriptors: Adaptive Testing, Algorithms, Computer Assisted Testing, Item Response Theory

Practical Design and Implementation Considerations of a Computer Adaptive Foreign Language Test: The Monash/Melbourne French CAT.

Peer reviewed

Burston, Jack; Monville-Burston, Monique – CALICO Journal, 1995

Describes the academic context in which the "French CAT" was created and trialed and gives a detailed consideration of the test presentation platform and operating algorithms. Finally, the article evaluates the first administration of the test and discusses its reliability and validity as a placement instrument for first-year Australian…

Descriptors: Achievement Tests, Algorithms, College Students, Computer Assisted Testing

A Simple and Fast Item Selection Procedure for Adaptive Testing. Research Report 94-13.

Veerkamp, Wim J. J.; Berger, Martijn P. F. – 1994

Items with the highest discrimination parameter values in a logistic item response theory (IRT) model do not necessarily give maximum information. This paper shows which discrimination parameter values (as a function of the guessing parameter and the distance between person ability and item difficulty) give maximum information for the…

Descriptors: Ability, Adaptive Testing, Algorithms, Computer Assisted Testing
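
Illustrative note (not part of the report): the abstract's first claim can be checked numerically with the standard three-parameter logistic (3PL) item information function. The sketch below is an assumption-laden illustration, not the authors' selection procedure. The function name, the parameter values, and the omission of the 1.7 scaling constant are all choices made here for the example.

```python
import math

def item_information_3pl(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta.

    a = discrimination, b = difficulty, c = guessing parameter.
    Hypothetical helper written for this illustration only.
    """
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    p = c + (1.0 - c) * logistic   # probability of a correct response
    q = 1.0 - p
    return a ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2

# Assumed example values: the person is 2 units above the item difficulty.
low_a = item_information_3pl(theta=2.0, a=1.0, b=0.0, c=0.2)
high_a = item_information_3pl(theta=2.0, a=3.0, b=0.0, c=0.2)

# The less discriminating item is the more informative one here.
print(low_a > high_a)
```

With these assumed values, the item with the lower discrimination yields more information, because the highly discriminating item is answered correctly almost with certainty when ability is far from its difficulty.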

IRT-Based Test Construction. Project Psychometric Aspects of Item Banking No. 15. Research Report 87-2.

van der Linden, Wim J., Ed. – 1987

Four discussions of test construction based on item response theory (IRT) are presented. The first discussion, "Test Design as Model Building in Mathematical Programming" (T. J. J. M. Theunissen), presents test design as a decision process under certainty. A natural way of modeling this process leads to mathematical programming. General…

Descriptors: Algorithms, Computer Assisted Testing, Decision Making, Foreign Countries

Introduction to Knowledge Spaces: How to Build, Test, and Search Them.

Peer reviewed

Falmagne, Jean-Claude; And Others – Psychological Review, 1990

This article gives a comprehensive description of a theory for efficient assessment of knowledge. The essential concept is that the knowledge state of a subject, with regard to a specified field of information, can be represented by a particular subset of problems that the subject is capable of solving. (SLD)

Descriptors: Algorithms, Educational Assessment, Equations (Mathematics), Evaluation Methods

Towards the Nonstochastic Recalibration of the Teacher Made Test and Project Scoring Protocols as a Measure of Student Performance Magnitudes: The Instructional Effectiveness Coefficient for Group and Individual "Empiricism without Experimentation."

Baker, Sheldon R.; And Others – 1995

A paradigm for the recalibration of teacher-made assessment that assesses and evaluates in one operation is formulated. The effort to make the classroom the primary source of educational research activity is contingent on redefining educational research as empirical and not experimental. This emphasizes that the empirical analysis of instructional…

Descriptors: Algorithms, Educational Assessment, Educational Research, Elementary Secondary Education