Publication Date
In 2025: 1
Since 2024: 2
Since 2021 (last 5 years): 13
Since 2016 (last 10 years): 24
Since 2006 (last 20 years): 39
Descriptor
Models: 56
Multiple Choice Tests: 56
Test Items: 56
Item Response Theory: 20
Test Construction: 20
Foreign Countries: 12
Difficulty Level: 10
Test Format: 10
Responses: 9
Comparative Analysis: 8
Item Analysis: 8
Publication Type
Journal Articles: 40
Reports - Research: 34
Reports - Evaluative: 15
Speeches/Meeting Papers: 8
Reports - Descriptive: 5
Non-Print Media: 1
Reference Materials - General: 1
Tests/Questionnaires: 1
Location
Canada: 3
Iran: 3
Taiwan: 2
California: 1
Europe: 1
Germany: 1
Indonesia: 1
Sweden: 1
Turkey: 1
United States: 1
Wisconsin: 1
Aiman Mohammad Freihat; Omar Saleh Bani Yassin – Educational Process: International Journal, 2025
Background/purpose: This study aimed to determine the accuracy of estimating multiple-choice test item parameters under item response theory measurement models. Materials/methods: The researchers relied on measurement accuracy indicators, which express the absolute difference between the estimated and actual values of the…
Descriptors: Accuracy, Computation, Multiple Choice Tests, Test Items
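The "absolute difference between the estimated and actual values" mentioned in this abstract can be read as a standard parameter-recovery criterion; the following is an illustrative form, not necessarily the authors' exact indicator:

\[ \text{accuracy}(\hat{\beta}_j) = \lvert \hat{\beta}_j - \beta_j \rvert, \qquad \text{MAE} = \frac{1}{J} \sum_{j=1}^{J} \lvert \hat{\beta}_j - \beta_j \rvert \]

where \(\beta_j\) is the true (generating) value of an item parameter such as difficulty or discrimination for item \(j\) and \(\hat{\beta}_j\) is its estimate; averaging over the \(J\) items gives a mean absolute error for each parameter type.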
Olney, Andrew M. – Grantee Submission, 2022
Multi-angle question answering models have recently been proposed that promise to perform related tasks like question generation. However, performance on related tasks has not been thoroughly studied. We investigate a leading model called Macaw on the task of multiple choice question generation and evaluate its performance on three angles that…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Models
Andrew M. Olney – Grantee Submission, 2023
Multiple choice questions are traditionally expensive to produce. Recent advances in large language models (LLMs) have led to fine-tuned LLMs that generate questions competitive with human-authored questions. However, the relative capabilities of ChatGPT-family models have not yet been established for this task. We present a carefully controlled…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Algorithms
Davison, Mark L.; Davenport, Ernest C., Jr.; Jia, Hao; Seipel, Ben; Carlson, Sarah E. – Grantee Submission, 2022
A regression model of predictor trade-offs is described. Each regression parameter equals the expected change in Y obtained by trading 1 point from one predictor to a second predictor. The model applies to predictor variables that sum to a constant T for all observations; for example, proportions summing to T=1.0 or percentages summing to T=100…
Descriptors: Regression (Statistics), Prediction, Predictor Variables, Models
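Restating the constraint described in this abstract in generic notation (not the authors' own): when the predictors sum to a constant, one predictor must be omitted to avoid exact collinearity, and each remaining coefficient becomes a one-point trade-off effect:

\[ \sum_{j=1}^{p} x_{ij} = T \ \text{ for every observation } i, \qquad Y_i = \beta_0 + \sum_{j \ne k} \beta_j x_{ij} + \varepsilon_i . \]

Because a one-unit increase in \(x_{ij}\) forces a one-unit decrease in the omitted predictor \(x_{ik}\), each \(\beta_j\) can be read as the expected change in \(Y\) from trading one point of predictor \(k\) for one point of predictor \(j\).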
Mead, Alan D.; Zhou, Chenxuan – Journal of Applied Testing Technology, 2022
This study fit a Naïve Bayesian classifier to the words of exam items to predict the Bloom's taxonomy level of the items. We addressed five research questions, showing that reasonably good prediction of Bloom's level was possible, but that accuracy varied across levels. In our study, performance for Level 2 was poor (Level 2 items were misclassified…
Descriptors: Artificial Intelligence, Prediction, Taxonomy, Natural Language Processing
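As a rough, generic sketch of the kind of bag-of-words Naïve Bayes pipeline this abstract describes (using scikit-learn; the item stems, Bloom's levels, and settings below are invented placeholders, not the authors' data or exact model):

    # Illustrative only: predict a Bloom's taxonomy level from the words of an
    # exam item with a multinomial Naive Bayes classifier. The training items
    # and labels are hypothetical placeholders.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    item_stems = [
        "Define the term standard deviation.",                     # recall-style stem
        "Explain why the sample mean is an unbiased estimator.",   # comprehension-style stem
        "Design a sampling plan for the survey described below.",  # application-style stem
    ]
    bloom_levels = [1, 2, 3]  # hypothetical labels for the stems above

    model = make_pipeline(CountVectorizer(lowercase=True), MultinomialNB())
    model.fit(item_stems, bloom_levels)

    print(model.predict(["Explain why this estimator is consistent."]))

With realistic training data, per-level accuracy can then be inspected with a confusion matrix, which is how uneven performance across levels (as reported for Level 2) would show up.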
Lawrence T. DeCarlo – Educational and Psychological Measurement, 2024
A psychological framework for different types of items commonly used with mixed-format exams is proposed. A choice model based on signal detection theory (SDT) is used for multiple-choice (MC) items, whereas an item response theory (IRT) model is used for open-ended (OE) items. The SDT and IRT models are shown to share a common conceptualization…
Descriptors: Test Format, Multiple Choice Tests, Item Response Theory, Models
Hansen, John; Stewart, John – Physical Review Physics Education Research, 2021
This work is the fourth of a series of papers applying multidimensional item response theory (MIRT) to widely used physics conceptual assessments. This study applies MIRT analysis using both exploratory and confirmatory methods to the Brief Electricity and Magnetism Assessment (BEMA) to explore the assessment's structure and to determine a…
Descriptors: Item Response Theory, Science Tests, Energy, Magnets
Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023
Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…
Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models
Rao, Dhawaleswar; Saha, Sujan Kumar – IEEE Transactions on Learning Technologies, 2020
Automatic multiple choice question (MCQ) generation from a text is a popular research area. MCQs are widely accepted for large-scale assessment in various domains and applications. However, manual generation of MCQs is expensive and time-consuming. Therefore, researchers have been attracted toward automatic MCQ generation since the late 1990s.…
Descriptors: Multiple Choice Tests, Test Construction, Automation, Computer Software
Laliyo, Lukman Abdul Rauf; Hamdi, Syukrul; Pikoli, Masrid; Abdullah, Romario; Panigoro, Citra – European Journal of Educational Research, 2021
One of the issues that hinder students' learning progress is the inability to construct an epistemological explanation of a scientific phenomenon. A four-tier multiple-choice (hereinafter, 4TMC) instrument and the Partial-Credit Model were employed to elaborate on the diagnosis process of the aforementioned problem. This study aimed to develop and…
Descriptors: Learning Processes, Multiple Choice Tests, Models, Test Items
Wu, Qian; De Laet, Tinne; Janssen, Rianne – Journal of Educational Measurement, 2019
Single-best answers to multiple-choice items are commonly dichotomized into correct and incorrect responses, and modeled using either a dichotomous item response theory (IRT) model or a polytomous one if differences among all response options are to be retained. The current study presents an alternative IRT-based modeling approach to…
Descriptors: Multiple Choice Tests, Item Response Theory, Test Items, Responses
Langbeheim, Elon; Ben-Eliyahu, Einat; Adadan, Emine; Akaygun, Sevil; Ramnarain, Umesh Dewnarain – Chemistry Education Research and Practice, 2022
Learning progressions (LPs) are novel models for the development of assessments in science education that often use a scale to categorize students' levels of reasoning. Pictorial representations are important in chemistry teaching and learning, and also in LPs, but the differences between pictorial and verbal items in chemistry LPs are unclear. In…
Descriptors: Science Instruction, Learning Trajectories, Chemistry, Thinking Skills
Liao, Xiangyi; Bolt, Daniel M. – Journal of Educational and Behavioral Statistics, 2021
Four-parameter models have received increasing psychometric attention in recent years, as a reduced upper asymptote for item characteristic curves can be appealing for measurement applications such as adaptive testing and person-fit assessment. However, applications can be challenging due to the large number of parameters in the model. In this…
Descriptors: Test Items, Models, Mathematics Tests, Item Response Theory
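For context on the "reduced upper asymptote" mentioned in this abstract, the textbook four-parameter logistic item characteristic curve (a standard form, not necessarily the exact parameterization used in the article) is

\[ P(X_{ij} = 1 \mid \theta_i) = c_j + (d_j - c_j)\,\frac{1}{1 + \exp[-a_j(\theta_i - b_j)]}, \]

where \(a_j\), \(b_j\), \(c_j\), and \(d_j\) are the discrimination, difficulty, lower-asymptote (guessing), and upper-asymptote (slipping) parameters of item \(j\); fixing \(d_j = 1\) recovers the three-parameter model.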
Panahi, Ali; Mohebbi, Hassan – Language Teaching Research Quarterly, 2022
High-stakes testing, such as IELTS, is designed to select individuals for decision-making purposes (Fulcher, 2013b). Hence, there is a slow-growing stream of research investigating the subskills of IELTS listening and, in feedback terms, its effects on individuals and educational programs. Here, cognitive diagnostic assessment (CDA) performs it…
Descriptors: Decision Making, Listening Comprehension Tests, Multiple Choice Tests, Diagnostic Tests
Andrich, David; Marais, Ida – Journal of Educational Measurement, 2018
Even though guessing biases difficulty estimates as a function of item difficulty in the dichotomous Rasch model, assessment programs with tests which include multiple-choice items often construct scales using this model. Research has shown that when all items are multiple-choice, this bias can largely be eliminated. However, many assessments have…
Descriptors: Multiple Choice Tests, Test Items, Guessing (Tests), Test Bias
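For reference, the dichotomous Rasch model discussed in this abstract has the standard form

\[ P(X_{ij} = 1 \mid \theta_i, b_j) = \frac{\exp(\theta_i - b_j)}{1 + \exp(\theta_i - b_j)}, \]

with person ability \(\theta_i\), item difficulty \(b_j\), and no lower asymptote; successful guessing on multiple-choice items is therefore absorbed into the difficulty estimates rather than modeled separately, which is the source of the bias the authors examine.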