Publication Date
  In 2025: 0
  Since 2024: 1
  Since 2021 (last 5 years): 5
  Since 2016 (last 10 years): 11
  Since 2006 (last 20 years): 12
Descriptor
  Multiple Choice Tests: 21
  Test Construction: 9
  Test Items: 8
  Scores: 7
  Computer Assisted Testing: 5
  Test Format: 5
  Psychometrics: 4
  Scoring: 4
  Student Evaluation: 4
  Test Use: 4
  Comparative Analysis: 3
Source
  Educational Measurement: Issues and Practice: 21
Author
  Wind, Stefanie A.: 2
  Ahmadi, Alireza: 1
  Albanese, Mark A.: 1
  Aray, Henry: 1
  Armstrong, Anne-Marie: 1
  Berenbon, Rebecca F.: 1
  Berry, Yufeng: 1
  Bridgeman, Brent: 1
  Burdick, Hal: 1
  Downing, Steven M.: 1
  Elmore, Jeff: 1
Publication Type
  Journal Articles: 21
  Reports - Research: 11
  Reports - Evaluative: 6
  Information Analyses: 3
  Speeches/Meeting Papers: 2
  Opinion Papers: 1
  Reports - Descriptive: 1
Education Level
  Elementary Education: 1
  Elementary Secondary Education: 1
  Grade 4: 1
  Higher Education: 1
  Intermediate Grades: 1
  Postsecondary Education: 1
Assessments and Surveys
  Graduate Record Examinations: 1
  Test of English as a Foreign Language: 1
  Watson Glaser Critical Thinking Appraisal: 1
Berenbon, Rebecca F.; McHugh, Bridget C. – Educational Measurement: Issues and Practice, 2023
To assemble a high-quality test, psychometricians rely on subject matter experts (SMEs) to write high-quality items. However, SMEs are not typically given the opportunity to provide input on which content standards are most suitable for multiple-choice questions (MCQs). In the present study, we explored the relationship between perceived MCQ…
Descriptors: Test Items, Multiple Choice Tests, Standards, Difficulty Level
Ersan, Ozge; Berry, Yufeng – Educational Measurement: Issues and Practice, 2023
The increasing use of computerization in the testing industry and the need for items potentially measuring higher-order skills have led educational measurement communities to develop technology-enhanced (TE) items and conduct validity studies on the use of TE items. Parallel to this goal, the purpose of this study was to collect validity evidence…
Descriptors: Computer Assisted Testing, Multiple Choice Tests, Elementary Secondary Education, Accountability
Xuelan Qiu; Jimmy de la Torre; You-Gan Wang; Jinran Wu – Educational Measurement: Issues and Practice, 2024
Multidimensional forced-choice (MFC) items have been found to be useful to reduce response biases in personality assessments. However, conventional scoring methods for the MFC items result in ipsative data, hindering the wider applications of the MFC format. In the last decade, a number of item response theory (IRT) models have been developed,…
Descriptors: Item Response Theory, Personality Traits, Personality Measures, Personality Assessment
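As context for why conventional scoring of MFC items yields ipsative data: rank-scoring a forced-choice block gives every respondent the same total across blocks, so the scores only compare traits within a person, never between persons. A minimal Python sketch; the blocks, traits, and rankings are hypothetical, for illustration only:

    # Conventional rank scoring of multidimensional forced-choice (MFC) blocks.
    # Each block asks the respondent to rank k statements; the statement ranked
    # most-like-me gets k points, the least gets 1. Hypothetical data.
    blocks = [
        {"conscientiousness": 3, "extraversion": 1, "agreeableness": 2},
        {"conscientiousness": 2, "extraversion": 3, "agreeableness": 1},
    ]

    totals = {}
    for block in blocks:
        for trait, rank in block.items():
            totals[trait] = totals.get(trait, 0) + rank

    print(totals)                # {'conscientiousness': 5, 'extraversion': 4, 'agreeableness': 3}
    print(sum(totals.values()))  # always n_blocks * (1 + 2 + 3) = 12: ipsative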
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to that of person fit analyses: to identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Aray, Henry; Pedauga, Luis – Educational Measurement: Issues and Practice, 2019
This article presents a novel experimental methodology in which groups of students were offered the choice between two equivalent scoring rules for assessing a multiple-choice test. The effect of choosing the scoring rule on marks is tested. Two major contributions arise from this research. First, it contributes to the literature on the…
Descriptors: Multiple Choice Tests, Scoring, Student Attitudes, Decision Making
Rafatbakhsh, Elaheh; Ahmadi, Alireza; Moloodi, Amirsaeid; Mehrpour, Saeed – Educational Measurement: Issues and Practice, 2021
Test development is a crucial yet difficult and time-consuming part of any educational system, and the task often falls entirely on teachers. Automatic item generation systems have recently drawn attention because they can reduce this burden and make test development more convenient. Such systems have been developed to generate items for vocabulary,…
Descriptors: Test Construction, Test Items, Computer Assisted Testing, Multiple Choice Tests
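To illustrate the general template-based approach such systems often take (a generic sketch, not the system described in the article), a vocabulary MCQ can be assembled by pairing a target word's definition with distractors sampled from the same word pool:

    import random

    # Hypothetical word pool; a real system would draw from a lexical database.
    pool = {
        "ubiquitous": "present or found everywhere",
        "ephemeral": "lasting for a very short time",
        "candid": "truthful and straightforward",
        "frugal": "sparing or economical with money",
    }

    def generate_item(target, pool, n_options=4):
        """Build one multiple-choice vocabulary item from a definition template."""
        distractors = random.sample([w for w in pool if w != target], n_options - 1)
        options = distractors + [target]
        random.shuffle(options)
        stem = f"Which word means '{pool[target]}'?"
        return stem, options, options.index(target)

    stem, options, key = generate_item("ephemeral", pool)
    print(stem)
    for i, opt in enumerate(options):
        print(f"  {chr(65 + i)}. {opt}")
    print("Key:", chr(65 + key))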
Moon, Jung Aa; Keehner, Madeleine; Katz, Irvin R. – Educational Measurement: Issues and Practice, 2019
The current study investigated how item formats and their inherent affordances influence test-takers' cognition under uncertainty. Adult participants solved content-equivalent math items in multiple-selection multiple-choice and four alternative grid formats. The results indicated that participants' affirmative response tendency (i.e., judge the…
Descriptors: Affordances, Test Items, Test Format, Test Wiseness
Wind, Stefanie A. – Educational Measurement: Issues and Practice, 2017
Mokken scale analysis (MSA) is a probabilistic-nonparametric approach to item response theory (IRT) that can be used to evaluate fundamental measurement properties with less strict assumptions than parametric IRT models. This instructional module provides an introduction to MSA as a probabilistic-nonparametric framework in which to explore…
Descriptors: Probability, Nonparametric Statistics, Item Response Theory, Scaling
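For orientation, the item-pair scalability coefficient at the heart of MSA compares the observed covariance of two dichotomous items with the maximum attainable given their difficulties; a value of 1 indicates no Guttman errors (no one passes the harder item while failing the easier one). A back-of-the-envelope sketch with hypothetical data, not taken from the module:

    import numpy as np

    # Hypothetical 0/1 response matrix: rows = persons, columns = items.
    X = np.array([[1, 1], [1, 0], [1, 1], [0, 0], [1, 0], [0, 0]])

    p = X.mean(axis=0)                       # item proportions correct
    cov = np.cov(X[:, 0], X[:, 1], bias=True)[0, 1]
    cov_max = min(p[0], p[1]) - p[0] * p[1]  # max covariance given the marginals
    H_ij = cov / cov_max                     # Loevinger/Mokken item-pair scalability
    print(round(H_ij, 2))                    # 1.0 here: no Guttman errors in X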
Wise, Steven L. – Educational Measurement: Issues and Practice, 2017
The rise of computer-based testing has brought with it the capability to measure more aspects of a test event than simply the answers selected or constructed by the test taker. One behavior that has drawn much research interest is the time test takers spend responding to individual multiple-choice items. In particular, very short response…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Items, Reaction Time
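A typical operational use of such response-time data is flagging rapid-guessing behavior against a per-item time threshold and summarizing test-taker effort as the proportion of non-rapid responses (response time effort). A minimal sketch; the 3-second threshold and the times are illustrative assumptions, not values from the article:

    # Flag likely rapid guesses: responses faster than a per-item threshold.
    # Operational methods derive thresholds from each item's response-time
    # distribution; a flat cutoff is used here only for illustration.
    THRESHOLD_SECONDS = 3.0

    response_times = {"item1": 42.0, "item2": 1.8, "item3": 27.5, "item4": 2.4}

    rapid_guesses = [item for item, t in response_times.items() if t < THRESHOLD_SECONDS]
    effort = 1 - len(rapid_guesses) / len(response_times)

    print(rapid_guesses)          # ['item2', 'item4']
    print(f"RTE = {effort:.2f}")  # proportion of solution-behavior responses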
Bridgeman, Brent – Educational Measurement: Issues and Practice, 2016
Scores on essay-based assessments that are part of standardized admissions tests are typically given relatively little weight in admissions decisions compared to the weight given to scores from multiple-choice assessments. Evidence is presented suggesting that more weight should be given to the essay-based assessments. The reliability of the writing scores…
Descriptors: Multiple Choice Tests, Scores, Standardized Tests, Comparative Analysis
Kosh, Audra E.; Greene, Jeffrey A.; Murphy, P. Karen; Burdick, Hal; Firetto, Carla M.; Elmore, Jeff – Educational Measurement: Issues and Practice, 2018
We explored the feasibility of using automated scoring to assess upper-elementary students' reading ability through analysis of transcripts of students' small-group discussions about texts. Participants included 35 fourth-grade students across two classrooms that engaged in a literacy intervention called Quality Talk. During the course of one…
Descriptors: Computer Assisted Testing, Small Group Instruction, Group Discussion, Student Evaluation
NCME 2008 Presidential Address: The Impact of Anchor Test Configuration on Student Proficiency Rates
Fitzpatrick, Anne R. – Educational Measurement: Issues and Practice, 2008
Examined in this study were the effects of reducing anchor test length on student proficiency rates for 12 multiple-choice tests administered in an annual, large-scale, high-stakes assessment. The anchor tests contained 15, 10, or 5 items. Five content-representative samples of items were drawn at each anchor test length from a…
Descriptors: Test Length, Multiple Choice Tests, Item Sampling, Student Evaluation
Frary, Robert B. – Educational Measurement: Issues and Practice, 1988
Formula scoring is designed to reduce multiple-choice test score irregularities due to guessing. It is inappropriate for most classroom testing, but may be desirable for speeded tests and difficult tests with low passing scores. An annotated bibliography and a Self-Test are provided. (SLD)
Descriptors: Multiple Choice Tests, Scoring, Testing Problems
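For reference, formula scoring corrects the number-right score for expected random guessing: FS = R - W/(k - 1), where R is the number right, W the number wrong (omitted items are not penalized), and k the number of options per item. A quick sketch:

    def formula_score(rights, wrongs, n_options):
        """Classic correction-for-guessing formula score: R - W/(k - 1)."""
        return rights - wrongs / (n_options - 1)

    # 40 right, 12 wrong, 8 omitted on a 60-item, 5-option test
    print(formula_score(40, 12, 5))  # 37.0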
Rodriguez, Michael C. – Educational Measurement: Issues and Practice, 2005
Multiple-choice items are a mainstay of achievement testing. Certifying achievement proficiency with meaningful, precise scores requires many high-quality items that adequately cover the content domain. More 3-option items can be administered than 4- or 5-option items in the same testing time while improving content coverage, without…
Descriptors: Psychometrics, Testing, Scores, Test Construction
Albanese, Mark A. – Educational Measurement: Issues and Practice, 1993
A comprehensive review is given of the evidence bearing on the recommendation to avoid complex multiple-choice (CMC) items. Avoiding Type K items (four primary responses and five secondary choices) seems warranted, but the evidence against CMC items in general is less clear. (SLD)
Descriptors: Cues, Difficulty Level, Multiple Choice Tests, Responses