Publication Date
  In 2025: 0
  Since 2024: 1
  Since 2021 (last 5 years): 5
  Since 2016 (last 10 years): 11
  Since 2006 (last 20 years): 12
Descriptor
  Multiple Choice Tests: 21
  Test Construction: 9
  Test Items: 8
  Scores: 7
  Computer Assisted Testing: 5
  Test Format: 5
  Psychometrics: 4
  Scoring: 4
  Student Evaluation: 4
  Test Use: 4
  Comparative Analysis: 3
Source
  Educational Measurement: Issues and Practice: 21
Author
  Wind, Stefanie A.: 2
  Ahmadi, Alireza: 1
  Albanese, Mark A.: 1
  Aray, Henry: 1
  Armstrong, Anne-Marie: 1
  Berenbon, Rebecca F.: 1
  Berry, Yufeng: 1
  Bridgeman, Brent: 1
  Burdick, Hal: 1
  Downing, Steven M.: 1
  Elmore, Jeff: 1
Publication Type
  Journal Articles: 21
  Reports - Research: 11
  Reports - Evaluative: 6
  Information Analyses: 3
  Speeches/Meeting Papers: 2
  Opinion Papers: 1
  Reports - Descriptive: 1
Education Level
  Elementary Education: 1
  Elementary Secondary Education: 1
  Grade 4: 1
  Higher Education: 1
  Intermediate Grades: 1
  Postsecondary Education: 1
Assessments and Surveys
  Graduate Record Examinations: 1
  Test of English as a Foreign Language: 1
  Watson Glaser Critical Thinking Appraisal: 1
Berenbon, Rebecca F.; McHugh, Bridget C. – Educational Measurement: Issues and Practice, 2023
To assemble a high-quality test, psychometricians rely on subject matter experts (SMEs) to write high-quality items. However, SMEs are not typically given the opportunity to provide input on which content standards are most suitable for multiple-choice questions (MCQs). In the present study, we explored the relationship between perceived MCQ…
Descriptors: Test Items, Multiple Choice Tests, Standards, Difficulty Level
Ersan, Ozge; Berry, Yufeng – Educational Measurement: Issues and Practice, 2023
The increasing use of computerization in the testing industry and the need for items potentially measuring higher-order skills have led educational measurement communities to develop technology-enhanced (TE) items and conduct validity studies on the use of TE items. Parallel to this goal, the purpose of this study was to collect validity evidence…
Descriptors: Computer Assisted Testing, Multiple Choice Tests, Elementary Secondary Education, Accountability
Xuelan Qiu; Jimmy de la Torre; You-Gan Wang; Jinran Wu – Educational Measurement: Issues and Practice, 2024
Multidimensional forced-choice (MFC) items have been found to be useful to reduce response biases in personality assessments. However, conventional scoring methods for the MFC items result in ipsative data, hindering the wider applications of the MFC format. In the last decade, a number of item response theory (IRT) models have been developed,…
Descriptors: Item Response Theory, Personality Traits, Personality Measures, Personality Assessment
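As context for why conventional scoring of MFC items yields ipsative data: rank-scoring a forced-choice block gives every respondent the same total across blocks, so the scores only compare traits within a person, never between persons. A minimal Python sketch; the blocks, traits, and rankings are hypothetical, for illustration only:

    # Conventional rank scoring of multidimensional forced-choice (MFC) blocks.
    # Each block asks the respondent to rank k statements; the statement ranked
    # most-like-me gets k points, the least gets 1. Hypothetical data.
    blocks = [
        {"conscientiousness": 3, "extraversion": 1, "agreeableness": 2},
        {"conscientiousness": 2, "extraversion": 3, "agreeableness": 1},
    ]

    totals = {}
    for block in blocks:
        for trait, rank in block.items():
            totals[trait] = totals.get(trait, 0) + rank

    print(totals)                # {'conscientiousness': 5, 'extraversion': 4, 'agreeableness': 3}
    print(sum(totals.values()))  # always n_blocks * (1 + 2 + 3) = 12: ipsative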
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to that of person fit analyses: to identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Aray, Henry; Pedauga, Luis – Educational Measurement: Issues and Practice, 2019
This article presents a novel experimental methodology in which groups of students were offered the choice between two equivalent scoring rules for assessing a multiple-choice test. The effect of choosing the scoring rule on marks is tested. Two major contributions arise from this research. First, it contributes to the literature on the…
Descriptors: Multiple Choice Tests, Scoring, Student Attitudes, Decision Making
Rafatbakhsh, Elaheh; Ahmadi, Alireza; Moloodi, Amirsaeid; Mehrpour, Saeed – Educational Measurement: Issues and Practice, 2021
Test development is a crucial yet difficult and time-consuming part of any educational system, and the task often falls entirely on teachers. Automatic item generation systems have recently drawn attention because they can reduce this burden and make test development more convenient. Such systems have been developed to generate items for vocabulary,…
Descriptors: Test Construction, Test Items, Computer Assisted Testing, Multiple Choice Tests
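To illustrate the general template-based approach such systems often take (a generic sketch, not the system described in the article), a vocabulary MCQ can be assembled by pairing a target word's definition with distractors sampled from the same word pool:

    import random

    # Hypothetical word pool; a real system would draw from a lexical database.
    pool = {
        "ubiquitous": "present or found everywhere",
        "ephemeral": "lasting for a very short time",
        "candid": "truthful and straightforward",
        "frugal": "sparing or economical with money",
    }

    def generate_item(target, pool, n_options=4):
        """Build one multiple-choice vocabulary item from a definition template."""
        distractors = random.sample([w for w in pool if w != target], n_options - 1)
        options = distractors + [target]
        random.shuffle(options)
        stem = f"Which word means '{pool[target]}'?"
        return stem, options, options.index(target)

    stem, options, key = generate_item("ephemeral", pool)
    print(stem)
    for i, opt in enumerate(options):
        print(f"  {chr(65 + i)}. {opt}")
    print("Key:", chr(65 + key))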
Moon, Jung Aa; Keehner, Madeleine; Katz, Irvin R. – Educational Measurement: Issues and Practice, 2019
The current study investigated how item formats and their inherent affordances influence test-takers' cognition under uncertainty. Adult participants solved content-equivalent math items in multiple-selection multiple-choice and four alternative grid formats. The results indicated that participants' affirmative response tendency (i.e., judge the…
Descriptors: Affordances, Test Items, Test Format, Test Wiseness
Wind, Stefanie A. – Educational Measurement: Issues and Practice, 2017
Mokken scale analysis (MSA) is a probabilistic-nonparametric approach to item response theory (IRT) that can be used to evaluate fundamental measurement properties with less strict assumptions than parametric IRT models. This instructional module provides an introduction to MSA as a probabilistic-nonparametric framework in which to explore…
Descriptors: Probability, Nonparametric Statistics, Item Response Theory, Scaling
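For orientation, the item-pair scalability coefficient at the heart of MSA compares the observed covariance of two dichotomous items with the maximum attainable given their difficulties; a value of 1 indicates no Guttman errors (no one passes the harder item while failing the easier one). A back-of-the-envelope sketch with hypothetical data, not taken from the module:

    import numpy as np

    # Hypothetical 0/1 response matrix: rows = persons, columns = items.
    X = np.array([[1, 1], [1, 0], [1, 1], [0, 0], [1, 0], [0, 0]])

    p = X.mean(axis=0)                       # item proportions correct
    cov = np.cov(X[:, 0], X[:, 1], bias=True)[0, 1]
    cov_max = min(p[0], p[1]) - p[0] * p[1]  # max covariance given the marginals
    H_ij = cov / cov_max                     # Loevinger/Mokken item-pair scalability
    print(round(H_ij, 2))                    # 1.0 here: no Guttman errors in X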
Wise, Steven L. – Educational Measurement: Issues and Practice, 2017
The rise of computer-based testing has brought with it the capability to measure more aspects of a test event than simply the answers selected or constructed by the test taker. One behavior that has drawn much research interest is the time test takers spend responding to individual multiple-choice items. In particular, very short response…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Items, Reaction Time
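A typical operational use of such response-time data is flagging rapid-guessing behavior against a per-item time threshold and summarizing test-taker effort as the proportion of non-rapid responses (response time effort). A minimal sketch; the 3-second threshold and the times are illustrative assumptions, not values from the article:

    # Flag likely rapid guesses: responses faster than a per-item threshold.
    # Operational methods derive thresholds from each item's response-time
    # distribution; a flat cutoff is used here only for illustration.
    THRESHOLD_SECONDS = 3.0

    response_times = {"item1": 42.0, "item2": 1.8, "item3": 27.5, "item4": 2.4}

    rapid_guesses = [item for item, t in response_times.items() if t < THRESHOLD_SECONDS]
    effort = 1 - len(rapid_guesses) / len(response_times)

    print(rapid_guesses)          # ['item2', 'item4']
    print(f"RTE = {effort:.2f}")  # proportion of solution-behavior responses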
Bridgeman, Brent – Educational Measurement: Issues and Practice, 2016
Scores on essay-based assessments that are part of standardized admissions tests are typically given relatively little weight in admissions decisions compared to the weight given to scores from multiple-choice assessments. Evidence is presented suggesting that more weight should be given to the essay-based assessments. The reliability of the writing scores…
Descriptors: Multiple Choice Tests, Scores, Standardized Tests, Comparative Analysis
Kosh, Audra E.; Greene, Jeffrey A.; Murphy, P. Karen; Burdick, Hal; Firetto, Carla M.; Elmore, Jeff – Educational Measurement: Issues and Practice, 2018
We explored the feasibility of using automated scoring to assess upper-elementary students' reading ability through analysis of transcripts of students' small-group discussions about texts. Participants included 35 fourth-grade students across two classrooms that engaged in a literacy intervention called Quality Talk. During the course of one…
Descriptors: Computer Assisted Testing, Small Group Instruction, Group Discussion, Student Evaluation
NCME 2008 Presidential Address: The Impact of Anchor Test Configuration on Student Proficiency Rates
Fitzpatrick, Anne R. – Educational Measurement: Issues and Practice, 2008
Examined in this study were the effects of reducing anchor test length on student proficiency rates for 12 multiple-choice tests administered in an annual, large-scale, high-stakes assessment. The anchor tests contained 15, 10, or 5 items. Five content-representative samples of items were drawn at each anchor test length from a…
Descriptors: Test Length, Multiple Choice Tests, Item Sampling, Student Evaluation
Frary, Robert B. – Educational Measurement: Issues and Practice, 1988
Formula scoring is designed to reduce multiple-choice test score irregularities due to guessing. It is inappropriate for most classroom testing, but may be desirable for speeded tests and difficult tests with low passing scores. An annotated bibliography and a Self-Test are provided. (SLD)
Descriptors: Multiple Choice Tests, Scoring, Testing Problems
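For reference, formula scoring corrects the number-right score for expected random guessing: FS = R - W/(k - 1), where R is the number right, W the number wrong (omitted items are not penalized), and k the number of options per item. A quick sketch:

    def formula_score(rights, wrongs, n_options):
        """Classic correction-for-guessing formula score: R - W/(k - 1)."""
        return rights - wrongs / (n_options - 1)

    # 40 right, 12 wrong, 8 omitted on a 60-item, 5-option test
    print(formula_score(40, 12, 5))  # 37.0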
Rodriguez, Michael C. – Educational Measurement: Issues and Practice, 2005
Multiple-choice items are a mainstay of achievement testing. Certifying achievement proficiency with meaningful, precise scores requires many high-quality items that adequately cover the content domain. More 3-option items can be administered than 4- or 5-option items in the same testing time while improving content coverage, without…
Descriptors: Psychometrics, Testing, Scores, Test Construction
Albanese, Mark A. – Educational Measurement: Issues and Practice, 1993
A comprehensive review is given of the evidence bearing on the recommendation to avoid complex multiple-choice (CMC) items. Avoiding Type K items (four primary responses and five secondary choices) seems warranted, but the evidence against CMC items in general is less clear. (SLD)
Descriptors: Cues, Difficulty Level, Multiple Choice Tests, Responses