ERIC - Search Results

Showing 1 to 15 of 39 results

Item-Writing Guidelines on Response Option Placement: A Systematic Review

Peer reviewed

Lions, Séverin; Blanco, María Paz; Dartnell, Pablo; Monsalve, Carlos; Ortega, Gabriel; Lemarié, Julie – Applied Measurement in Education, 2024

Multiple-choice items are universally used in formal education. Since they should assess learning, not test-wiseness or guesswork, they must be constructed following the highest possible standards. Hundreds of item-writing guides have provided guidelines to help test developers adopt appropriate strategies to define the distribution and sequence…

Descriptors: Test Construction, Multiple Choice Tests, Guidelines, Test Items

Modeling Dimensions Converging at the Upper Anchor in Learning Progressions: An Example of Micro-Evolution

Peer reviewed

Xue, Mingfeng; Wilson, Mark – Applied Measurement in Education, 2024

Multidimensionality is common in psychological and educational measurements. This study focuses on dimensions that converge at the upper anchor (i.e. the highest acquisition status defined in a learning progression) and compares different ways of dealing with them using the multidimensional random coefficients multinomial logit model and scale…

Descriptors: Learning Trajectories, Educational Assessment, Item Response Theory, Evolution

Detection of Outliers in Anchor Items Using Modified Rasch Fit Statistics

Peer reviewed

Liu, Chunyan; Jurich, Daniel; Morrison, Carol; Grabovsky, Irina – Applied Measurement in Education, 2021

The existence of outliers in the anchor items can be detrimental to the estimation of examinee ability and undermine the validity of score interpretation across forms. However, in practice, anchor item performance can become distorted due to various reasons. This study compares the performance of modified "INFIT" and "OUTFIT"…

Descriptors: Equated Scores, Test Items, Item Response Theory, Difficulty Level

Does the Response Options Placement Provide Clues to the Correct Answers in Multiple-Choice Tests? A Systematic Review

Peer reviewed

Lions, Séverin; Monsalve, Carlos; Dartnell, Pablo; Blanco, María Paz; Ortega, Gabriel; Lemarié, Julie – Applied Measurement in Education, 2022

Multiple-choice tests are widely used in education, often for high-stakes assessment purposes. Consequently, these tests should be constructed following the highest standards. Many efforts have been undertaken to advance item-writing guidelines intended to improve tests. One important issue is the unwanted effects of the options' position on test…

Descriptors: Multiple Choice Tests, High Stakes Tests, Test Construction, Guidelines

Effect of Sample Size on Common Item Equating Using the Dichotomous Rasch Model

Peer reviewed

O'Neill, Thomas R.; Gregg, Justin L.; Peabody, Michael R. – Applied Measurement in Education, 2020

This study addresses equating issues with varying sample sizes using the Rasch model by examining how sample size affects the stability of item calibrations and person ability estimates. A resampling design was used to create 9 sample size conditions (200, 100, 50, 45, 40, 35, 30, 25, and 20), each replicated 10 times. Items were recalibrated…

Descriptors: Sample Size, Equated Scores, Item Response Theory, Raw Scores

Dissecting Knowledge, Guessing, and Blunder in Multiple Choice Assessments

Peer reviewed

Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023

Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…

Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models

Application of IRT Fixed Parameter Calibration to Multiple-Group Test Data

Peer reviewed

Kim, Seonghoon; Kolen, Michael J. – Applied Measurement in Education, 2019

In applications of item response theory (IRT), fixed parameter calibration (FPC) has been used to estimate the item parameters of a new test form on the existing ability scale of an item pool. The present paper presents an application of FPC to multiple examinee groups test data that are linked to the item pool via anchor items, and investigates…

Descriptors: Item Response Theory, Item Banks, Test Items, Computation

Partial Credit in Answer-Until-Correct Multiple-Choice Tests Deployed in a Classroom Setting

Peer reviewed

Slepkov, Aaron D.; Godfrey, Alan T. K. – Applied Measurement in Education, 2019

The answer-until-correct (AUC) method of multiple-choice (MC) testing involves test respondents making selections until the keyed answer is identified. Despite attendant benefits that include improved learning, broad student adoption, and facile administration of partial credit, the use of AUC methods for classroom testing has been extremely…

Descriptors: Multiple Choice Tests, Test Items, Test Reliability, Scores

Are Multiple-Choice Items Too Fat?

Peer reviewed

Haladyna, Thomas M.; Rodriguez, Michael C.; Stevens, Craig – Applied Measurement in Education, 2019

The evidence is mounting regarding the guidance to employ more three-option multiple-choice items. From theoretical analyses, empirical results, and practical considerations, such items are of equal or higher quality than four- or five-option items, and more items can be administered to improve content coverage. This study looks at 58 tests,…

Descriptors: Multiple Choice Tests, Test Items, Testing Problems, Guessing (Tests)

Of Small Beauties and Large Beasts: The Quality of Distractors on Multiple-Choice Tests Is More Important than Their Quantity

Peer reviewed

Papenberg, Martin; Musch, Jochen – Applied Measurement in Education, 2017

In multiple-choice tests, the quality of distractors may be more important than their number. We therefore examined the joint influence of distractor quality and quantity on test functioning by providing a sample of 5,793 participants with five parallel test sets consisting of items that differed in the number and quality of distractors.…

Descriptors: Multiple Choice Tests, Test Items, Test Validity, Test Reliability

Differential Item Functioning for Accommodated Students with Disabilities: Effect of Differences in Proficiency Distributions

Peer reviewed

Quesen, Sarah; Lane, Suzanne – Applied Measurement in Education, 2019

This study examined the effect of similar vs. dissimilar proficiency distributions on uniform DIF detection on a statewide eighth grade mathematics assessment. Results from the similar- and dissimilar-ability reference groups with an SWD focal group were compared for four models: logistic regression, hierarchical generalized linear model (HGLM),…

Descriptors: Test Items, Mathematics Tests, Grade 8, Item Response Theory

An Empirical Comparison of DDF Detection Methods for Understanding the Causes of DIF in Multiple-Choice Items

Peer reviewed

Suh, Youngsuk; Talley, Anna E. – Applied Measurement in Education, 2015

This study compared and illustrated four differential distractor functioning (DDF) detection methods for analyzing multiple-choice items. The log-linear approach, two item response theory-model-based approaches with likelihood ratio tests, and the odds ratio approach were compared to examine the congruence among the four DDF detection methods.…

Descriptors: Test Bias, Multiple Choice Tests, Test Items, Methods

Evaluating the Psychometric Characteristics of Generated Multiple-Choice Test Items

Peer reviewed

Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André – Applied Measurement in Education, 2016

Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…

Descriptors: Psychometrics, Multiple Choice Tests, Test Items, Item Analysis

Science Assessments and English Language Learners: Validity Evidence Based on Response Processes

Peer reviewed

Noble, Tracy; Rosebery, Ann; Suarez, Catherine; Warren, Beth; O'Connor, Mary Catherine – Applied Measurement in Education, 2014

English language learners (ELLs) and their teachers, schools, and communities face increasingly high-stakes consequences due to test score gaps between ELLs and non-ELLs. It is essential that the field of educational assessment continue to investigate the meaning of these test score gaps. This article discusses the findings of an exploratory study…

Descriptors: English Language Learners, Evidence, Educational Assessment, Achievement Gap

Determining the Anchor Composition for a Mixed-Format Test: Evaluation of Subpopulation Invariance of Linking Functions

Peer reviewed

Kim, Sooyeon; Walker, Michael – Applied Measurement in Education, 2012

This study examined the appropriateness of the anchor composition in a mixed-format test, which includes both multiple-choice (MC) and constructed-response (CR) items, using subpopulation invariance indices. Linking functions were derived in the nonequivalent groups with anchor test (NEAT) design using two types of anchor sets: (a) MC only and (b)…

Descriptors: Multiple Choice Tests, Test Format, Test Items, Equated Scores
