ERIC - Search Results

Publication Date

In 2025	0
Since 2024	2
Since 2021 (last 5 years)	5
Since 2016 (last 10 years)	8
Since 2006 (last 20 years)	19

Descriptor

Test Construction	45
Test Items	45
Multiple Choice Tests	13
Item Response Theory	12
Test Format	12
Difficulty Level	8
Scores	7
Classification	6
Foreign Countries	6
Item Analysis	6
Psychometrics	6
Scoring	6
Test Content	5
Test Validity	5
Validity	5
Achievement Tests	4
Error of Measurement	4
High School Students	4
Higher Education	4
Mathematics Tests	4
Responses	4
College Entrance Examinations	3
College Students	3
Computer Assisted Testing	3
Computer Simulation	3
More ▼

Source

Applied Measurement in…

Publication Type

Journal Articles	45
Reports - Research	23
Reports - Evaluative	18
Information Analyses	5
Reports - Descriptive	1

Education Level

Elementary Secondary Education	3
High Schools	3
Higher Education	3
Postsecondary Education	3
Secondary Education	3
Elementary Education	2
Grade 8	2
Junior High Schools	2
Middle Schools	2
Grade 11	1
Grade 3	1
Grade 5	1
Grade 6	1
Grade 9	1
More ▼

Audience

Location

Canada	4
Australia	1
Turkey	1

Laws, Policies, & Programs

Assessments and Surveys

Advanced Placement…	1
Graduate Record Examinations	1
SAT (College Admission Test)	1

What Works Clearinghouse Rating

Showing 1 to 15 of 45 results Save | Export

Item-Writing Guidelines on Response Option Placement: A Systematic Review

Peer reviewed

Direct link

Séverin Lions; María Paz Blanco; Pablo Dartnell; Carlos Monsalve; Gabriel Ortega; Julie Lemarié – Applied Measurement in Education, 2024

Multiple-choice items are universally used in formal education. Since they should assess learning, not test-wiseness or guesswork, they must be constructed following the highest possible standards. Hundreds of item-writing guides have provided guidelines to help test developers adopt appropriate strategies to define the distribution and sequence…

Descriptors: Test Construction, Multiple Choice Tests, Guidelines, Test Items

Using Content Relevance and Representativeness Indices in Instrument Revision

Peer reviewed

Direct link

Anne Traynor; Sara C. Christopherson – Applied Measurement in Education, 2024

Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…

Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction

Using Think-Alouds for Response Process Evidence of Teacher Attentiveness

Peer reviewed

Direct link

Mo, Ya; Carney, Michele; Cavey, Laurie; Totorica, Tatia – Applied Measurement in Education, 2021

There is a need for assessment items that assess complex constructs but can also be efficiently scored for evaluation of teacher education programs. In an effort to measure the construct of teacher attentiveness in an efficient and scalable manner, we are using exemplar responses elicited by constructed-response item prompts to develop…

Descriptors: Protocol Analysis, Test Items, Responses, Mathematics Teachers

Does the Response Options Placement Provide Clues to the Correct Answers in Multiple-Choice Tests? A Systematic Review

Peer reviewed

Direct link

Lions, Séverin; Monsalve, Carlos; Dartnell, Pablo; Blanco, María Paz; Ortega, Gabriel; Lemarié, Julie – Applied Measurement in Education, 2022

Multiple-choice tests are widely used in education, often for high-stakes assessment purposes. Consequently, these tests should be constructed following the highest standards. Many efforts have been undertaken to advance item-writing guidelines intended to improve tests. One important issue is the unwanted effects of the options' position on test…

Descriptors: Multiple Choice Tests, High Stakes Tests, Test Construction, Guidelines

Dissecting Knowledge, Guessing, and Blunder in Multiple Choice Assessments

Peer reviewed

Direct link

Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023

Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…

Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models

Prescribing Structure for Validation Arguments: Elemental, Structural, and Ecological Validity

Peer reviewed

Direct link

Jacobson, Erik; Svetina, Dubravka – Applied Measurement in Education, 2019

Contingent argument-based approaches to validity require a unique argument for each use, in contrast to more prescriptive approaches that identify the common kinds of validity evidence researchers should consider for every use. In this article, we evaluate our use of an approach that is both prescriptive "and" argument-based to develop a…

Descriptors: Test Validity, Test Items, Test Construction, Test Interpretation

The Effect of Anchor Test Construction on Scale Drift

Peer reviewed

Direct link

Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014

In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…

Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory

Evaluating the Psychometric Characteristics of Generated Multiple-Choice Test Items

Peer reviewed

Direct link

Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André – Applied Measurement in Education, 2016

Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…

Descriptors: Psychometrics, Multiple Choice Tests, Test Items, Item Analysis

The Effect of Changing Content on IRT Scaling Methods

Peer reviewed

Direct link

Keller, Lisa A.; Keller, Robert R. – Applied Measurement in Education, 2015

Equating test forms is an essential activity in standardized testing, with increased importance with the accountability systems in existence through the mandate of Adequate Yearly Progress. It is through equating that scores from different test forms become comparable, which allows for the tracking of changes in the performance of students from…

Descriptors: Item Response Theory, Rating Scales, Standardized Tests, Scoring Rubrics

Conceptualizing and Measuring Computer and Information Literacy in Cross-National Contexts

Peer reviewed

Direct link

Ainley, John; Fraillon, Julian; Schulz, Wolfram; Gebhardt, Eveline – Applied Measurement in Education, 2016

The development of information technologies has transformed the environment in which young people access, create, and share information. Many countries, having recognized the imperative of digital technology, acknowledge the need to educate young people in the use of these technologies so as to underpin economic and social benefits. This article…

Descriptors: Cross Cultural Studies, Information Literacy, Computer Literacy, Grade 8

An Experimental Test of Student Verbal Reports and Teacher Evaluations as a Source of Validity Evidence for Test Development

Peer reviewed

Direct link

Leighton, Jacqueline P.; Heffernan, Colleen; Cor, M. Kenneth; Gokiert, Rebecca J.; Cui, Ying – Applied Measurement in Education, 2011

The "Standards for Educational and Psychological Testing" indicate that test instructions, and by extension item objectives, presented to examinees should be sufficiently clear and detailed to help ensure that they respond as developers intend them to respond (Standard 3.20; AERA, APA, & NCME, 1999). The present study investigates…

Descriptors: Test Construction, Validity, Evidence, Science Tests

Claims, Evidence, and Achievement-Level Descriptors as a Foundation for Item Design and Test Specifications

Peer reviewed

Direct link

Hendrickson, Amy; Huff, Kristen; Luecht, Richard – Applied Measurement in Education, 2010

Evidence-centered assessment design (ECD) explicates a transparent evidentiary argument to warrant the inferences we make from student test performance. This article describes how the vehicles for gathering student evidence--task models and test specifications--are developed. Task models, which are the basis for item development, flow directly…

Descriptors: Evidence, Test Construction, Measurement, Classification

Validity of the Simultaneous Approach to the Development of Equivalent Achievement Tests in English and French

Peer reviewed

Direct link

Rogers, W. Todd; Lin, Jie; Rinaldi, Christia M. – Applied Measurement in Education, 2011

The evidence gathered in the present study supports the use of the simultaneous development of test items for different languages. The simultaneous approach used in the present study involved writing an item in one language (e.g., French) and, before moving to the development of a second item, translating the item into the second language (e.g.,…

Descriptors: Test Items, Item Analysis, Achievement Tests, French

Estimating Non-Normal Latent Trait Distributions within Item Response Theory Using True and Estimated Item Parameters

Peer reviewed

Direct link

Sass, D. A.; Schmitt, T. A.; Walker, C. M. – Applied Measurement in Education, 2008

Item response theory (IRT) procedures have been used extensively to study normal latent trait distributions and have been shown to perform well; however, less is known concerning the performance of IRT with non-normal latent trait distributions. This study investigated the degree of latent trait estimation error under normal and non-normal…

Descriptors: Difficulty Level, Item Response Theory, Test Items, Computation

Item Position and Item Difficulty Change in an IRT-Based Common Item Equating Design

Peer reviewed

Direct link

Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009

In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…

Descriptors: Test Items, Test Content, Testing Programs, Simulation

Previous Page | Next Page »

Pages: 1 | 2 | 3

Downing, Steven M.	5
Haladyna, Thomas M.	4
Frary, Robert B.	2
Gierl, Mark J.	2
Sireci, Stephen G.	2
Way, Walter D.	2
Abu-Ghazalah, Rashid M.	1
Ainley, John	1
Anne Traynor	1
Antal, Judit	1
Ascalon, M. Evelina	1
Bandalos, Deborah L.	1
Bennett, Randy Elliot	1
Berberoglu, Giray	1
Blanco, María Paz	1
Boulais, André-Philippe	1
Carlos Monsalve	1
Carlton, Sydell T.	1
Carney, Michele	1
Cavey, Laurie	1
Chang, Lei	1
Chen, Yu-Jen	1
Cheng, Chien-Fen	1
Cor, M. Kenneth	1
More ▼