Gorgun, Guher; Bulut, Okan – Educational Measurement: Issues and Practice, 2025
Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet the evaluation of item quality remains a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large language models, specifically Llama 3-8B, for…
Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation
Student, Sanford R.; Gong, Brian – Educational Measurement: Issues and Practice, 2022
We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from…
Descriptors: Science Tests, Test Validity, Test Items, Test Construction
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
Arslan, Burcu; Jiang, Yang; Keehner, Madeleine; Gong, Tao; Katz, Irvin R.; Yan, Fred – Educational Measurement: Issues and Practice, 2020
Computer-based educational assessments often include items that involve drag-and-drop responses. There are different ways that drag-and-drop items can be laid out and different choices that test developers can make when designing these items. Currently, these decisions are based on experts' professional judgments and design constraints, rather…
Descriptors: Test Items, Computer Assisted Testing, Test Format, Decision Making
Wise, Steven L. – Educational Measurement: Issues and Practice, 2017
The rise of computer-based testing has brought with it the capability to measure more aspects of a test event than simply the answers selected or constructed by the test taker. One behavior that has drawn much research interest is the time test takers spend responding to individual multiple-choice items. In particular, very short response…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Items, Reaction Time
Gierl, Mark J.; Lai, Hollis – Educational Measurement: Issues and Practice, 2016
Testing organizations need large numbers of high-quality items due to the proliferation of alternative test administration methods and modern test designs. But the current demand for items far exceeds the supply. Test items, as they are currently written, evoke a process that is both time-consuming and expensive because each item is written,…
Descriptors: Test Items, Test Construction, Psychometrics, Models
Johnstone, Christopher J.; Thompson, Sandra J.; Bottsford-Miller, Nicole A.; Thurlow, Martha L. – Educational Measurement: Issues and Practice, 2008
Test items undergo multiple iterations of review before states and vendors deem them acceptable to be placed in a live statewide assessment. This article reviews three approaches that can add validity evidence to states' item review processes. The first process is a structured sensitivity review process that focuses on universal design…
Descriptors: Test Items, Disabilities, Test Construction, Testing Programs
Lu, Ying; Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2007
Speededness refers to the situation where the time limits on a standardized test do not allow substantial numbers of examinees to fully consider all test items. When tests are not intended to measure speed of responding, speededness introduces a severe threat to the validity of interpretations based on test scores. In this article, we describe…
Descriptors: Test Items, Timed Tests, Standardized Tests, Test Validity

Bond, Lloyd – Educational Measurement: Issues and Practice, 1987
This article suggests that mechanical application of Golden Rule-like procedures is inappropriate. The fundamental idea embodied in them, namely taking issues of equity into account in test construction, may reasonably be pursued without doing violence to test validity. (JAZ)
Descriptors: Court Litigation, Item Analysis, Minority Groups, Standards

Wilson, Sandra Meachan; Hiscox, Michael D. – Educational Measurement: Issues and Practice, 1984
This article presents a model that can be used by local school districts for reanalyzing standardized test results to obtain a more valid assessment of local learning objectives. The model can be used to identify strengths and weaknesses of existing programs as well as of individual students. (EGS)
Descriptors: Educational Objectives, Item Analysis, Models, School Districts

Carter, Kathy – Educational Measurement: Issues and Practice, 1986
This article discusses the validity issue in teacher-made tests. Seventh-grade students' comments about their responses to a test designed to illustrate faulty items suggest that students are quite proficient in using secondary clues to figure out correct answers. Teacher comments suggest teachers are unaware they provide such clues. (Author/JAZ)
Descriptors: Cues, Grade 7, Item Analysis, Junior High Schools

Linn, Robert L.; Drasgow, Fritz – Educational Measurement: Issues and Practice, 1987
This article discusses the application of the Golden Rule procedure to items of the Scholastic Aptitude Test. Using item response theory, the analyses indicate that the Golden Rule procedures are ineffective in detecting biased items and may undermine the reliability and validity of tests. (Author/JAZ)
Descriptors: College Entrance Examinations, Difficulty Level, Item Analysis, Latent Trait Theory

Frisbie, David A. – Educational Measurement: Issues and Practice, 1992
Literature related to the multiple true-false (MTF) item format is reviewed. Each answer cluster of an MTF item may have several true items, and the correctness of each is judged independently. MTF tests appear efficient and reliable, although they are a bit harder than multiple-choice items for examinees. (SLD)
Descriptors: Achievement Tests, Difficulty Level, Literature Reviews, Multiple Choice Tests

Yen, Wendy M.; And Others – Educational Measurement: Issues and Practice, 1987
This paper discusses how to maintain the integrity of national normative information for achievement tests when the test that is administered has been customized to satisfy local needs and is not a test that has been nationally normed. Alternative procedures for item selection and calibration are examined. (Author/LMO)
Descriptors: Achievement Tests, Elementary Secondary Education, Goodness of Fit, Item Analysis

Jolly, S. Jean; Gramenz, Gary W. – Educational Measurement: Issues and Practice, 1984
A norm-referenced achievement test, in combination with supplementary items, can be used to produce norm-referenced data as well as objective-referenced data. The experiences of the Palm Beach County (Florida) school district in developing and using such a test are described. (EGS)
Descriptors: Achievement Tests, Criterion Referenced Tests, Elementary Secondary Education, Item Analysis