Showing all 13 results
Peer reviewed
Fu, Yanyan; Choe, Edison M.; Lim, Hwanggyu; Choi, Jaehwa – Educational Measurement: Issues and Practice, 2022
This case study applied the "weak theory" of Automatic Item Generation (AIG) to generate isomorphic item instances (i.e., unique but psychometrically equivalent items) for a large-scale assessment. Three representative instances were selected from each item template (i.e., model) and pilot-tested. In addition, a new analytical framework,…
Descriptors: Test Items, Measurement, Psychometrics, Test Construction
Peer reviewed
Berenbon, Rebecca F.; McHugh, Bridget C. – Educational Measurement: Issues and Practice, 2023
To assemble a high-quality test, psychometricians rely on subject matter experts (SMEs) to write high-quality items. However, SMEs are not typically given the opportunity to provide input on which content standards are most suitable for multiple-choice questions (MCQs). In the present study, we explored the relationship between perceived MCQ…
Descriptors: Test Items, Multiple Choice Tests, Standards, Difficulty Level
Peer reviewed
Joo, Seang-Hwane; Khorramdel, Lale; Yamamoto, Kentaro; Shin, Hyo Jeong; Robin, Frederic – Educational Measurement: Issues and Practice, 2021
In Programme for International Student Assessment (PISA), item response theory (IRT) scaling is used to examine the psychometric properties of items and scales and to provide comparable test scores across participating countries and over time. To balance the comparability of IRT item parameter estimations across countries with the best possible…
Descriptors: Foreign Countries, International Assessment, Achievement Tests, Secondary School Students
Peer reviewed
Moon, Jung Aa; Keehner, Madeleine; Katz, Irvin R. – Educational Measurement: Issues and Practice, 2019
The current study investigated how item formats and their inherent affordances influence test-takers' cognition under uncertainty. Adult participants solved content-equivalent math items in multiple-selection multiple-choice and four alternative grid formats. The results indicated that participants' affirmative response tendency (i.e., judge the…
Descriptors: Affordances, Test Items, Test Format, Test Wiseness
Peer reviewed
Embretson, Susan E. – Educational Measurement: Issues and Practice, 2016
Examinees' thinking processes have become an increasingly important concern in testing. The response processes aspect is a major component of validity, and contemporary tests increasingly involve specifications about the cognitive complexity of examinees' response processes. Yet, empirical research findings on examinees' cognitive processes are…
Descriptors: Testing, Cognitive Processes, Test Construction, Test Items
Peer reviewed
Jerrim, John; Parker, Philip; Choi, Alvaro; Chmielewski, Anna Katyn; Sälzer, Christine; Shure, Nikki – Educational Measurement: Issues and Practice, 2018
The Programme for International Student Assessment (PISA) is an important international study of 15-year-olds' knowledge and skills. New results are released every three years and have a substantial impact upon education policy. Yet, despite its influence, the methodology underpinning PISA has received significant criticism. Much of this criticism has…
Descriptors: Educational Assessment, Comparative Education, Achievement Tests, Foreign Countries
Peer reviewed
Gierl, Mark J.; Lai, Hollis – Educational Measurement: Issues and Practice, 2016
Testing organizations need large numbers of high-quality items due to the proliferation of alternative test administration methods and modern test designs. But the current demand for items far exceeds the supply. Test items, as they are currently written, are produced through a process that is both time-consuming and expensive because each item is written,…
Descriptors: Test Items, Test Construction, Psychometrics, Models
Peer reviewed
Dorans, Neil J. – Educational Measurement: Issues and Practice, 2012
Views on testing--its purpose and uses and how its data are analyzed--are related to one's perspective on test takers. Test takers can be viewed as learners, examinees, or contestants. I briefly discuss the perspective of test takers as learners. I maintain that much of psychometrics views test takers as examinees. I discuss test takers as a…
Descriptors: Testing, Test Theory, Item Response Theory, Test Reliability
Peer reviewed
Sinharay, Sandip; Dorans, Neil J.; Liang, Longjuan – Educational Measurement: Issues and Practice, 2011
Over the past few decades, those who take tests in the United States have exhibited increasing diversity with respect to native language. Standard psychometric procedures for ensuring item and test fairness that have existed for some time were developed when test-taking groups were predominantly native English speakers. A better understanding of…
Descriptors: Test Bias, Testing Programs, Psychometrics, Language Proficiency
Peer reviewed
Oshima, T. C.; Morris, S. B. – Educational Measurement: Issues and Practice, 2008
Nambury S. Raju (1937-2005) developed two model-based indices for differential item functioning (DIF) during his prolific career in psychometrics. Both methods, Raju's area measures (Raju, 1988) and Raju's DFIT (Raju, van der Linden, & Fleer, 1995), are based on quantifying the gap between item characteristic functions (ICFs). This approach…
Descriptors: Test Bias, Psychometrics, Methods, Test Items
Peer reviewed
Polikoff, Morgan S. – Educational Measurement: Issues and Practice, 2010
Standards-based reform, as codified by the No Child Left Behind Act, relies on the ability of assessments to accurately reflect the learning that takes place in U.S. classrooms. However, this property of assessments--their instructional sensitivity--is rarely, if ever, investigated by test developers, states, or researchers. In this paper, the…
Descriptors: Federal Legislation, Psychometrics, Accountability, Teaching Methods
Peer reviewed
Raymond, Mark R.; Neustel, Sandra; Anderson, Dan – Educational Measurement: Issues and Practice, 2009
Examinees who take high-stakes assessments are usually given an opportunity to repeat the test if they are unsuccessful on their initial attempt. To prevent examinees from obtaining unfair score increases by memorizing the content of specific test items, testing agencies usually assign a different test form to repeat examinees. The use of multiple…
Descriptors: Test Results, Test Items, Testing, Aptitude Tests
Peer reviewed
Bock, R. Darrell – Educational Measurement: Issues and Practice, 1997
This brief history traces the development of item response theory (IRT) from concepts originating in 19th-century mathematics and psychology to present-day principles drawn from statistical estimation theory. Connections to other fields and current trends in IRT are outlined. (SLD)
Descriptors: Estimation (Mathematics), History, Item Response Theory, Psychometrics