ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	3
Since 2006 (last 20 years)	3

Descriptor

Comparative Analysis	8
Test Items	8
Test Construction	4
Item Response Theory	3
Test Format	3
Educational Assessment	2
Equated Scores	2
Foreign Countries	2
International Education	2
Scores	2
Bayesian Statistics	1
Bilingualism	1
College Entrance Examinations	1
Correlation	1
Cross Cultural Studies	1
Cutting Scores	1
Difficulty Level	1
English (Second Language)	1
Evaluation Methods	1
Graduate Study	1
Graphs	1
Interviews	1
Language	1
Language Tests	1
Learning Modules	1

Source

Educational Measurement:…

Author

Bridgeman, Brent	1
Hambleton, Ronald K.	1
Jones, Russell W.	1
Loomis, Susan Cooper	1
O'Leary, Michael	1
Senturk, Deniz	1
Sinharay, Sandip	1
Sireci, Stephen G.	1
Wainer, Howard	1
Wang, Joyce	1
Wyse, Adam E.	1
Zwick, Rebecca	1

Publication Type

Journal Articles	8
Reports - Evaluative	3
Reports - Research	3
Reports - Descriptive	2
Guides - Non-Classroom	1

Education Level

Higher Education	1
Postsecondary Education	1

Location

Canada	1
Ireland	1
Israel	1

Assessments and Surveys

Graduate Record Examinations	1
National Assessment of…	1
Test of English as a Foreign…	1
Trends in International…	1

Showing all 8 results

On the Choice of Anchor Tests in Equating

Peer reviewed

Sinharay, Sandip – Educational Measurement: Issues and Practice, 2018

The choice of anchor tests is crucial in applications of the nonequivalent groups with anchor test design of equating. Sinharay and Holland (2006, 2007) suggested "miditests," which are anchor tests that are content-representative and have the same mean item difficulty as the total test but have a smaller spread of item difficulties.…

Descriptors: Test Content, Difficulty Level, Test Items, Test Construction

Five Methods for Estimating Angoff Cut Scores with IRT

Peer reviewed

Wyse, Adam E. – Educational Measurement: Issues and Practice, 2017

This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a posteriori (EAP), modal a posteriori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…

Descriptors: Cutting Scores, Item Response Theory, Bayesian Statistics, Maximum Likelihood Statistics

Can a Two-Question Test Be Reliable and Valid for Predicting Academic Outcomes?

Peer reviewed

Bridgeman, Brent – Educational Measurement: Issues and Practice, 2016

Scores on essay-based assessments that are part of standardized admissions tests are typically given relatively little weight in admissions decisions compared to the weight given to scores from multiple-choice assessments. Evidence is presented to suggest that more weight should be given to these assessments. The reliability of the writing scores…

Descriptors: Multiple Choice Tests, Scores, Standardized Tests, Comparative Analysis

An Investigation of Alternative Methods for Item Mapping in the National Assessment of Educational Progress.

Peer reviewed

Zwick, Rebecca; Senturk, Deniz; Wang, Joyce; Loomis, Susan Cooper – Educational Measurement: Issues and Practice, 2001

Compared four item mapping methods using data from the physical science test of the National Assessment of Educational Progress and studied the opinions of science content area experts about the difficulty of the items through a survey completed by 148 science teachers or scientists. Results of model-based mapping methods were more concordant with…

Descriptors: Comparative Analysis, Physical Sciences, Science Teachers, Science Tests

Comparing the Incomparable: An Essay on the Importance of Big Assumptions and Scant Evidence.

Peer reviewed

Wainer, Howard – Educational Measurement: Issues and Practice, 1999

Discusses the comparison of groups of individuals who were administered different forms of a test. Focuses on the situation in which there is little overlap in content between the test forms. Reviews equating problems in national tests in Canada and Israel. (SLD)

Descriptors: Comparative Analysis, Equated Scores, Foreign Countries, National Competency Tests

An NCME Instructional Module on Comparison of Classical Test Theory and Item Response Theory and Their Applications to Test Development.

Peer reviewed

Hambleton, Ronald K.; Jones, Russell W. – Educational Measurement: Issues and Practice, 1993

This National Council on Measurement in Education (NCME) instructional module compares classical test theory and item response theory and describes their applications in test development. Related concepts, models, and methods are explored; and advantages and disadvantages of each framework are reviewed. (SLD)

Descriptors: Comparative Analysis, Educational Assessment, Graphs, Item Response Theory

Stability of Country Rankings across Item Formats in the Third International Mathematics and Science Study.

Peer reviewed

O'Leary, Michael – Educational Measurement: Issues and Practice, 2002

Examined the performance of Irish students on multiple-choice, short-answer, and extended-response item sets from the Third International Mathematics and Science Study to determine whether Ireland's relative rank among the more than 40 countries involved remained stable. Findings provide additional evidence that comparing student achievement…

Descriptors: Comparative Analysis, Foreign Countries, International Education, Mathematics Achievement

Problems and Issues in Linking Assessments across Languages.

Peer reviewed

Sireci, Stephen G. – Educational Measurement: Issues and Practice, 1997

Different methodologies for linking tests across languages are reviewed and evaluated, focusing on monolingual item response theory, bilingual group designs, and matched monolingual group designs. These methods, although not without weaknesses, are superior in promoting score comparability to methods that rely on translation or expert judgment…

Descriptors: Bilingualism, Comparative Analysis, Cross Cultural Studies, Educational Assessment