ERIC - Search Results

Showing all 13 results

Equating with Small and Unbalanced Samples

Peer reviewed

Goodman, Joshua T.; Dallas, Andrew D.; Fan, Fen – Applied Measurement in Education, 2020

Recent research has suggested that re-setting the standard for each administration of a small sample examination, in addition to the high cost, does not adequately maintain similar performance expectations year after year. Small-sample equating methods have shown promise with samples between 20 and 30. For groups that have fewer than 20 students,…

Descriptors: Equated Scores, Sample Size, Sampling, Weighted Scores

A Stepwise Test Characteristic Curve Method to Detect Item Parameter Drift

Peer reviewed

Guo, Rui; Zheng, Yi; Chang, Hua-Hua – Journal of Educational Measurement, 2015

An important assumption of item response theory is item parameter invariance. Sometimes, however, item parameters are not invariant across different test administrations due to factors other than sampling error; this phenomenon is termed item parameter drift. Several methods have been developed to detect drifted items. However, most of the…

Descriptors: Item Response Theory, Test Items, Evaluation Methods, Equated Scores

Psychometric Consequences of Subpopulation Item Parameter Drift

Peer reviewed

Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2017

This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…

Descriptors: Psychometrics, Test Items, Item Response Theory, Hypothesis Testing

Simulate to Understand Models, Not Nature. Research Report. ETS RR-14-16

Peer reviewed

Dorans, Neil J. – ETS Research Report Series, 2014

Simulations are widely used. Simulations produce numbers that are deductive demonstrations of what a model says will happen. They produce numerical results that are consistent with the premises of the model used to generate the numbers. These simulated numerical results are not empirical data that address aspects of the world that lies outside the…

Descriptors: Simulation, Equated Scores, Scores, Scientific Methodology

Robust Scale Transformation Methods in IRT True Score Equating under Common-Item Nonequivalent Groups Design

He, Yong – ProQuest LLC, 2013

Common test items play an important role in equating multiple test forms under the common-item nonequivalent groups design. Inconsistent item parameter estimates among common items can lead to large bias in equated scores for IRT true score equating. Current methods extensively focus on detection and elimination of outlying common items, which…

Descriptors: Test Items, Regression (Statistics), Simulation, Comparative Analysis

The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating

Peer reviewed

Öztürk-Gübes, Nese; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2016

The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…

Descriptors: Test Format, Item Response Theory, True Scores, Equated Scores

A Comparison of IRT Linking Procedures

Peer reviewed

Lee, Won-Chan; Ban, Jae-Chun – Applied Measurement in Education, 2010

Various applications of item response theory often require linking to achieve a common scale for item parameter estimates obtained from different groups. This article used a simulation to examine the relative performance of four different item response theory (IRT) linking procedures in a random groups equating design: concurrent calibration with…

Descriptors: Item Response Theory, Simulation, Comparative Analysis, Measurement Techniques

Using the Kernel Method of Test Equating for Estimating the Standard Errors of Population Invariance Measures

Peer reviewed

Moses, Tim – Journal of Educational and Behavioral Statistics, 2008

Equating functions are supposed to be population invariant, meaning that the choice of subpopulation used to compute the equating function should not matter. The extent to which equating functions are population invariant is typically assessed in terms of practical difference criteria that do not account for equating functions' sampling…

Descriptors: Equated Scores, Error of Measurement, Sampling, Evaluation Methods

Anchor Test Type and Population Invariance: An Exploration across Subpopulations and Test Administrations

Peer reviewed

Dorans, Neil J.; Liu, Jinghua; Hammond, Shelby – Applied Psychological Measurement, 2008

This exploratory study was built on research spanning three decades. Petersen, Marco, and Stewart (1982) conducted a major empirical investigation of the efficacy of different equating methods. The studies reported in Dorans (1990) examined how different equating methods performed across samples selected in different ways. Recent population…

Descriptors: Test Format, Equated Scores, Sampling, Evaluation Methods

The Effectiveness of Circular Equating as a Criterion for Evaluating Equating.

Peer reviewed

Wang, Tianyou; Hanson, Bradley A.; Harris, Deborah J. – Applied Psychological Measurement, 2000

Studied whether circular equating could provide an adequate measure of various types of equating error when applied to different equating methods under different equating designs. Analyses and simulations show that circular equating is generally invalid as a criterion to evaluate the adequacy of equating. (SLD)

Descriptors: Criteria, Equated Scores, Error of Measurement, Evaluation Methods

Accuracy of Random Groups Equating with Very Small Samples

Peer reviewed

Skaggs, Gary – Journal of Educational Measurement, 2005

This study investigated the effectiveness of equating with very small samples using the random groups design. Of particular interest was equating accuracy at specific scores where performance standards might be set. Two sets of simulations were carried out, one in which the two forms were identical and one in which they differed by a tenth of a…

Descriptors: Equated Scores, Simulation, Performance Based Assessment, Evaluation Methods

The Effectiveness of Circular Equating as a Criterion for Evaluating Equating.

Wang, Tianyou; Hanson, Bradley A.; Harris, Deborah J. – 1998

Equating a test form to itself through a chain of equatings, commonly referred to as circular equating, has been widely used as a criterion to evaluate the adequacy of equating. This paper uses both analytical methods and simulation methods to show that this criterion is in general invalid in serving this purpose. For the random groups design done…

Descriptors: Equated Scores, Evaluation Methods, Heuristics, Sampling

Choice of Anchor Test in Equating. Research Report. ETS RR-06-35

Peer reviewed

Sinharay, Sandip; Holland, Paul – ETS Research Report Series, 2006

It is a widely held belief that anchor tests should be miniature versions (i.e., minitests), with respect to content and statistical characteristics of the tests being equated. This paper examines the foundations for this belief. It examines the requirement of statistical representativeness of anchor tests that are content representative. The…

Descriptors: Test Items, Equated Scores, Evaluation Methods, Difficulty Level
