ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	5
Since 2006 (last 20 years)	23

Descriptor

Equated Scores	35
Simulation	35
Test Items	35
Item Response Theory	20
Comparative Analysis	14
Difficulty Level	12
Error of Measurement	12
Statistical Analysis	8
Sample Size	7
Test Construction	6
Test Format	6
Evaluation Methods	5
Correlation	4
Item Analysis	4
Models	4
Multiple Choice Tests	4
Psychometrics	4
Test Length	4
Cutting Scores	3
Goodness of Fit	3
Latent Trait Theory	3
Measurement	3
Measurement Techniques	3
Sampling	3
Scores	3
More ▼

Source

ETS Research Report Series	7
Journal of Educational…	7
ProQuest LLC	5
Applied Measurement in…	3
Educational and Psychological…	3
Journal of Applied Measurement	2
College Board	1
Practical Assessment,…	1
Research Matters	1

Publication Type

Journal Articles	24
Reports - Research	22
Reports - Evaluative	6
Dissertations/Theses -…	5
Speeches/Meeting Papers	3
Non-Print Media	1
Numerical/Quantitative Data	1
Reference Materials - General	1
Reports - Descriptive	1

Education Level

Higher Education	2
Postsecondary Education	2
Elementary Education	1
Grade 4	1
Intermediate Grades	1

Audience

Location

Florida

Laws, Policies, & Programs

Assessments and Surveys

ACT Assessment	2
Advanced Placement…	1
Florida Comprehensive…	1
Test of English as a Foreign…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 35 results Save | Export

Impacts of Differences in Group Abilities and Anchor Test Features on Three Non-IRT Test Equating Methods

Peer reviewed
PDF on ERIC

Download full text

Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024

The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, Postratification kernel equating, and Circle-arc equating were studied. A college admissions test with four different…

Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests

Comparing Small-Sample Equating with Angoff Judgement for Linking Cut-Scores on Two Tests

Download full text

Bramley, Tom – Research Matters, 2020

The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…

Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy

Equating with Miditests Using IRT

Peer reviewed

Direct link

Fitzpatrick, Joseph; Skorupski, William P. – Journal of Educational Measurement, 2016

The equating performance of two internal anchor test structures--miditests and minitests--is studied for four IRT equating methods using simulated data. Originally proposed by Sinharay and Holland, miditests are anchors that have the same mean difficulty as the overall test but less variance in item difficulties. Four popular IRT equating methods…

Descriptors: Difficulty Level, Test Items, Comparative Analysis, Test Construction

A Stepwise Test Characteristic Curve Method to Detect Item Parameter Drift

Peer reviewed

Direct link

Guo, Rui; Zheng, Yi; Chang, Hua-Hua – Journal of Educational Measurement, 2015

An important assumption of item response theory is item parameter invariance. Sometimes, however, item parameters are not invariant across different test administrations due to factors other than sampling error; this phenomenon is termed item parameter drift. Several methods have been developed to detect drifted items. However, most of the…

Descriptors: Item Response Theory, Test Items, Evaluation Methods, Equated Scores

Asymptotic Standard Errors of Observed-Score Equating with Polytomous IRT Models

Peer reviewed

Direct link

Andersson, Björn – Journal of Educational Measurement, 2016

In observed-score equipercentile equating, the goal is to make scores on two scales or tests measuring the same construct comparable by matching the percentiles of the respective score distributions. If the tests consist of different items with multiple categories for each item, a suitable model for the responses is a polytomous item response…

Descriptors: Equated Scores, Item Response Theory, Error of Measurement, Tests

Anchor Selection Strategies for DIF Analysis: Review, Assessment, and New Approaches

Peer reviewed

Direct link

Kopf, Julia; Zeileis, Achim; Strobl, Carolin – Educational and Psychological Measurement, 2015

Differential item functioning (DIF) indicates the violation of the invariance assumption, for instance, in models based on item response theory (IRT). For item-wise DIF analysis using IRT, a common metric for the item parameters of the groups that are to be compared (e.g., for the reference and the focal group) is necessary. In the Rasch model,…

Descriptors: Test Items, Equated Scores, Test Bias, Item Response Theory

Psychometric Consequences of Subpopulation Item Parameter Drift

Peer reviewed

Direct link

Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2017

This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…

Descriptors: Psychometrics, Test Items, Item Response Theory, Hypothesis Testing

Effect of Item Response Theory (IRT) Model Selection on Testlet-Based Test Equating. Research Report. ETS RR-14-19

Peer reviewed
PDF on ERIC

Download full text

Cao, Yi; Lu, Ru; Tao, Wei – ETS Research Report Series, 2014

The local item independence assumption underlying traditional item response theory (IRT) models is often not met for tests composed of testlets. There are 3 major approaches to addressing this issue: (a) ignore the violation and use a dichotomous IRT model (e.g., the 2-parameter logistic [2PL] model), (b) combine the interdependent items to form a…

Descriptors: Item Response Theory, Equated Scores, Test Items, Simulation

The Effect of Anchor Test Construction on Scale Drift

Peer reviewed

Direct link

Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014

In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…

Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory

Robust Scale Transformation Methods in IRT True Score Equating under Common-Item Nonequivalent Groups Design

Direct link

He, Yong – ProQuest LLC, 2013

Common test items play an important role in equating multiple test forms under the common-item nonequivalent groups design. Inconsistent item parameter estimates among common items can lead to large bias in equated scores for IRT true score equating. Current methods extensively focus on detection and elimination of outlying common items, which…

Descriptors: Test Items, Regression (Statistics), Simulation, Comparative Analysis

Examining the Impact of Drifted Polytomous Anchor Items on Test Characteristic Curve (TCC) Linking and IRT True Score Equating. Research Report. ETS RR-12-09

Peer reviewed
PDF on ERIC

Download full text

Li, Yanmei – ETS Research Report Series, 2012

In a common-item (anchor) equating design, the common items should be evaluated for item parameter drift. Drifted items are often removed. For a test that contains mostly dichotomous items and only a small number of polytomous items, removing some drifted polytomous anchor items may result in anchor sets that no longer resemble mini-versions of…

Descriptors: Scores, Item Response Theory, Equated Scores, Simulation

Exploring Alternative Test Form Linking Designs with Modified Equating Sample Size and Anchor Test Length. Research Report. ETS RR-13-02

Peer reviewed
PDF on ERIC

Download full text

Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013

The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…

Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation

Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

Direct link

Wang, Wei – ProQuest LLC, 2013

Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

Descriptors: Equated Scores, Test Format, Test Items, Test Length

The Effects of Anchor Length, Test Difficulty, Population Ability Differences, Mixture of Populations and Sample Size on the Psychometric Properties of Levine Observed Score Linear Equating Method for Different Assumptions

Direct link

Carvajal-Espinoza, Jorge E. – ProQuest LLC, 2011

The Non-Equivalent groups with Anchor Test equating (NEAT) design is a widely used equating design in large scale testing that involves two groups that do not have to be of equal ability. One group P gets form X and a group of items A and the other group Q gets form Y and the same group of items A. One of the most commonly used equating methods in…

Descriptors: Sample Size, Equated Scores, Psychometrics, Measurement

Conditions Affecting the Accuracy of Classical Equating Methods for Small Samples under the NEAT Design: A Simulation Study

Direct link

Sunnassee, Devdass – ProQuest LLC, 2011

Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically…

Descriptors: Test Length, Test Format, Sample Size, Simulation

Previous Page | Next Page »

Pages: 1 | 2 | 3

Antal, Judit	2
Holland, Paul	2
Sinharay, Sandip	2
Andersson, Björn	1
Ban, Jae-Chun	1
Beretvas, S. Natasha	1
Beyerlein, Michael M.	1
Bogan, Evelyn Doody	1
Boldt, R. F.	1
Bolt, Daniel M.	1
Boughton, Keith	1
Bramley, Tom	1
Cao, Yi	1
Carvajal-Espinoza, Jorge E.	1
Casabianca, Jodi	1
Chang, Hua-Hua	1
Curry, Allen R.	1
Davey, T. C.	1
Fitzpatrick, Joseph	1
Grant, Mary C.	1
Guo, Rui	1
He, Yong	1
Hidalgo-Montesinos, Maria…	1
Holland, Paul W.	1
More ▼