ERIC - Search Results

Showing 1 to 15 of 25 results

Practical Considerations in Choosing an Anchor Test Form for Equating under the Random Groups Design

Peer reviewed

Cui, Zhongmin; He, Yong – Measurement: Interdisciplinary Research and Perspectives, 2023

Careful considerations are necessary when there is a need to choose an anchor test form from a list of old test forms for equating under the random groups design. The choice of the anchor form potentially affects the accuracy of equated scores on new test forms. Few guidelines, however, can be found in the literature on choosing the anchor form.…

Descriptors: Test Format, Equated Scores, Best Practices, Test Construction

Impacts of Differences in Group Abilities and Anchor Test Features on Three Non-IRT Test Equating Methods

Peer reviewed

Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024

The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, Poststratification kernel equating, and Circle-arc equating were studied. A college admissions test with four different…

Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests

Equating with Small and Unbalanced Samples

Peer reviewed

Goodman, Joshua T.; Dallas, Andrew D.; Fan, Fen – Applied Measurement in Education, 2020

Recent research has suggested that re-setting the standard for each administration of a small sample examination, in addition to the high cost, does not adequately maintain similar performance expectations year after year. Small-sample equating methods have shown promise with samples between 20 and 30. For groups that have fewer than 20 students,…

Descriptors: Equated Scores, Sample Size, Sampling, Weighted Scores

Comparing Small-Sample Equating with Angoff Judgement for Linking Cut-Scores on Two Tests

Bramley, Tom – Research Matters, 2020

The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…

Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy

Asymptotic Standard Errors of Observed-Score Equating with Polytomous IRT Models

Peer reviewed

Andersson, Björn – Journal of Educational Measurement, 2016

In observed-score equipercentile equating, the goal is to make scores on two scales or tests measuring the same construct comparable by matching the percentiles of the respective score distributions. If the tests consist of different items with multiple categories for each item, a suitable model for the responses is a polytomous item response…

Descriptors: Equated Scores, Item Response Theory, Error of Measurement, Tests

The Effect of Anchor Test Construction on Scale Drift

Peer reviewed

Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014

In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…

Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory

Exploring Alternative Test Form Linking Designs with Modified Equating Sample Size and Anchor Test Length. Research Report. ETS RR-13-02

Peer reviewed

Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013

The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…

Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation

Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

Wang, Wei – ProQuest LLC, 2013

Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

Descriptors: Equated Scores, Test Format, Test Items, Test Length

Observed-Score Equating with a Heterogeneous Target Population

Peer reviewed

Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012

Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…

Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis

Assessing First- and Second-Order Equity for the Common-Item Nonequivalent Groups Design Using Multidimensional IRT

Andrews, Benjamin James – ProQuest LLC, 2011

The equity properties can be used to assess the quality of an equating. The degree to which expected scores conditional on ability are similar between test forms is referred to as first-order equity. Second-order equity is the degree to which conditional standard errors of measurement are similar between test forms after equating. The purpose of…

Descriptors: Test Format, Advanced Placement, Simulation, True Scores

Standard Errors of Equating for the Percentile Rank-Based Equipercentile Equating with Log-Linear Presmoothing

Peer reviewed

Wang, Tianyou – Journal of Educational and Behavioral Statistics, 2009

Holland and colleagues derived a formula for analytical standard error of equating using the delta-method for the kernel equating method. Extending their derivation, this article derives an analytical standard error of equating procedure for the conventional percentile rank-based equipercentile equating with log-linear smoothing. This procedure is…

Descriptors: Error of Measurement, Equated Scores, Statistical Analysis, Statistical Inference

Comparison of the Effects of Discrete Anchor Items and Passage-Based Anchor Items on Observed-Score Equating Results. Research Report. ETS RR-09-44

Peer reviewed

Zu, Jiyun; Liu, Jinghua – ETS Research Report Series, 2009

Equating of tests composed of both discrete and passage-based items using the nonequivalent groups with anchor test (NEAT) design is popular in practice. This study investigated the impact of discrete anchor items and passage-based anchor items on observed score equating via simulation. Results suggested that an anchor with a larger proportion of…

Descriptors: Comparative Analysis, Equated Scores, Test Items, Simulation

Using the Kernel Method of Test Equating for Estimating the Standard Errors of Population Invariance Measures

Peer reviewed

Moses, Tim – Journal of Educational and Behavioral Statistics, 2008

Equating functions are supposed to be population invariant, meaning that the choice of subpopulation used to compute the equating function should not matter. The extent to which equating functions are population invariant is typically assessed in terms of practical difference criteria that do not account for equating functions' sampling…

Descriptors: Equated Scores, Error of Measurement, Sampling, Evaluation Methods

Kernel and Traditional Equipercentile Equating with Degrees of Presmoothing. Research Report. ETS RR-07-15

Peer reviewed

Moses, Tim; Holland, Paul – ETS Research Report Series, 2007

The purpose of this study was to empirically evaluate the impact of loglinear presmoothing accuracy on equating bias and variability across chained and post-stratification equating methods, kernel and percentile-rank continuization methods, and sample sizes. The results of evaluating presmoothing on equating accuracy generally agreed with those of…

Descriptors: Equated Scores, Statistical Analysis, Accuracy, Sample Size

The Effectiveness of Circular Equating as a Criterion for Evaluating Equating.

Peer reviewed

Wang, Tianyou; Hanson, Bradley A.; Harris, Deborah J. – Applied Psychological Measurement, 2000

Studied whether circular equating could provide an adequate measure of various types of equating error when applied to different equating methods under different equating designs. Analyses and simulations show that circular equating is generally invalid as a criterion to evaluate the adequacy of equating. (SLD)

Descriptors: Criteria, Equated Scores, Error of Measurement, Evaluation Methods
