ERIC - Search Results

Showing 1 to 15 of 19 results

Evaluating Population Invariance of Test Equating during the COVID-19 Pandemic

Peer reviewed

Li, Dongmei; Kapoor, Shalini – Educational Measurement: Issues and Practice, 2022

Population invariance is a desirable property of test equating which might not hold when significant changes occur in the test population, such as those brought about by the COVID-19 pandemic. This research aims to investigate whether equating functions are reasonably invariant when the test population is impacted by the pandemic. Based on…

Descriptors: Test Items, Equated Scores, COVID-19, Pandemics

Equating with Small and Unbalanced Samples

Peer reviewed

Goodman, Joshua T.; Dallas, Andrew D.; Fan, Fen – Applied Measurement in Education, 2020

Recent research has suggested that re-setting the standard for each administration of a small sample examination, in addition to the high cost, does not adequately maintain similar performance expectations year after year. Small-sample equating methods have shown promise with samples between 20 and 30. For groups that have fewer than 20 students,…

Descriptors: Equated Scores, Sample Size, Sampling, Weighted Scores

Comparing Small-Sample Equating with Angoff Judgement for Linking Cut-Scores on Two Tests

Bramley, Tom – Research Matters, 2020

The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…

Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy

Investigating Repeater Effects on Small Sample Equating: Include or Exclude?

Peer reviewed

Diao, Hongyu; Keller, Lisa – Applied Measurement in Education, 2020

Examinees who attempt the same test multiple times are often referred to as "repeaters." Previous studies suggested that repeaters should be excluded from the total sample before equating because repeater groups are distinguishable from non-repeater groups. In addition, repeaters might memorize anchor items, causing item drift under a…

Descriptors: Licensing Examinations (Professions), College Entrance Examinations, Repetition, Testing Problems

A General Linear Method for Equating with Small Samples

Peer reviewed

Albano, Anthony D. – Journal of Educational Measurement, 2015

Research on equating with small samples has shown that methods with stronger assumptions and fewer statistical estimates can lead to decreased error in the estimated equating function. This article introduces a new approach to linear observed-score equating, one which provides flexible control over how form difficulty is assumed versus estimated…

Descriptors: Equated Scores, Sample Size, Sampling, Statistical Inference

Use of Jackknifing to Evaluate Effects of Anchor Item Selection on Equating with the Nonequivalent Groups with Anchor Test (NEAT) Design. Research Report. ETS RR-15-10

Peer reviewed

Lu, Ru; Haberman, Shelby; Guo, Hongwen; Liu, Jinghua – ETS Research Report Series, 2015

In this study, we apply jackknifing to anchor items to evaluate the impact of anchor selection on equating stability. In an ideal world, the choice of anchor items should have little impact on equating results. When this ideal does not correspond to reality, selection of anchor items can strongly influence equating results. This influence does not…

Descriptors: Test Construction, Equated Scores, Test Items, Sampling

Selection of Common Items as an Unrecognized Source of Variability in Test Equating: A Bootstrap Approximation Assuming Random Sampling of Common Items

Peer reviewed

Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014

The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…

Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference

A Comparison of Four Linear Equating Methods for the Common-Item Nonequivalent Groups Design Using Simulation Methods. ACT Research Report Series, 2013 (2)

Topczewski, Anna; Cui, Zhongmin; Woodruff, David; Chen, Hanwei; Fang, Yu – ACT, Inc., 2013

This paper investigates four methods of linear equating under the common item nonequivalent groups design. Three of the methods are well known: Tucker, Angoff-Levine, and Congeneric-Levine. A fourth method is presented as a variant of the Congeneric-Levine method. Using simulation data generated from the three-parameter logistic IRT model we…

Descriptors: Comparative Analysis, Equated Scores, Methods, Simulation

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

An Empirical Comparison of Methods for Equating with Randomly Equivalent Groups of 50 to 400 Test Takers. Research Report. ETS RR-10-05

Peer reviewed

Livingston, Samuel A.; Kim, Sooyeon – ETS Research Report Series, 2010

A series of resampling studies investigated the accuracy of equating by four different methods in a random groups equating design with samples of 400, 200, 100, and 50 test takers taking each form. Six pairs of forms were constructed. Each pair was constructed by assigning items from an existing test taken by 9,000 or more test takers. The…

Descriptors: Equated Scores, Accuracy, Sample Size, Sampling

Conditions Affecting the Accuracy of Classical Equating Methods for Small Samples under the NEAT Design: A Simulation Study

Sunnassee, Devdass – ProQuest LLC, 2011

Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically…

Descriptors: Test Length, Test Format, Sample Size, Simulation

Impact of Inclusion or Exclusion of Repeaters on Test Equating

Peer reviewed

Puhan, Gautam – International Journal of Testing, 2011

This study examined the effect of including or excluding repeaters on the equating process and results. New forms of two tests were equated to their respective old forms using either all examinees or only the first timer examinees in the new form sample. Results showed that for both tests used in this study, including or excluding repeaters in the…

Descriptors: Equated Scores, Educational Testing, Student Evaluation, Sample Size

Methods of Linking with Small Samples in a Common-Item Design: An Empirical Comparison. Research Report. ETS RR-09-38

Peer reviewed

Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2009

A series of resampling studies was conducted to compare the accuracy of equating in a common item design using four different methods: chained equipercentile equating of smoothed distributions, chained linear equating, chained mean equating, and the circle-arc method. Four operational test forms, each containing more than 100 items, were used for…

Descriptors: Sampling, Sample Size, Accuracy, Test Items

Sample Selection Effect on AP Multiple-Choice Score to Composite Score Scaling.

Yang, Wen-Ling; Dorans, Neil J.; Tateneni, Krishna – 2002

Scores on the multiple-choice sections of alternate forms are equated through anchor-test equating for the Advanced Placement Program (AP) examinations. There is no linkage of free-response sections since different free-response items are given yearly. However, the free-response and multiple-choice sections are combined to produce a composite.…

Descriptors: Cutting Scores, Equated Scores, Multiple Choice Tests, Sample Size

Comparison of Four Procedures for Equating the Tests of General Educational Development.

Peer reviewed

Kolen, Michael J.; Whitney, Douglas R. – Journal of Educational Measurement, 1982

The adequacy of equipercentile, linear, one-parameter (Rasch), and three-parameter logistic item-response theory procedures for equating 12 forms of five tests of general educational development was compared. Results indicated the equating method adequacy depends on a variety of factors such as test characteristics, equating design, and sample…

Descriptors: Achievement Tests, Comparative Analysis, Equated Scores, Equivalency Tests
