Goodman, Joshua T.; Dallas, Andrew D.; Fan, Fen – Applied Measurement in Education, 2020
Recent research has suggested that re-setting the standard for each administration of a small sample examination, in addition to the high cost, does not adequately maintain similar performance expectations year after year. Small-sample equating methods have shown promise with samples between 20 and 30. For groups that have fewer than 20 students,…
Descriptors: Equated Scores, Sample Size, Sampling, Weighted Scores
Bramley, Tom – Research Matters, 2020
The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…
Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy
Topczewski, Anna; Cui, Zhongmin; Woodruff, David; Chen, Hanwei; Fang, Yu – ACT, Inc., 2013
This paper investigates four methods of linear equating under the common item nonequivalent groups design. Three of the methods are well known: Tucker, Angoff-Levine, and Congeneric-Levine. A fourth method is presented as a variant of the Congeneric-Levine method. Using simulation data generated from the three-parameter logistic IRT model we…
Descriptors: Comparative Analysis, Equated Scores, Methods, Simulation
Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012
Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…
Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis
Sunnassee, Devdass – ProQuest LLC, 2011
Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically…
Descriptors: Test Length, Test Format, Sample Size, Simulation
Moses, Tim – Journal of Educational and Behavioral Statistics, 2008
Equating functions are supposed to be population invariant, meaning that the choice of subpopulation used to compute the equating function should not matter. The extent to which equating functions are population invariant is typically assessed in terms of practical difference criteria that do not account for equating functions' sampling…
Descriptors: Equated Scores, Error of Measurement, Sampling, Evaluation Methods
Dorans, Neil J.; Liu, Jinghua; Hammond, Shelby – Applied Psychological Measurement, 2008
This exploratory study was built on research spanning three decades. Petersen, Marco, and Stewart (1982) conducted a major empirical investigation of the efficacy of different equating methods. The studies reported in Dorans (1990) examined how different equating methods performed across samples selected in different ways. Recent population…
Descriptors: Test Format, Equated Scores, Sampling, Evaluation Methods

Baker, Frank B. – Applied Psychological Measurement, 1997
Examined the sampling distributions of equating coefficients produced by the characteristic curve method for tests using graded and nominal response scoring using simulated data. For both models and across all three equating situations, the sampling distributions were generally bell-shaped and peaked, and occasionally had a small degree of…
Descriptors: Equated Scores, Sampling, Simulation, Statistical Distributions
Wang, Tianyou; Hanson, Bradley A.; Harris, Deborah J. – 1998
Equating a test form to itself through a chain of equatings, commonly referred to as circular equating, has been widely used as a criterion to evaluate the adequacy of equating. This paper uses both analytical methods and simulation methods to show that this criterion is in general invalid in serving this purpose. For the random groups design done…
Descriptors: Equated Scores, Evaluation Methods, Heuristics, Sampling

Eignor, Daniel R.; And Others – Applied Measurement in Education, 1990
Two independent replications of a sequence of simulations were conducted to aid in the diagnosis and interpretation of equating differences found between representative (random) and matched (nonrandom) samples for three commonly used conventional observed-score equating procedures and one item-response-theory-based equating procedure. (SLD)
Descriptors: Equated Scores, Item Response Theory, Sampling, Simulation
Eignor, Daniel R.; And Others – 1995
Two recent simulation studies were conducted to aid in the diagnosis and interpretation of equating differences found between random and matched (nonrandom) samples for four commonly used equating procedures: (1) Tucker; (2) Levine equally reliable; (3) Chained equipercentile observed-score; and (4) three-parameter, item response theory true-score…
Descriptors: Criteria, Equated Scores, Item Response Theory, Raw Scores
Kolen, Michael J. – 1984
Large sample standard errors for the Tucker method of linear equating under the common item nonrandom groups design are derived under normality assumptions as well as under less restrictive assumptions. Standard errors of Tucker equating are estimated using the bootstrap method described by Efron. The results from different methods are compared…
Descriptors: Certification, Comparative Analysis, Equated Scores, Error of Measurement
Fairbank, Benjamin A., Jr. – 1985
The effectiveness of 19 methods of smoothing was investigated as those methods apply to the equipercentile method of test equating. Seven methods involved smoothing the score distribution before the tests were equated (presmoothing). Seven involved smoothing the resultant points after the equating (postsmoothing). Five methods involved combining…
Descriptors: Adults, Equated Scores, Equations (Mathematics), Error of Measurement
Ree, Malcolm James; Jensen, Harald E. – 1980
By means of computer simulation of test responses, the reliability of item analysis data and the accuracy of equating were examined for hypothetical samples of 250, 500, 1000, and 2000 subjects for two tests with 20 equating items plus 60 additional items on the same scale. Birnbaum's three-parameter logistic model was used for the simulation. The…
Descriptors: Computer Assisted Testing, Equated Scores, Error of Measurement, Item Analysis

Gustafsson, Jan-Eric – Journal of Educational Measurement, 1979
Computer generated data are used to show that Slinde and Linn's criticism of the usefulness of the Rasch model for equating (EJ 189 585) may have been the result of an artifact produced by the manner in which the samples were chosen in their study. (CTM)
Descriptors: Achievement Tests, Bias, College Entrance Examinations, Equated Scores