Publication Date

| Date range | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 7 |
| Since 2022 (last 5 years) | 42 |
| Since 2017 (last 10 years) | 126 |
| Since 2007 (last 20 years) | 479 |
Author

| Author | Count |
| --- | --- |
| Bianchini, John C. | 35 |
| von Davier, Alina A. | 34 |
| Dorans, Neil J. | 33 |
| Kolen, Michael J. | 31 |
| Loret, Peter G. | 31 |
| Kim, Sooyeon | 26 |
| Moses, Tim | 24 |
| Livingston, Samuel A. | 22 |
| Holland, Paul W. | 20 |
| Puhan, Gautam | 20 |
| Liu, Jinghua | 19 |
Location

| Location | Count |
| --- | --- |
| Canada | 9 |
| Australia | 8 |
| Florida | 8 |
| United Kingdom (England) | 8 |
| Netherlands | 7 |
| New York | 7 |
| United States | 7 |
| Israel | 6 |
| Turkey | 6 |
| United Kingdom | 6 |
| California | 5 |
Laws, Policies, & Programs

| Law / Program | Count |
| --- | --- |
| Elementary and Secondary… | 12 |
| No Child Left Behind Act 2001 | 5 |
| Education Consolidation… | 3 |
| Hawkins Stafford Act 1988 | 1 |
| Race to the Top | 1 |
What Works Clearinghouse Rating

| Rating | Count |
| --- | --- |
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
Li, Yuan H.; Lissitz, Robert W.; Yang, Yu Nu – 1999
Recent years have seen growing use of tests with mixed item formats, e.g., tests containing both dichotomously scored and polytomously scored items. A characteristic curve method (CCM) that matches two test characteristic curves to place these mixed-format items on the same metric is described and evaluated in this paper under a common-item…
Descriptors: Equated Scores, Estimation (Mathematics), Item Response Theory, Test Format
Yang, Wen-Ling; Dorans, Neil J.; Tateneni, Krishna – 2002
Scores on the multiple-choice sections of alternate forms are equated through anchor-test equating for the Advanced Placement Program (AP) examinations. There is no linkage of free-response sections since different free-response items are given yearly. However, the free-response and multiple-choice sections are combined to produce a composite.…
Descriptors: Cutting Scores, Equated Scores, Multiple Choice Tests, Sample Size
Peer reviewed: Holmes, Susan E. – Journal of Educational Measurement, 1982
Two tests were created from a standardized reading achievement test and vertically equated using a sample of third and fourth grade students. Based on differences in ability estimates for the same student, the Rasch model did not provide a satisfactory means of vertical equating. (Author/CM)
Descriptors: Achievement Tests, Elementary Education, Equated Scores, Latent Trait Theory
Peer reviewed: Kagan, Dona M.; Stock, William A. – Journal of Experimental Education, 1980
Graduate Record Examination and Miller Analogies Test scores were equated using linear transformation and regression methods. Standard deviations of regression equivalence scores were consistently smaller than those actually obtained in the sample, whereas standard deviations of linear equivalence scores were the same as those in the sample.…
Descriptors: Correlation, Equated Scores, Graduate Students, Higher Education
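The contrast Kagan and Stock report follows directly from the two methods' definitions: linear equating rescales to match the target test's mean and standard deviation, while regression-based equivalents shrink the spread by the correlation between the tests. A minimal sketch with simulated data (the actual study used GRE and Miller Analogies Test scores; these numbers are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scores on two tests for the same examinees.
x = rng.normal(500, 100, size=2000)  # test X
# Test Y correlates about 0.8 with X (signal sd 12, noise sd 9).
y = 0.8 * (x - 500) / 100 * 15 + 60 + rng.normal(0, 9, size=2000)

# Linear equating: y* = mean(Y) + sd(Y)/sd(X) * (x - mean(X)).
lin = y.mean() + y.std() / x.std() * (x - x.mean())

# Regression "equating": least-squares prediction of Y from X.
b = np.cov(x, y)[0, 1] / x.var()
reg = y.mean() + b * (x - x.mean())

# Linear equivalents reproduce the SD of Y exactly; regression
# equivalents shrink it by roughly the test correlation.
print(round(lin.std() / y.std(), 3))  # ~1.0
print(round(reg.std() / y.std(), 3))  # ~0.8
```

This is why the abstract reports regression-equivalence standard deviations consistently smaller than the sample's, and linear-equivalence standard deviations matching it.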
Peer reviewed: Miller, G. Edward; Beretvas, S. Natasha – Journal of Applied Measurement, 2002
Presents empirically based item selection guidelines for moving the cut score on equated tests consisting of "n" dichotomous items calibrated assuming the Rasch model. Derivations of lemmas that underlie the guidelines are provided as well as a simulated example. (SLD)
Descriptors: Cutting Scores, Equated Scores, Item Response Theory, Selection
Peer reviewed: Han, Tianqi; And Others – Applied Measurement in Education, 1997
Stability among equating procedures was studied by comparing item response theory (IRT) true-score equating with IRT observed-score equating, IRT true-score equating with equipercentile equating, and IRT observed-score equating with equipercentile equating. On average, IRT true-score equating more frequently produced more stable conversions. (SLD)
Descriptors: Comparative Analysis, Equated Scores, Item Response Theory, Raw Scores
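One of the conventional methods Han et al. compare, equipercentile equating, maps each score on form X to the form Y score with the same percentile rank. A sketch under hypothetical binomial score distributions (not the study's data or code):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical raw scores on two 40-item forms; form Y is slightly harder.
x = rng.binomial(40, 0.60, size=5000)  # form X
y = rng.binomial(40, 0.55, size=5000)  # form Y

def percentile_rank(scores, s):
    """Percentile rank of integer score s (midpoint convention)."""
    below = np.mean(scores < s)
    at = np.mean(scores == s)
    return below + at / 2

def equipercentile(s):
    """Form Y equivalent of form X score s: same percentile rank."""
    p = percentile_rank(x, s)
    return np.quantile(y, p)

# An average score on the easier form X maps to a lower form Y score.
print(equipercentile(24))
```

IRT true-score equating instead equates through the test characteristic curves of the two forms; the study's finding is that the IRT approach tended to give more stable conversions across samples.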
Peer reviewed: Lissitz, Robert W.; Huynh, Huynh – Practical Assessment, Research & Evaluation, 2003
Describes the concept of "adequate yearly progress" as defined by the No Child Left Behind Act and discusses some of the psychometric issues it raises. Examines scaling as a means to equate tests when assessing educational gains and recommends vertically moderated standards over vertical equating of state assessments. (SLD)
Descriptors: Accountability, Achievement Gains, Equated Scores, Psychometrics
Peer reviewed: Eignor, Daniel R.; And Others – Applied Measurement in Education, 1990
Two independent replications of a sequence of simulations were conducted to aid in the diagnosis and interpretation of equating differences found between representative (random) and matched (nonrandom) samples for three commonly used conventional observed-score equating procedures and one item-response-theory-based equating procedure. (SLD)
Descriptors: Equated Scores, Item Response Theory, Sampling, Simulation
Peer reviewed: MacCann, Robert G. – Educational and Psychological Measurement, 1989
Levine's equations for random groups and unequally reliable tests can be used to equate two tests through performance on an anchor test. Levine's parallelism assumption is not necessary; it is sufficient to assume only that the tests are congeneric, an assumption implicit in linear test equating. (SLD)
Descriptors: Equated Scores, Equations (Mathematics), Latent Trait Theory, Test Reliability
Peer reviewed: Baker, Frank B. – Applied Psychological Measurement, 1992
The procedure of M.L. Stocking and F.M. Lord (1983) for computing equating coefficients for tests having dichotomously scored items is extended to the case of graded response items. A system of equations for obtaining the equating coefficients under the graded response model is derived. (SLD)
Descriptors: Equated Scores, Equations (Mathematics), Item Response Theory, Mathematical Models
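The dichotomous case Baker extends works by choosing equating coefficients A and B so that the test characteristic curve (TCC) computed from the rescaled item parameters matches the target TCC. A sketch for 2PL items with made-up parameters, using a crude grid search in place of the derivative-based optimizer real implementations use (Baker's graded-response extension replaces the single ICC with category response functions):

```python
import numpy as np

# Hypothetical 2PL item parameters on the old scale.
a_old = np.array([1.0, 1.5, 0.8])   # discriminations
b_old = np.array([-0.5, 0.0, 1.0])  # difficulties

# The same items calibrated on a new scale related by
# theta_old = A*theta_new + B, so a_new = a_old*A, b_new = (b_old - B)/A.
A_true, B_true = 1.2, 0.3
a_new = a_old * A_true
b_new = (b_old - B_true) / A_true

theta = np.linspace(-4, 4, 81)

def tcc(a, b, t):
    """Test characteristic curve: expected number-right at each theta."""
    p = 1 / (1 + np.exp(-a[:, None] * (t[None, :] - b[:, None])))
    return p.sum(axis=0)

target = tcc(a_old, b_old, theta)

def loss(A, B):
    """Stocking-Lord criterion: squared TCC difference after rescaling."""
    return np.sum((target - tcc(a_new / A, A * b_new + B, theta)) ** 2)

# Grid search over candidate equating coefficients.
As = np.linspace(0.5, 2.0, 151)
Bs = np.linspace(-1.0, 1.0, 201)
best = min((loss(A, B), A, B) for A in As for B in Bs)
print(best[1], best[2])  # recovers A near 1.2, B near 0.3
```

Minimizing the TCC difference rather than matching item parameters directly is what distinguishes the Stocking–Lord approach from mean/mean or mean/sigma linking.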
Peer reviewed: Bolt, Daniel M. – Applied Measurement in Education, 1999
Examined whether the item response theory (IRT) true-score equating method is more adversely affected by the presence of multidimensionality than two conventional equating methods, linear and equipercentile equating. Results of two simulation studies suggest that the IRT method performs as well as the conventional methods when the correlation…
Descriptors: Correlation, Equated Scores, Item Response Theory, Simulation
Peer reviewed: Tsai, Tsung-Hsun; Hanson, Bradley A.; Kolen, Michael J.; Forsyth, Robert A. – Applied Measurement in Education, 2001
Compared bootstrap standard errors of five item response theory (IRT) equating methods for the common-item nonequivalent groups design using test results for 1,493 and 1,793 examinees taking a professional certification test. Results suggest that standard errors of equating less than 0.1 standard deviation units could be obtained with any of the…
Descriptors: Equated Scores, Error of Measurement, Item Response Theory, Licensing Examinations (Professions)
Kim, Dong-In; Brennan, Robert; Kolen, Michael – Journal of Educational Measurement, 2005
Four equating methods (3PL true score equating, 3PL observed score equating, beta 4 true score equating, and beta 4 observed score equating) were compared using four equating criteria: first-order equity (FOE), second-order equity (SOE), conditional-mean-squared-error (CMSE) difference, and the equipercentile equating property. True score…
Descriptors: True Scores, Psychometrics, Equated Scores, Item Response Theory
Yin, Ping; Brennan, Robert L.; Kolen, Michael J. – Applied Psychological Measurement, 2004
The use of educational tests in practical situations often requires that tests that are built to different content and/or statistical specifications be statistically linked. When equating methods are applied to tests that differ in content, difficulty, and/or reliability, the resulting scores cannot be used interchangeably. This study examined…
Descriptors: Indexes, Equated Scores, Standardized Tests, Achievement Tests
Moses, Tim; Kim, Sooyeon – ETS Research Report Series, 2007
This study evaluated the impact of unequal reliability on test equating methods in the nonequivalent groups with anchor test (NEAT) design. Classical true score-based models were compared in terms of their assumptions about how reliability impacts test scores. These models were related to treatment of population ability differences by different…
Descriptors: Reliability, Equated Scores, Test Items, Statistical Analysis
