Ogasawara, Haruhiko – Applied Psychological Measurement, 2001
Derived asymptotic standard errors (SEs) of item response theory equating coefficient estimates using response functions or their transformations. Presents two variations of the item and test response function methods and SEs of their parameter estimates that use logit transformation of the item response functions. Numerical examples show that the…
Descriptors: Equated Scores, Error of Measurement, Item Response Theory

Zeng, Lingjia; Kolen, Michael J. – Applied Psychological Measurement, 1995
An alternative approach for item response theory observed-score equating is described. Number-correct score distributions are integrated over theoretical or empirical distributions of examinees' estimated trait levels. Comparisons with alternative methods are made. The method can be implemented without the need to estimate trait level for…
Descriptors: Equated Scores, Estimation (Mathematics), Item Response Theory, Scoring
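The integration step in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes a 2PL model, uses the well-known Lord-Wingersky recursion for the conditional number-correct distribution, and averages over a small discrete trait distribution. The item parameters, trait points, and weights are hypothetical values chosen for the example.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response (logistic metric, no 1.7 constant)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def score_dist_given_theta(theta, items):
    """Lord-Wingersky recursion: P(number-correct = x | theta) for x = 0..n."""
    dist = [1.0]  # with zero items, a score of 0 has probability 1
    for a, b in items:
        p = p_correct(theta, a, b)
        new = [0.0] * (len(dist) + 1)
        for x, mass in enumerate(dist):
            new[x] += mass * (1.0 - p)   # item answered incorrectly
            new[x + 1] += mass * p       # item answered correctly
        dist = new
    return dist

def marginal_score_dist(items, thetas, weights):
    """Average the conditional score distributions over a discrete trait distribution."""
    total = sum(weights)
    marginal = [0.0] * (len(items) + 1)
    for theta, w in zip(thetas, weights):
        cond = score_dist_given_theta(theta, items)
        for x, mass in enumerate(cond):
            marginal[x] += (w / total) * mass
    return marginal

# Hypothetical (a, b) item parameters and a three-point trait distribution.
items = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.7)]
thetas = [-1.0, 0.0, 1.0]
weights = [0.25, 0.5, 0.25]
dist = marginal_score_dist(items, thetas, weights)
```

The resulting marginal distributions for two forms could then be equated with an observed-score method such as equipercentile equating; that step is not shown here.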
Moses, Tim; Kim, Sooyeon – ETS Research Report Series, 2007
This study evaluated the impact of unequal reliability on test equating methods in the nonequivalent groups with anchor test (NEAT) design. Classical true score-based models were compared in terms of their assumptions about how reliability impacts test scores. These models were related to treatment of population ability differences by different…
Descriptors: Reliability, Equated Scores, Test Items, Statistical Analysis
Liao, Chi-Wen; Livingston, Samuel A. – ETS Research Report Series, 2008
Randomly equivalent forms (REF) of tests in listening and reading for nonnative speakers of English were created by stratified random assignment of items to forms, stratifying on item content and predicted difficulty. The study included 50 replications of the procedure for each test. Each replication generated 2 REFs. The equivalence of those 2…
Descriptors: Equated Scores, Item Analysis, Test Items, Difficulty Level
von Davier, Alina A.; Holland, Paul W.; Livingston, Samuel A.; Casabianca, Jodi; Grant, Mary C.; Martin, Kathleen – ETS Research Report Series, 2006
This study examines how closely the kernel equating (KE) method (von Davier, Holland, & Thayer, 2004a) approximates the results of other observed-score equating methods--equipercentile and linear equatings. The study used pseudotests constructed of item responses from a real test to simulate three equating designs: an equivalent groups (EG)…
Descriptors: Equated Scores, Statistical Analysis, Simulation, Tests
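As a rough illustration of the continuization step that kernel equating relies on, the sketch below evaluates a Gaussian-kernel-smoothed CDF of a discrete score distribution. The abstract does not give the formula; the form used here is the commonly cited Gaussian kernel version, which preserves the mean and variance of the discrete distribution. The score points, probabilities, and bandwidth `h` are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kernel_cdf(x, scores, probs, h):
    """Gaussian-kernel continuized CDF of a discrete score distribution.

    Assumed form: F_h(x) = sum_j r_j * Phi((x - a*x_j - (1 - a)*mu) / (a*h)),
    with a = sqrt(var / (var + h^2)), so mean and variance are preserved.
    """
    mu = sum(s * p for s, p in zip(scores, probs))
    var = sum(p * (s - mu) ** 2 for s, p in zip(scores, probs))
    a = math.sqrt(var / (var + h * h))
    return sum(
        p * normal_cdf((x - a * s - (1.0 - a) * mu) / (a * h))
        for s, p in zip(scores, probs)
    )

# Hypothetical three-point score distribution and bandwidth.
scores = [0, 1, 2]
probs = [0.25, 0.5, 0.25]
h = 0.6
grid = [kernel_cdf(x / 10.0, scores, probs, h) for x in range(-30, 51)]
```

A small `h` keeps the continuized distribution close to the discrete one, which tends toward equipercentile-like results, while a very large `h` pushes it toward a normal distribution with the same mean and variance, which corresponds to linear equating.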
Mao, Xia; von Davier, Alina A.; Rupp, Stacie – ETS Research Report Series, 2006
Kernel equating (KE) is a new approach to observed-score equating and is described in detail in von Davier, Holland, and Thayer (2004b). Over the past months, several evaluation studies of KE have been designed and carried out. In this part of the overall evaluation study, we compared the KE method with other equating methods using real data from…
Descriptors: Licensing Examinations (Professions), Teacher Certification, Equated Scores, Statistical Analysis
Hanson, Bradley A.; Feinstein, Zachary S. – 1997
Loglinear and logit models that have been suggested for studying differential item functioning (DIF) are reviewed, and loglinear formulations of the logit models are given. A polynomial loglinear model for assessing DIF is introduced that incorporates scores on the matching variable and item responses. The polynomial loglinear model contains far…
Descriptors: Equated Scores, Item Bias, Scores, Test Construction
Lissitz, Robert W.; Huynh, Huynh – 2003
This paper addresses issues of vertical equating for the Arkansas Comprehensive Testing, Assessment and Accountability Program (ACTAAP) assessments as they relate to school accountability and determination of Adequate Yearly Progress (AYP) as required by the recent federal legislation, the No Child Left Behind Act. The paper first provides a brief…
Descriptors: Accountability, Equated Scores, Scaling, Standards

Kolen, Michael J. – Journal of Educational Measurement, 1981
Two traditional equating schemes and seven item response theory equating schemes were compared. Data used came from the 1978 equating project of the Iowa Tests of Educational Development. The study entailed both the equating of forms of similar difficulty and the equating of levels of differing difficulty. (Author/RL)
Descriptors: Comparative Analysis, Equated Scores, High Schools, Latent Trait Theory

Hanson, Bradley A. – Applied Measurement in Education, 1996
Whether score distributions differ on two or more test forms administered to samples of examinees from a single population is examined with three statistical tests based on loglinear models. Examples show how tests of distribution differences can be applied to decide whether equating is needed for alternative forms of a test. (SLD)
Descriptors: Equated Scores, Scoring, Statistical Distributions, Test Format

Davey, Tim; And Others – Applied Psychological Measurement, 1996
Scales defined by most item response theory (IRT) models are truly invariant with respect to certain linear transformations of parameters. The problem is to find the proper transformation to place calibrations on a common scale. This paper explores issues of extending and adapting unidimensional linking procedures to multidimensional IRT models.…
Descriptors: Equated Scores, Item Response Theory, Models, Scaling
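The invariance mentioned in the abstract above can be checked numerically in the unidimensional case. The sketch below assumes a 2PL model and uses the mean/sigma method on common-item difficulties as a simple stand-in for a linking procedure; it does not reproduce the multidimensional extensions the paper discusses. All parameter values are hypothetical.

```python
import math
from statistics import mean, pstdev

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def mean_sigma(b_from, b_to):
    """Mean/sigma linking: return (A, B) such that b_to is approximately A * b_from + B."""
    A = pstdev(b_to) / pstdev(b_from)
    B = mean(b_to) - A * mean(b_from)
    return A, B

def rescale_item(a, b, A, B):
    """Move item parameters onto the target scale."""
    return a / A, A * b + B

def rescale_theta(theta, A, B):
    """Move a trait value onto the target scale."""
    return A * theta + B

# Hypothetical common-item difficulties from two separate calibrations.
b_from = [-1.0, 0.0, 1.0]
b_to = [-1.5, 0.5, 2.5]
A, B = mean_sigma(b_from, b_to)

# Response probabilities are unchanged when theta, a, and b are transformed together.
theta, a, b = 0.3, 1.4, -0.2
a_new, b_new = rescale_item(a, b, A, B)
p_old = p_correct(theta, a, b)
p_new = p_correct(rescale_theta(theta, A, B), a_new, b_new)
```

Because `a_new * (theta_new - b_new)` equals `a * (theta - b)` algebraically, `p_old` and `p_new` agree up to floating-point error, which is why a separate linking step is needed to fix the scale.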

DeMars, Christine – Applied Measurement in Education, 2002
Simulated items from two test forms using joint maximum likelihood estimation (JMLE) and marginal maximum likelihood estimation (MML) in the vertical equating situation (using an anchor test) when data were nonrandomly missing. Under MML, when the different ability parameters of students were not taken into account, the item difficulty parameters…
Descriptors: Ability, Equated Scores, Estimation (Mathematics), Maximum Likelihood Statistics

Hanson, Bradley A.; Beguin, Anton A. – Applied Psychological Measurement, 2002
Conducted a simulation study of separate versus concurrent item parameter estimation in common item equating using simulation data from a test with 60 dichotomous items and considering: (1) estimation program; (2) sample size per form; (3) number of common items; and (4) equivalent versus nonequivalent groups. Results are not decisive enough to…
Descriptors: Equated Scores, Estimation (Mathematics), Item Response Theory, Scaling

Zeng, Lingjia; And Others – Applied Psychological Measurement, 1994
A general delta method is described for computing the standard error (SE) of a chain of linear equatings. The general delta method derives the SEs directly from the moments of the score distributions obtained in the equating chain. Computer simulations demonstrate the method. (SLD)
Descriptors: Computer Simulation, Equated Scores, Error of Measurement, Statistical Distributions
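To make the notion of an equating chain concrete, the sketch below composes two linear equating functions, each built from the means and standard deviations of the score distributions involved. It shows only the chained conversion whose SE the paper studies, not the delta-method derivation itself. The moments are hypothetical; in a real chain the moments of the intermediate form would be estimated separately in each link.

```python
def linear_equating(mu_x, sd_x, mu_y, sd_y):
    """Return (slope, intercept) of the linear function placing X scores on the Y scale."""
    slope = sd_y / sd_x
    intercept = mu_y - slope * mu_x
    return slope, intercept

def compose(first, second):
    """Compose two linear functions: apply `first`, then `second`."""
    s1, i1 = first
    s2, i2 = second
    return s2 * s1, s2 * i1 + i2

def apply_linear(fn, x):
    """Evaluate a (slope, intercept) pair at score x."""
    slope, intercept = fn
    return slope * x + intercept

# Hypothetical moments for forms X, Y, and Z.
x_to_y = linear_equating(mu_x=50.0, sd_x=10.0, mu_y=52.0, sd_y=8.0)
y_to_z = linear_equating(mu_x=52.0, sd_x=8.0, mu_y=55.0, sd_y=12.0)
x_to_z = compose(x_to_y, y_to_z)

equated = apply_linear(x_to_z, 50.0)  # the X mean maps to the Z mean
```

Since the chained slope and intercept are functions of the sample moments in every link, sampling error in each link propagates to the final conversion, which is the quantity a delta-method SE is meant to capture.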

Zeng, Lingjia – Applied Psychological Measurement, 1995
The effects of different degrees of smoothing on results of equipercentile equating in random groups design using a postsmoothing method based on cubic splines were investigated, and a computer-based procedure was introduced for selecting a desirable degree of smoothing. Results suggest that no particular degree of smoothing was always optimal.…
Descriptors: Computer Simulation, Computer Software, Equated Scores, Research Methodology