
Huynh, Huynh; Ferrara, Steven – Journal of Educational Measurement, 1994
Equal percentile (EP) and partial credit (PC) equatings for raw scores from performance-based assessments with free-response items are compared through the use of data from the Maryland School Performance Assessment Program. Results suggest that EP and PC methods do not give equivalent results when distributions are markedly skewed. (SLD)
Descriptors: Comparative Analysis, Equated Scores, Mathematics Tests, Performance Based Assessment

Baker, Frank B.; Al-Karni, Ali – Journal of Educational Measurement, 1991
Two methods of computing test equating coefficients under item response theory by the following authors are compared: (1) B. H. Loyd and H. D. Hoover (1980); and (2) M. L. Stocking and F. M. Lord (1983). Conditions under which the method of Stocking and Lord is preferable are described. (SLD)
Descriptors: Ability, College Entrance Examinations, Comparative Analysis, Equated Scores
Wang, Xiang-bo; And Others – 1993
An increasingly popular test format allows examinees to choose the items they will answer from among a larger set. When examinee choice is allowed, fairness requires that the different test forms thus formed be equated for their possible differential difficulty. For this equating to be possible, it is necessary to know how well examinees would have…
Descriptors: Adaptive Testing, Advanced Placement, Difficulty Level, Equated Scores
Kim, Seock-Ho; Cohen, Allan S. – 1996
Applications of item response theory to practical testing problems, including equating, differential item functioning, and computerized adaptive testing, require that item parameter estimates be placed onto a common metric. In this study, three methods for developing a common metric under item response theory are compared: (1) linking separate…
Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Difficulty Level
O'Neill, Thomas R.; Lunz, Mary E. – 1996
To generalize test results beyond the particular test administration, an examinee's ability estimate must be independent of the particular items attempted, and the item difficulty calibrations must be independent of the particular sample of people attempting the items. This stability is a key concept of the Rasch model, a latent trait model of…
Descriptors: Ability, Benchmarking, Comparative Analysis, Difficulty Level
Livingston, Samuel A. – 1992
This study investigated the extent to which log-linear smoothing could improve the accuracy of common-item equating by the chained equipercentile method in small samples of examinees. Examinee response data from a 100-item test (the Advanced Placement Examination in United States History) were used to create two overlapping forms of 58 items each,…
Descriptors: Advanced Placement Programs, College Entrance Examinations, Equated Scores, High School Students
Wang, Tianyou; Kolen, Michael J. – 1994
In this paper a quadratic curve equating method for different test forms under a random-group data-collection design is proposed. Procedures for implementing this method and related issues are described and discussed. The quadratic-curve method was evaluated with real test data (from two 30-item subtests for a professional licensure examination…
Descriptors: Comparative Analysis, Data Collection, Equated Scores, Goodness of Fit
Baghi, Heibatollah – 1990
The Maryland Functional Testing Program (MFTP) uses the Rasch model as the statistical framework for the analysis of test items and scores. This paper is designed to assist the reader in developing an understanding of the fit statistics in the Rasch model. Background materials on application of the Rasch model in statistical analysis of the MFTP…
Descriptors: Computer Assisted Testing, Computer Software, Equated Scores, Error of Measurement
Kubiak, Anna T.; Cowell, William R. – 1990
A procedure used to average several Mantel-Haenszel delta difference values for an item is described and evaluated. The differential item functioning (DIF) procedure used by the Educational Testing Service (ETS) is based on the Mantel-Haenszel statistical technique for studying matched groups. It is standard procedure at ETS to analyze test items…
Descriptors: Difficulty Level, Elementary Secondary Education, Equated Scores, Item Bias
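Note on the Mantel-Haenszel delta difference mentioned in the abstract above: the following is a minimal sketch of how a single MH delta difference (often written MH D-DIF) is conventionally computed from 2x2 tables at each matched score level. It does not reproduce the averaging procedure that the paper evaluates. The function name `mh_d_dif` and the tuple layout are assumptions made for illustration only.

```python
import math

def mh_d_dif(tables):
    """Mantel-Haenszel delta difference for one item.

    `tables` is a list of (a, b, c, d) counts, one tuple per matched
    score level (hypothetical layout chosen for this sketch):
      a = reference group, item right   b = reference group, item wrong
      c = focal group, item right       d = focal group, item wrong
    """
    num = 0.0  # sum of a*d/n over score levels
    den = 0.0  # sum of b*c/n over score levels
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        num += a * d / n
        den += b * c / n
    if num <= 0 or den <= 0:
        raise ValueError("common odds ratio is undefined for these counts")
    alpha_mh = num / den  # MH common odds ratio estimate
    # Convert to the delta scale; negative values favor the reference group.
    return -2.35 * math.log(alpha_mh)

# Two matched score levels with made-up counts
print(mh_d_dif([(20, 10, 10, 20), (30, 10, 20, 20)]))
```

With identical odds in both groups at every level the common odds ratio is 1 and the statistic is 0.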
Hills, John R.; And Others – 1987
The 1986 scores from the Statewide Student Assessment Test-II, a minimum-competency test required for high school graduation in Florida, were placed on the scale of the 1984 scores from that test using five different equating procedures: (1) linear method; (2) Rasch model; (3) three-parameter item response theory (IRT)--concurrent method; (4)…
Descriptors: Comparative Testing, Cost Effectiveness, Equated Scores, Feasibility Studies
Skaggs, Gary; Lissitz, Robert W. – 1985
This study examined how four commonly used test equating procedures (linear, equipercentile, Rasch Model, and three-parameter) would respond to situations in which the properties of the two tests being equated were different. Data for two tests plus an external anchor test were generated from a three-parameter model in which mean test differences…
Descriptors: Computer Simulation, Equated Scores, Error of Measurement, Goodness of Fit
Cope, Ronald T. – 1986
Comparisons were made of three Angoff Design V linear equating methods (two forms equated to a common test, two forms predicted by a common test, or two forms used to predict a common test) and Tucker's and R. Levine's linear methods, under common item linear equating with non-equivalent populations. Forms of a professional certification test…
Descriptors: Certification, Comparative Analysis, Equated Scores, Higher Education
Mislevy, Robert J.; Bock, R. Darrell – 1984
New legislation in 1972 shifted the emphasis of the California Assessment Program (CAP) from traditional every-pupil achievement testing to a more efficient multiple-matrix testing design, under which a broad spectrum of skills could be surveyed without undue expenditure of educational resources. Scale score reporting was introduced to the grade 6…
Descriptors: Achievement Tests, Basic Skills, Equated Scores, Grade 6
Storlie, Theodore R.; And Others – 1979
Three local norm methods for deriving Elementary and Secondary Education Act Title I normal curve equivalent (NCE) gains are compared to each other and to the usual Model A1, which uses national norms. The 1977-78 Title I evaluation data for Chicago students ages 9-13 were used to estimate gains. Reading and total math scores from the 1971 Iowa Tests…
Descriptors: Elementary Education, Elementary School Mathematics, Equated Scores, Local Norms

Marco, Gary L.; And Others – 1979
Data from the verbal portion of the College Entrance Examination Board Scholastic Aptitude Tests were used in an experimental test of the accuracy of equating for a variety of models in three categories: linear equating, equipercentile equating, and item characteristic curve equating. The models were tested for both mean squared error and bias.…
Descriptors: Aptitude Tests, Equated Scores, Error of Measurement, High Schools
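Note on linear equating, one of the model categories named in the abstract above: the following is a minimal sketch of the textbook mean-sigma form under a random-groups design, in which a Form X score is mapped to the Form Y scale by matching means and standard deviations. It is not the specific set of models tested in the study. The function name `linear_equate` and the sample scores are assumptions made for illustration only.

```python
from statistics import mean, pstdev

def linear_equate(x, scores_x, scores_y):
    """Map a Form X raw score to the Form Y scale.

    Uses y = mu_Y + (sigma_Y / sigma_X) * (x - mu_X), with moments
    estimated from two score samples assumed to come from randomly
    equivalent groups (an assumption of this sketch).
    """
    mu_x, mu_y = mean(scores_x), mean(scores_y)
    sd_x, sd_y = pstdev(scores_x), pstdev(scores_y)
    if sd_x == 0:
        raise ValueError("Form X scores have zero variance")
    return mu_y + (sd_y / sd_x) * (x - mu_x)

# Made-up score samples for two forms
form_x = [10, 20, 30]
form_y = [20, 40, 60]
print(linear_equate(25, form_x, form_y))
```

Equipercentile equating, by contrast, matches percentile ranks rather than only the first two moments, so the two methods can diverge when score distributions differ in shape.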