Publication Date
| In 2026 | 0 |
| Since 2025 | 7 |
| Since 2022 (last 5 years) | 42 |
| Since 2017 (last 10 years) | 126 |
| Since 2007 (last 20 years) | 479 |
Descriptor
Source
Author
| Bianchini, John C. | 35 |
| von Davier, Alina A. | 34 |
| Dorans, Neil J. | 33 |
| Kolen, Michael J. | 31 |
| Loret, Peter G. | 31 |
| Kim, Sooyeon | 26 |
| Moses, Tim | 24 |
| Livingston, Samuel A. | 22 |
| Holland, Paul W. | 20 |
| Puhan, Gautam | 20 |
| Liu, Jinghua | 19 |
| More ▼ | |
Publication Type
Education Level
Location
| Canada | 9 |
| Australia | 8 |
| Florida | 8 |
| United Kingdom (England) | 8 |
| Netherlands | 7 |
| New York | 7 |
| United States | 7 |
| Israel | 6 |
| Turkey | 6 |
| United Kingdom | 6 |
| California | 5 |
| More ▼ | |
Laws, Policies, & Programs
| Elementary and Secondary… | 12 |
| No Child Left Behind Act 2001 | 5 |
| Education Consolidation… | 3 |
| Hawkins Stafford Act 1988 | 1 |
| Race to the Top | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
Dorans, Neil J. – ETS Research Report Series, 2014
Simulations are widely used. Simulations produce numbers that are deductive demonstrations of what a model says will happen.They produce numerical results that are consistent with the premises of the model used to generate the numbers. These simulated numerical results are not empirical data that address aspects of the world that lies outside the…
Descriptors: Simulation, Equated Scores, Scores, Scientific Methodology
Rutkowski, Leslie; Svetina, Dubravka – Educational and Psychological Measurement, 2014
In the field of international educational surveys, equivalence of achievement scale scores across countries has received substantial attention in the academic literature; however, only a relatively recent emphasis on scale score equivalence in nonachievement education surveys has emerged. Given the current state of research in multiple-group…
Descriptors: International Programs, Educational Assessment, Surveys, Measurement
Kim, YoungKoung; DeCarlo, Lawrence T. – College Board, 2016
Because of concerns about test security, different test forms are typically used across different testing occasions. As a result, equating is necessary in order to get scores from the different test forms that can be used interchangeably. In order to assure the quality of equating, multiple equating methods are often examined. Various equity…
Descriptors: Equated Scores, Evaluation Methods, Sampling, Statistical Inference
Brennan, Robert L. – Measurement: Interdisciplinary Research and Perspectives, 2015
Koretz, in his article published in this issue, provides compelling arguments that the high stakes currently associated with accountability testing lead to behavioral changes in students, teachers, and other stakeholders that often have negative consequences, such as inflated scores. Koretz goes on to argue that these negative consequences require…
Descriptors: Accountability, High Stakes Tests, Behavior Change, Student Behavior
Dorans, Neil J.; Middleton, Kyndra – Journal of Educational Measurement, 2012
The interpretability of score comparisons depends on the design and execution of a sound data collection plan and the establishment of linkings between these scores. When comparisons are made between scores from two or more assessments that are built to different specifications and are administered to different populations under different…
Descriptors: Tests, Equated Scores, Test Interpretation, Validity
Babcock, Ben; Albano, Anthony; Raymond, Mark – Educational and Psychological Measurement, 2012
The authors introduced nominal weights mean equating, a simplified version of Tucker equating, as an alternative for dealing with very small samples. The authors then conducted three simulation studies to compare nominal weights mean equating to six other equating methods under the nonequivalent groups anchor test design with sample sizes of 20,…
Descriptors: Equated Scores, Methods, Sample Size, Simulation
Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014
In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…
Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory
Guo, Hongwen; Oh, Hyeonjoo J.; Eignor, Daniel – Journal of Educational Measurement, 2013
In operational equating situations, frequency estimation equipercentile equating is considered only when the old and new groups have similar abilities. The frequency estimation assumptions are investigated in this study under various situations from both the levels of theoretical interest and practical use. It shows that frequency estimation…
Descriptors: Equated Scores, Computation, Statistical Analysis, Test Items
Taherbhai, Husein; Seo, Daeryong – Educational Measurement: Issues and Practice, 2013
Calibration and equating is the quintessential necessity for most large-scale educational assessments. However, there are instances when no consideration is given to the equating process in terms of context and substantive realization, and the methods used in its execution. In the view of the authors, equating is not merely an exhibit of the…
Descriptors: Item Response Theory, Equated Scores, Measurement, Educational Assessment
Moses, Tim – Educational Measurement: Issues and Practice, 2014
This module describes and extends X-to-Y regression measures that have been proposed for use in the assessment of X-to-Y scaling and equating results. Measures are developed that are similar to those based on prediction error in regression analyses but that are directly suited to interests in scaling and equating evaluations. The regression and…
Descriptors: Scaling, Regression (Statistics), Equated Scores, Comparative Analysis
Huggins, Anne C.; Penfield, Randall D. – Educational Measurement: Issues and Practice, 2012
A goal for any linking or equating of two or more tests is that the linking function be invariant to the population used in conducting the linking or equating. Violations of population invariance in linking and equating jeopardize the fairness and validity of test scores, and pose particular problems for test-based accountability programs that…
Descriptors: Equated Scores, Tests, Test Bias, Validity
Guo, Hongwen; Puhan, Gautam; Walker, Michael – ETS Research Report Series, 2013
In this study we investigated when an equating conversion line is problematic in terms of gaps and clumps. We suggest using the conditional standard error of measurement (CSEM) to measure the scale scores that are inappropriate in the overall raw-to-scale transformation.
Descriptors: Equated Scores, Test Items, Evaluation Criteria, Error of Measurement
van der Linden, Wim J. – Journal of Educational Measurement, 2013
In spite of all of the technical progress in observed-score equating, several of the more conceptual aspects of the process still are not well understood. As a result, the equating literature struggles with rather complex criteria of equating, lack of a test-theoretic foundation, confusing terminology, and ad hoc analyses. A return to Lord's…
Descriptors: Equated Scores, Statistical Analysis, Computation, Data Collection
Ho, Andrew – Measurement: Interdisciplinary Research and Perspectives, 2013
In his thoughtful focus article, Haertel (this issue) pushes testing experts to broaden the scope of their validation efforts and to invite scholars from other disciplines to join them. He credits existing validation frameworks for helping the measurement community to identify incomplete or nonexistent validity arguments. However, he notes his…
Descriptors: Educational Testing, Scores, Test Use, Test Validity
Cheng, Ying; Chen, Peihua; Qian, Jiahe; Chang, Hua-Hua – Applied Psychological Measurement, 2013
Differential item functioning (DIF) analysis is an important step in the data analysis of large-scale testing programs. Nowadays, many such programs endorse matrix sampling designs to reduce the load on examinees, such as the balanced incomplete block (BIB) design. These designs pose challenges to the traditional DIF analysis methods. For example,…
Descriptors: Test Bias, Equated Scores, Test Items, Effect Size

Peer reviewed
Direct link
