Publication Date
In 2025 | 4 |
Since 2024 | 8 |
Since 2021 (last 5 years) | 46 |
Since 2016 (last 10 years) | 119 |
Since 2006 (last 20 years) | 313 |
Author
Kim, Sooyeon | 22 |
von Davier, Alina A. | 21 |
Dorans, Neil J. | 16 |
Liu, Jinghua | 16 |
Kolen, Michael J. | 15 |
Livingston, Samuel A. | 15 |
Moses, Tim | 14 |
Puhan, Gautam | 13 |
Holland, Paul W. | 11 |
Lee, Won-Chan | 11 |
Walker, Michael E. | 10 |
Audience
Researchers | 35 |
Practitioners | 2 |
Location
Turkey | 6 |
Canada | 5 |
Australia | 4 |
California | 4 |
Florida | 4 |
Japan | 4 |
Israel | 3 |
Oregon | 3 |
Sweden | 3 |
Texas | 3 |
United Kingdom | 3 |
Laws, Policies, & Programs
Elementary and Secondary… | 9 |
Education Consolidation… | 1 |
No Child Left Behind Act 2001 | 1 |
Tom Benton – Practical Assessment, Research & Evaluation, 2025
This paper proposes an extension of linear equating that may be useful in one of two fairly common assessment scenarios. One is where different students have taken different combinations of test forms. This might occur, for example, where students have some free choice over the exam papers they take within a particular qualification. In this…
Descriptors: Equated Scores, Test Format, Test Items, Computation
Jianbin Fu; TsungHan Ho; Xuan Tan – Practical Assessment, Research & Evaluation, 2025
Item parameter estimation using an item response theory (IRT) model with fixed ability estimates is useful in equating with small samples on anchor items. The current study explores the impact of three ability estimation methods (weighted likelihood estimation [WLE], maximum a posteriori [MAP], and posterior ability distribution estimation [PST])…
Descriptors: Item Response Theory, Test Items, Computation, Equated Scores
Yusuf Kara; Akihito Kamata; Xin Qiao; Cornelis J. Potgieter; Joseph F. T. Nese – Educational and Psychological Measurement, 2024
Words read correctly per minute (WCPM) is the reporting score metric in oral reading fluency (ORF) assessments, which is popularly utilized as part of curriculum-based measurements to screen at-risk readers and to monitor progress of students who receive interventions. Just like other types of assessments with multiple forms, equating would be…
Descriptors: Oral Reading, Reading Fluency, Models, Reading Rate
Ting Sun; Stella Yun Kim – Educational and Psychological Measurement, 2024
Equating is a statistical procedure used to adjust for the difference in form difficulty such that scores on those forms can be used and interpreted comparably. In practice, however, equating methods are often implemented without considering the extent to which two forms differ in difficulty. The study aims to examine the effect of the magnitude…
Descriptors: Difficulty Level, Data Interpretation, Equated Scores, High School Students
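Note: as a rough illustration of the kind of difficulty adjustment the abstract above describes, the following minimal sketch applies classical linear (mean-sigma) equating, which maps a Form X score to the Form Y scale by matching means and standard deviations. It is not the procedure evaluated in the study; the function name `linear_equate` and the score lists are hypothetical, and a random groups design (equivalent examinee groups on both forms) is assumed.

```python
import statistics


def linear_equate(x, scores_x, scores_y):
    """Map a Form X score onto the Form Y scale by matching means and SDs.

    Assumes both score samples come from equivalent groups (random groups
    design); this is an illustrative sketch, not any cited study's method.
    """
    mu_x = statistics.mean(scores_x)
    mu_y = statistics.mean(scores_y)
    sd_x = statistics.pstdev(scores_x)
    sd_y = statistics.pstdev(scores_y)
    if sd_x == 0:
        raise ValueError("Form X scores have zero variance; cannot equate")
    # l_Y(x) = mu_Y + (sd_Y / sd_X) * (x - mu_X)
    return mu_y + (sd_y / sd_x) * (x - mu_x)


# Hypothetical raw scores on two forms of differing difficulty and spread.
form_x = [10, 12, 14, 16, 18]
form_y = [13, 16, 19, 22, 25]

equated = linear_equate(16, form_x, form_y)
print(round(equated, 2))
```

In this made-up data the Form Y scores are more spread out and higher on average, so a Form X score above the Form X mean maps to a proportionally higher point above the Form Y mean.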
Jiang, Zhehan; Han, Yuting; Xu, Lingling; Shi, Dexin; Liu, Ren; Ouyang, Jinying; Cai, Fen – Educational and Psychological Measurement, 2023
The responses that are absent in the nonequivalent groups with anchor test (NEAT) design can be treated as a planned missing data scenario. In the context of small sample sizes, we present a machine learning (ML)-based imputation technique called chaining random forests (CRF) to perform equating tasks within the NEAT design. Specifically, seven…
Descriptors: Test Items, Equated Scores, Sample Size, Artificial Intelligence
Yusuf Kara; Akihito Kamata; Xin Qiao; Cornelis J. Potgieter; Joseph F. T. Nese – Grantee Submission, 2023
Words read correctly per minute (WCPM) is the reporting score metric in oral reading fluency (ORF) assessments, which is popularly utilized as part of curriculum-based measurements to screen at-risk readers and to monitor progress of students who receive interventions. Just like other types of assessments with multiple forms, equating would be…
Descriptors: Oral Reading, Reading Fluency, Models, Reading Rate
Kim, Stella Y.; Lee, Won-Chan – Journal of Educational Measurement, 2023
The current study proposed several variants of simple-structure multidimensional item response theory equating procedures. Four distinct sets of data were used to demonstrate feasibility of proposed equating methods for two different equating designs: a random groups design and a common-item nonequivalent groups design. Findings indicated some…
Descriptors: Item Response Theory, Equated Scores, Monte Carlo Methods, Research Methodology
Moses, Tim – Journal of Educational Measurement, 2022
One result of recent changes in testing is that previously established linking frameworks may not adequately address challenges in current linking situations. Test linking through equating, concordance, vertical scaling or battery scaling may not represent linkings for the scores of tests developed to measure constructs differently for different…
Descriptors: Measures (Individuals), Educational Assessment, Test Construction, Comparative Analysis
Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025
While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…
Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity
Sun, Ting; Kim, Stella Yun – Measurement: Interdisciplinary Research and Perspectives, 2021
In many large testing programs, equipercentile equating has been widely used under a random groups design to adjust test difficulty between forms. However, one thorny issue occurs with equipercentile equating when a particular score has no observed frequency. The purpose of this study is to suggest and evaluate six potential methods in…
Descriptors: Equated Scores, Test Length, Sample Size, Methods
Fellinghauer, Carolina; Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2023
This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person design. Practical implications of the simulation…
Descriptors: True Scores, Equated Scores, Test Items, Sample Size
Semih Asiret; Seçil Ömür Sünbül – International Journal of Psychology and Educational Studies, 2023
This study aimed to examine the effect of missing data of different patterns and sizes on test equating methods under the NEAT design for different factors. For this purpose, factors such as sample size, average difficulty level difference between the test forms, difference between the ability distribution,…
Descriptors: Research Problems, Data, Test Items, Equated Scores
Li, Dongmei; Kapoor, Shalini – Educational Measurement: Issues and Practice, 2022
Population invariance is a desirable property of test equating which might not hold when significant changes occur in the test population, such as those brought about by the COVID-19 pandemic. This research aims to investigate whether equating functions are reasonably invariant when the test population is impacted by the pandemic. Based on…
Descriptors: Test Items, Equated Scores, COVID-19, Pandemics
Kim, Sooyeon; Walker, Michael E. – Educational Measurement: Issues and Practice, 2022
Test equating requires collecting data to link the scores from different forms of a test. Problems arise when equating samples are not equivalent and the test forms to be linked share no common items by which to measure or adjust for the group nonequivalence. Using data from five operational test forms, we created five pairs of research forms for…
Descriptors: Ability, Tests, Equated Scores, Testing Problems
Zeynep Uzun; Tuncay Ögretmen – Large-scale Assessments in Education, 2025
This study aimed to evaluate the item model fit by equating the forms of the PISA 2018 mathematics subtest with concurrent common items equating in samples from Türkiye, the UK, and Italy. The answers given in mathematics subtest Forms 2, 8, and 12 were used in this context. Analyses were performed using the Dichotomous Rasch Model in the WINSTEPS…
Descriptors: Item Response Theory, Test Items, Foreign Countries, Mathematics Tests