Showing 1 to 15 of 26 results
Peer reviewed
Sun, Ting; Kim, Stella Yun – Measurement: Interdisciplinary Research and Perspectives, 2021
In many large testing programs, equipercentile equating has been widely used under a random groups design to adjust test difficulty between forms. However, one thorny issue occurs with equipercentile equating when a particular score has no observed frequency. The purpose of this study is to suggest and evaluate six potential methods in…
Descriptors: Equated Scores, Test Length, Sample Size, Methods
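The equipercentile idea in the abstract above can be sketched in a few lines: map each Form X score to the Form Y score holding the same percentile rank. This is a minimal illustration with invented frequency distributions, not code from the study; note that the zero-frequency complication the authors address surfaces here as ties in the percentile-rank sequence, which breaks the strict monotonicity that the interpolation step assumes.

```python
import numpy as np

def percentile_ranks(freqs):
    """Percentile rank at each score point: cumulative proportion below
    the score plus half the proportion at it (the usual convention)."""
    freqs = np.asarray(freqs, dtype=float)
    props = freqs / freqs.sum()
    cum_below = np.concatenate(([0.0], np.cumsum(props)[:-1]))
    return 100.0 * (cum_below + props / 2.0)

def equipercentile_equate(freq_x, freq_y):
    """Map each Form X score to the Form Y score with the same
    percentile rank, interpolating linearly over Y's ranks."""
    pr_x = percentile_ranks(freq_x)
    pr_y = percentile_ranks(freq_y)
    scores_y = np.arange(len(freq_y))
    # np.interp needs pr_y to be increasing; a Form Y score with zero
    # observed frequency produces a tie -- the problem the study tackles.
    return np.interp(pr_x, pr_y, scores_y)

# Toy 6-point score distributions from two randomly equivalent groups
freq_x = [2, 10, 25, 30, 20, 13]
freq_y = [5, 15, 30, 25, 15, 10]
print(equipercentile_equate(freq_x, freq_y))
```

With all frequencies nonzero, the mapping is smooth and monotone; setting any `freq_y` entry to zero shows the ambiguity the six proposed methods are meant to resolve.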
Peer reviewed
Cui, Zhongmin; He, Yong – Measurement: Interdisciplinary Research and Perspectives, 2023
Careful considerations are necessary when there is a need to choose an anchor test form from a list of old test forms for equating under the random groups design. The choice of the anchor form potentially affects the accuracy of equated scores on new test forms. Few guidelines, however, can be found in the literature on choosing the anchor form.…
Descriptors: Test Format, Equated Scores, Best Practices, Test Construction
Peer reviewed
Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Measurement: Interdisciplinary Research and Perspectives, 2021
This study offers an approach to test equating under the latent D-scoring method (DSM-L) using the nonequivalent groups with anchor tests (NEAT) design. The accuracy of the test equating was examined via a simulation study under a 3 × 3 design by two conditions: group ability at three levels and test difficulty at three levels. The results for…
Descriptors: Equated Scores, Scoring, Test Items, Accuracy
Peer reviewed
Malatesta, Jaime; Lee, Won-Chan – Measurement: Interdisciplinary Research and Perspectives, 2019
This article reviews several software programs designed to conduct item response theory (IRT) scale linking and equating. The programs reviewed include IRTEQ, STUIRT, and POLYEQUATE. Features and functionalities of each program are discussed and an example analysis using the common-item non-equivalent groups design in IRTEQ is provided.
Descriptors: Item Response Theory, Equated Scores, Computer Software, Computer Interfaces
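The programs reviewed (IRTEQ, STUIRT, POLYEQUATE) automate IRT scale linking; as a rough illustration of what such a linking step computes, here is the classical mean/sigma method, which derives a slope and intercept from common-item difficulty estimates obtained in two separate calibrations. The numbers below are invented; real analyses would use the programs the article describes.

```python
import numpy as np

def mean_sigma_linking(b_new, b_old):
    """Mean/sigma linking coefficients placing the new form's IRT scale
    onto the old form's scale, from common-item b estimates."""
    b_new = np.asarray(b_new, dtype=float)
    b_old = np.asarray(b_old, dtype=float)
    A = b_old.std(ddof=1) / b_new.std(ddof=1)   # slope
    B = b_old.mean() - A * b_new.mean()          # intercept
    return A, B

# Hypothetical common-item difficulties from two separate calibrations
b_new = [-1.2, -0.4, 0.1, 0.8, 1.5]
b_old = [-1.0, -0.2, 0.3, 1.0, 1.9]
A, B = mean_sigma_linking(b_new, b_old)
# Rescale new-form parameters: b* = A*b + B, a* = a/A, theta* = A*theta + B
```

By construction the transformed common-item difficulties reproduce the old-scale mean and standard deviation exactly, which is all this simple method guarantees.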
Peer reviewed
Zheng, Xiaying; Yang, Ji Seung – Measurement: Interdisciplinary Research and Perspectives, 2021
The purpose of this paper is to briefly introduce the two most common applications of multiple-group item response theory (IRT) models, namely differential item functioning (DIF) analysis and nonequivalent-group score linking with simultaneous calibration. We illustrate how to conduct those analyses using the "Stata" item…
Descriptors: Item Response Theory, Test Bias, Computer Software, Statistical Analysis
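The Stata commands themselves are not shown in the abstract. As an illustrative stand-in for what a DIF screen computes, here is the classical Mantel-Haenszel common odds ratio, a non-IRT DIF index (deliberately a different technique from the multiple-group IRT approach the paper uses), applied to invented 2x2 tables per matched score stratum.

```python
def mantel_haenszel_dif(strata):
    """Mantel-Haenszel common odds ratio across matched score strata.
    Each stratum is (ref_correct, ref_wrong, focal_correct, focal_wrong);
    a value near 1.0 indicates no DIF on the studied item."""
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den

# Hypothetical counts for one item across three ability strata
strata = [(40, 10, 30, 20), (30, 20, 20, 30), (20, 30, 10, 40)]
alpha_mh = mantel_haenszel_dif(strata)   # > 1 favors the reference group
```

In these made-up counts the reference group outperforms the matched focal group in every stratum, so the index lands well above 1.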
Peer reviewed
Wyse, Adam E. – Measurement: Interdisciplinary Research and Perspectives, 2018
A key part of determining cut-scores when performing Angoff standard setting is utilizing equating methods to place standard-setting ratings onto the scale used to report scores to examinees. This article describes three equating methods that can be employed to place Angoff ratings onto the scale used to report scores to examinees when applying…
Descriptors: Standard Setting (Scoring), Equated Scores, Probability, Regression (Statistics)
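As a hedged sketch of the general Angoff-to-scale idea (not necessarily any of the three specific methods Wyse describes), one can sum each judge's item probability ratings into an expected raw cut score and push the panel mean through an assumed raw-to-scale conversion; all numbers and the conversion function here are invented for illustration.

```python
import numpy as np

def angoff_cut_score(ratings, raw_to_scale):
    """Panel cut score: average over judges of each judge's summed item
    probabilities (the expected raw score of a minimally competent
    examinee), mapped onto the reporting scale."""
    ratings = np.asarray(ratings, dtype=float)   # judges x items
    raw_cuts = ratings.sum(axis=1)               # one raw cut per judge
    return raw_to_scale(raw_cuts.mean())

# Hypothetical ratings: 3 judges x 5 items (probability correct)
ratings = [[0.6, 0.7, 0.5, 0.8, 0.4],
           [0.5, 0.6, 0.6, 0.7, 0.5],
           [0.7, 0.8, 0.4, 0.9, 0.6]]
# Assumed linear raw-to-scale conversion, purely for illustration
scale = lambda raw: 100 + 10 * raw
cut = angoff_cut_score(ratings, scale)
```

The equating question the article addresses is precisely what to use in place of the assumed `scale` function when the reporting scale comes from an operational conversion rather than a known linear rule.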
Peer reviewed
Albano, Anthony D.; Christ, Theodore J.; Cai, Liuhan – Measurement: Interdisciplinary Research and Perspectives, 2018
Traditional psychometric methods have primarily been developed and applied in the context of high-stakes, large-scale testing. However, these methods are increasingly being used with classroom assessments, including progress monitoring measures where numerous test forms are administered over the course of an academic year. This article provides an…
Descriptors: Progress Monitoring, Hierarchical Linear Modeling, Equated Scores, Raw Scores
Peer reviewed
Kane, Michael – Measurement: Interdisciplinary Research and Perspectives, 2015
Michael Kane writes in this article that he is in more or less complete agreement with Professor Koretz's characterization of the problem outlined in the paper published in this issue of "Measurement." Kane agrees that current testing practices are not adequate for test-based accountability (TBA) systems, but he writes that he is far…
Descriptors: Educational Testing, Accountability, Standardized Tests, Equated Scores
Peer reviewed
Koretz, Daniel – Measurement: Interdisciplinary Research and Perspectives, 2015
Accountability has become a primary function of large-scale testing in the United States. The pressure on educators to raise scores is vastly greater than it was several decades ago. Research has shown that high-stakes testing can generate behavioral responses that inflate scores, often severely. I argue that because of these responses, using…
Descriptors: Accountability, Educational Testing, Test Construction, Test Validity
Peer reviewed
Strietholt, Rolf; Rosén, Monica – Measurement: Interdisciplinary Research and Perspectives, 2016
Since the start of the new millennium, international comparative large-scale studies have become one of the most well-known areas in the field of education. However, the International Association for the Evaluation of Educational Achievement (IEA) has already been conducting international comparative studies for about half a century. The present…
Descriptors: Reading Tests, Comparative Analysis, Comparative Education, Trend Analysis
Peer reviewed
Brennan, Robert L. – Measurement: Interdisciplinary Research and Perspectives, 2015
Koretz, in his article published in this issue, provides compelling arguments that the high stakes currently associated with accountability testing lead to behavioral changes in students, teachers, and other stakeholders that often have negative consequences, such as inflated scores. Koretz goes on to argue that these negative consequences require…
Descriptors: Accountability, High Stakes Tests, Behavior Change, Student Behavior
Peer reviewed
Ho, Andrew – Measurement: Interdisciplinary Research and Perspectives, 2013
In his thoughtful focus article, Haertel (this issue) pushes testing experts to broaden the scope of their validation efforts and to invite scholars from other disciplines to join them. He credits existing validation frameworks for helping the measurement community to identify incomplete or nonexistent validity arguments. However, he notes his…
Descriptors: Educational Testing, Scores, Test Use, Test Validity
Peer reviewed
Raykov, Tenko – Measurement: Interdisciplinary Research and Perspectives, 2010
Mroch, Suh, Kane, & Ripkey (2009); Suh, Mroch, Kane, & Ripkey (2009); and Kane, Mroch, Suh, & Ripkey (2009) provided elucidating discussions on critical properties of linear equating methods under the nonequivalent groups with anchor test (NEAT) design. In this popular equating design, two test forms are administered to different…
Descriptors: Equated Scores, Test Items, Factor Analysis, Models
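The linear-equating machinery under the NEAT design that this set of papers examines can be illustrated with the chained linear method, one of the standard variants: equate Form X to the anchor V in the group that took the new form, then V to Form Y in the group that took the old form, and compose the two linear functions. The summary statistics below are invented.

```python
def chained_linear_equate(x, stats):
    """Chained linear equating under the NEAT design.
    Group 1 took Form X plus anchor V; Group 2 took Form Y plus anchor V."""
    # Step 1: linear function from X to V, estimated in Group 1
    v = stats["mu_v1"] + (stats["sd_v1"] / stats["sd_x1"]) * (x - stats["mu_x1"])
    # Step 2: linear function from V to Y, estimated in Group 2
    return stats["mu_y2"] + (stats["sd_y2"] / stats["sd_v2"]) * (v - stats["mu_v2"])

# Hypothetical summary statistics for the two nonequivalent groups
stats = dict(mu_x1=50.0, sd_x1=10.0, mu_v1=20.0, sd_v1=5.0,
             mu_y2=52.0, sd_y2=9.0, mu_v2=21.0, sd_v2=4.5)
y = chained_linear_equate(55.0, stats)
```

Other NEAT-design linear methods (e.g. Tucker) instead synthesize a single target population via regressions on the anchor; the papers above compare the assumptions behind those choices.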
Peer reviewed
Brennan, Robert L. – Measurement: Interdisciplinary Research and Perspectives, 2010
This excellent set of papers is comprehensive and very well written. The Kane et al. paper lays out the theory for linear equating with the NEAT design using a clever but simple framework. The Suh et al. paper is an excellent empirical study of the various methods. The Mroch et al. paper provides an insightful evaluation of the methods as…
Descriptors: Equated Scores, Evaluation Methods, Psychometrics, Models
Peer reviewed
Kane, Michael T.; Mroch, Andrew A.; Suh, Youngsuk; Ripkey, Douglas R. – Measurement: Interdisciplinary Research and Perspectives, 2010
This article presents the authors' rejoinder to commentaries on linear equating and the NEAT design. The authors appreciate the insightful work of the commentary writers. Each has made a number of interesting points, many of which the authors had not considered at all. Before responding to some of those points, the authors reiterate what they see…
Descriptors: Weighted Scores, Equated Scores, Models, Scores