Showing 1 to 15 of 26 results
Peer reviewed
Sun, Ting; Kim, Stella Yun – Measurement: Interdisciplinary Research and Perspectives, 2021
In many large testing programs, equipercentile equating has been widely used under a random groups design to adjust test difficulty between forms. However, one thorny issue occurs with equipercentile equating when a particular score has no observed frequency. The purpose of this study is to suggest and evaluate six potential methods in…
Descriptors: Equated Scores, Test Length, Sample Size, Methods
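The equipercentile idea in the abstract above can be sketched in a few lines: map each Form X score to the Form Y score holding the same percentile rank. This is a minimal illustration with invented frequency distributions, not code from the study; note that the zero-frequency complication the authors address surfaces here as ties in the percentile-rank sequence, which breaks the strict monotonicity that the interpolation step assumes.

```python
import numpy as np

def percentile_ranks(freqs):
    """Percentile rank at each score point: cumulative proportion below
    the score plus half the proportion at it (the usual convention)."""
    freqs = np.asarray(freqs, dtype=float)
    props = freqs / freqs.sum()
    cum_below = np.concatenate(([0.0], np.cumsum(props)[:-1]))
    return 100.0 * (cum_below + props / 2.0)

def equipercentile_equate(freq_x, freq_y):
    """Map each Form X score to the Form Y score with the same
    percentile rank, interpolating linearly over Y's ranks."""
    pr_x = percentile_ranks(freq_x)
    pr_y = percentile_ranks(freq_y)
    scores_y = np.arange(len(freq_y))
    # np.interp needs pr_y to be increasing; a Form Y score with zero
    # observed frequency produces a tie -- the problem the study tackles.
    return np.interp(pr_x, pr_y, scores_y)

# Toy 6-point score distributions from two randomly equivalent groups
freq_x = [2, 10, 25, 30, 20, 13]
freq_y = [5, 15, 30, 25, 15, 10]
print(equipercentile_equate(freq_x, freq_y))
```

With all frequencies nonzero, the mapping is smooth and monotone; setting any `freq_y` entry to zero shows the ambiguity the six proposed methods are meant to resolve.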
Peer reviewed
Cui, Zhongmin; He, Yong – Measurement: Interdisciplinary Research and Perspectives, 2023
Careful considerations are necessary when there is a need to choose an anchor test form from a list of old test forms for equating under the random groups design. The choice of the anchor form potentially affects the accuracy of equated scores on new test forms. Few guidelines, however, can be found in the literature on choosing the anchor form.…
Descriptors: Test Format, Equated Scores, Best Practices, Test Construction
Peer reviewed
Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Measurement: Interdisciplinary Research and Perspectives, 2021
This study offers an approach to test equating under the latent D-scoring method (DSM-L) using the nonequivalent groups with anchor tests (NEAT) design. The accuracy of the test equating was examined via a simulation study under a 3 × 3 design by two conditions: group ability at three levels and test difficulty at three levels. The results for…
Descriptors: Equated Scores, Scoring, Test Items, Accuracy
Peer reviewed
Malatesta, Jaime; Lee, Won-Chan – Measurement: Interdisciplinary Research and Perspectives, 2019
This article reviews several software programs designed to conduct item response theory (IRT) scale linking and equating. The programs reviewed include IRTEQ, STUIRT, and POLYEQUATE. Features and functionalities of each program are discussed and an example analysis using the common-item non-equivalent groups design in IRTEQ is provided.
Descriptors: Item Response Theory, Equated Scores, Computer Software, Computer Interfaces
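The programs reviewed (IRTEQ, STUIRT, POLYEQUATE) automate IRT scale linking; as a rough illustration of what such a linking step computes, here is the classical mean/sigma method, which derives a slope and intercept from common-item difficulty estimates obtained in two separate calibrations. The numbers below are invented; real analyses would use the programs the article describes.

```python
import numpy as np

def mean_sigma_linking(b_new, b_old):
    """Mean/sigma linking coefficients placing the new form's IRT scale
    onto the old form's scale, from common-item b estimates."""
    b_new = np.asarray(b_new, dtype=float)
    b_old = np.asarray(b_old, dtype=float)
    A = b_old.std(ddof=1) / b_new.std(ddof=1)   # slope
    B = b_old.mean() - A * b_new.mean()          # intercept
    return A, B

# Hypothetical common-item difficulties from two separate calibrations
b_new = [-1.2, -0.4, 0.1, 0.8, 1.5]
b_old = [-1.0, -0.2, 0.3, 1.0, 1.9]
A, B = mean_sigma_linking(b_new, b_old)
# Rescale new-form parameters: b* = A*b + B, a* = a/A, theta* = A*theta + B
```

By construction the transformed common-item difficulties reproduce the old-scale mean and standard deviation exactly, which is all this simple method guarantees.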
Peer reviewed
Zheng, Xiaying; Yang, Ji Seung – Measurement: Interdisciplinary Research and Perspectives, 2021
The purpose of this paper is to briefly introduce the two most common applications of multiple-group item response theory (IRT) models, namely differential item functioning (DIF) analysis and nonequivalent-group score linking with simultaneous calibration. We illustrate how to conduct those analyses using the "Stata" item…
Descriptors: Item Response Theory, Test Bias, Computer Software, Statistical Analysis
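The Stata commands themselves are not shown in the abstract. As an illustrative stand-in for what a DIF screen computes, here is the classical Mantel-Haenszel common odds ratio, a non-IRT DIF index (deliberately a different technique from the multiple-group IRT approach the paper uses), applied to invented 2x2 tables per matched score stratum.

```python
def mantel_haenszel_dif(strata):
    """Mantel-Haenszel common odds ratio across matched score strata.
    Each stratum is (ref_correct, ref_wrong, focal_correct, focal_wrong);
    a value near 1.0 indicates no DIF on the studied item."""
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den

# Hypothetical counts for one item across three ability strata
strata = [(40, 10, 30, 20), (30, 20, 20, 30), (20, 30, 10, 40)]
alpha_mh = mantel_haenszel_dif(strata)   # > 1 favors the reference group
```

In these made-up counts the reference group outperforms the matched focal group in every stratum, so the index lands well above 1.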
Peer reviewed
Wyse, Adam E. – Measurement: Interdisciplinary Research and Perspectives, 2018
A key part of determining cut-scores when performing Angoff standard setting is utilizing equating methods to place standard-setting ratings onto the scale used to report scores to examinees. This article describes three equating methods that can be employed to place Angoff ratings onto the scale used to report scores to examinees when applying…
Descriptors: Standard Setting (Scoring), Equated Scores, Probability, Regression (Statistics)
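As a hedged sketch of the general Angoff-to-scale idea (not necessarily any of the three specific methods Wyse describes), one can sum each judge's item probability ratings into an expected raw cut score and push the panel mean through an assumed raw-to-scale conversion; all numbers and the conversion function here are invented for illustration.

```python
import numpy as np

def angoff_cut_score(ratings, raw_to_scale):
    """Panel cut score: average over judges of each judge's summed item
    probabilities (the expected raw score of a minimally competent
    examinee), mapped onto the reporting scale."""
    ratings = np.asarray(ratings, dtype=float)   # judges x items
    raw_cuts = ratings.sum(axis=1)               # one raw cut per judge
    return raw_to_scale(raw_cuts.mean())

# Hypothetical ratings: 3 judges x 5 items (probability correct)
ratings = [[0.6, 0.7, 0.5, 0.8, 0.4],
           [0.5, 0.6, 0.6, 0.7, 0.5],
           [0.7, 0.8, 0.4, 0.9, 0.6]]
# Assumed linear raw-to-scale conversion, purely for illustration
scale = lambda raw: 100 + 10 * raw
cut = angoff_cut_score(ratings, scale)
```

The equating question the article addresses is precisely what to use in place of the assumed `scale` function when the reporting scale comes from an operational conversion rather than a known linear rule.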
Peer reviewed
Albano, Anthony D.; Christ, Theodore J.; Cai, Liuhan – Measurement: Interdisciplinary Research and Perspectives, 2018
Traditional psychometric methods have primarily been developed and applied in the context of high-stakes, large-scale testing. However, these methods are increasingly being used with classroom assessments, including progress monitoring measures where numerous test forms are administered over the course of an academic year. This article provides an…
Descriptors: Progress Monitoring, Hierarchical Linear Modeling, Equated Scores, Raw Scores
Peer reviewed
Kane, Michael – Measurement: Interdisciplinary Research and Perspectives, 2015
Michael Kane writes in this article that he is in more or less complete agreement with Professor Koretz's characterization of the problem outlined in the paper published in this issue of "Measurement." Kane agrees that current testing practices are not adequate for test-based accountability (TBA) systems, but he writes that he is far…
Descriptors: Educational Testing, Accountability, Standardized Tests, Equated Scores
Peer reviewed
Koretz, Daniel – Measurement: Interdisciplinary Research and Perspectives, 2015
Accountability has become a primary function of large-scale testing in the United States. The pressure on educators to raise scores is vastly greater than it was several decades ago. Research has shown that high-stakes testing can generate behavioral responses that inflate scores, often severely. I argue that because of these responses, using…
Descriptors: Accountability, Educational Testing, Test Construction, Test Validity
Peer reviewed
Strietholt, Rolf; Rosén, Monica – Measurement: Interdisciplinary Research and Perspectives, 2016
Since the start of the new millennium, international comparative large-scale studies have become one of the most well-known areas in the field of education. However, the International Association for the Evaluation of Educational Achievement (IEA) has already been conducting international comparative studies for about half a century. The present…
Descriptors: Reading Tests, Comparative Analysis, Comparative Education, Trend Analysis
Peer reviewed
Brennan, Robert L. – Measurement: Interdisciplinary Research and Perspectives, 2015
Koretz, in his article published in this issue, provides compelling arguments that the high stakes currently associated with accountability testing lead to behavioral changes in students, teachers, and other stakeholders that often have negative consequences, such as inflated scores. Koretz goes on to argue that these negative consequences require…
Descriptors: Accountability, High Stakes Tests, Behavior Change, Student Behavior
Peer reviewed
Ho, Andrew – Measurement: Interdisciplinary Research and Perspectives, 2013
In his thoughtful focus article, Haertel (this issue) pushes testing experts to broaden the scope of their validation efforts and to invite scholars from other disciplines to join them. He credits existing validation frameworks for helping the measurement community to identify incomplete or nonexistent validity arguments. However, he notes his…
Descriptors: Educational Testing, Scores, Test Use, Test Validity
Peer reviewed
Raykov, Tenko – Measurement: Interdisciplinary Research and Perspectives, 2010
Mroch, Suh, Kane, & Ripkey (2009); Suh, Mroch, Kane, & Ripkey (2009); and Kane, Mroch, Suh, & Ripkey (2009) provided elucidating discussions on critical properties of linear equating methods under the nonequivalent groups with anchor test (NEAT) design. In this popular equating design, two test forms are administered to different…
Descriptors: Equated Scores, Test Items, Factor Analysis, Models
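The linear-equating machinery under the NEAT design that this set of papers examines can be illustrated with the chained linear method, one of the standard variants: equate Form X to the anchor V in the group that took the new form, then V to Form Y in the group that took the old form, and compose the two linear functions. The summary statistics below are invented.

```python
def chained_linear_equate(x, stats):
    """Chained linear equating under the NEAT design.
    Group 1 took Form X plus anchor V; Group 2 took Form Y plus anchor V."""
    # Step 1: linear function from X to V, estimated in Group 1
    v = stats["mu_v1"] + (stats["sd_v1"] / stats["sd_x1"]) * (x - stats["mu_x1"])
    # Step 2: linear function from V to Y, estimated in Group 2
    return stats["mu_y2"] + (stats["sd_y2"] / stats["sd_v2"]) * (v - stats["mu_v2"])

# Hypothetical summary statistics for the two nonequivalent groups
stats = dict(mu_x1=50.0, sd_x1=10.0, mu_v1=20.0, sd_v1=5.0,
             mu_y2=52.0, sd_y2=9.0, mu_v2=21.0, sd_v2=4.5)
y = chained_linear_equate(55.0, stats)
```

Other NEAT-design linear methods (e.g. Tucker) instead synthesize a single target population via regressions on the anchor; the papers above compare the assumptions behind those choices.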
Peer reviewed
Brennan, Robert L. – Measurement: Interdisciplinary Research and Perspectives, 2010
This excellent set of papers is comprehensive and very well written. The Kane et al. paper lays out the theory for linear equating with the NEAT design using a clever but simple framework. The Suh et al. paper is an excellent empirical study of the various methods. The Mroch et al. paper provides an insightful evaluation of the methods as…
Descriptors: Equated Scores, Evaluation Methods, Psychometrics, Models
Peer reviewed
Kane, Michael T.; Mroch, Andrew A.; Suh, Youngsuk; Ripkey, Douglas R. – Measurement: Interdisciplinary Research and Perspectives, 2010
This article presents the authors' rejoinder to commentaries on linear equating and the NEAT design. The authors appreciate the insightful work of the commentary writers. Each has made a number of interesting points, many of which the authors had not considered at all. Before responding to some of those points, the authors reiterate what they see…
Descriptors: Weighted Scores, Equated Scores, Models, Scores