Showing 1 to 15 of 69 results
Peer reviewed
Cheng, Ying; Chen, Peihua; Qian, Jiahe; Chang, Hua-Hua – Applied Psychological Measurement, 2013
Differential item functioning (DIF) analysis is an important step in the data analysis of large-scale testing programs. Nowadays, many such programs endorse matrix sampling designs to reduce the load on examinees, such as the balanced incomplete block (BIB) design. These designs pose challenges to the traditional DIF analysis methods. For example,…
Descriptors: Test Bias, Equated Scores, Test Items, Effect Size
Peer reviewed
DeMars, Christine E.; Jurich, Daniel P. – Applied Psychological Measurement, 2012
The nonequivalent groups anchor test (NEAT) design is often used to scale item parameters from two different test forms. A subset of items, called the anchor items or common items, are administered as part of both test forms. These items are used to adjust the item calibrations for any differences in the ability distributions of the groups taking…
Descriptors: Computer Software, Item Response Theory, Scaling, Equated Scores
Peer reviewed
Brossman, Bradley G.; Lee, Won-Chan – Applied Psychological Measurement, 2013
The purpose of this research was to develop observed score and true score equating procedures to be used in conjunction with the multidimensional item response theory (MIRT) framework. Three equating procedures--two observed score procedures and one true score procedure--were created and described in detail. One observed score procedure was…
Descriptors: Equated Scores, True Scores, Item Response Theory, Mathematics Tests
Peer reviewed
He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei – Applied Psychological Measurement, 2013
Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…
Descriptors: Regression (Statistics), Item Response Theory, Test Items, Equated Scores
Peer reviewed
van der Linden, Wim J.; Wiberg, Marie – Applied Psychological Measurement, 2010
For traditional methods of observed-score equating with anchor-test designs, such as chain and poststratification equating, it is difficult to satisfy the criteria of equity and population invariance. Their equatings are therefore likely to be biased. The bias in these methods was evaluated against a simple local equating method in which the…
Descriptors: Methods, Equated Scores, Test Items, Bias
Peer reviewed
Jurich, Daniel P.; DeMars, Christine E.; Goodman, Joshua T. – Applied Psychological Measurement, 2012
The prevalence of high-stakes test scores as a basis for significant decisions necessitates the dissemination of accurate and fair scores. However, the magnitude of these decisions has created an environment in which examinees may be prone to resort to cheating. To reduce the risk of cheating, multiple test forms are commonly administered. When…
Descriptors: High Stakes Tests, Scores, Prevention, Cheating
Peer reviewed
Wyse, Adam E.; Reckase, Mark D. – Applied Psychological Measurement, 2011
An essential concern in the application of any equating procedure is determining whether tests can be considered equated after the tests have been placed onto a common scale. This article clarifies one equating criterion, the first-order equity property of equating, and develops a new method for evaluating equating that is linked to this…
Descriptors: Lawyers, Licensing Examinations (Professions), Testing Programs, Graphs
Peer reviewed
Moses, Tim; Deng, Weiling; Zhang, Yu-Li – Applied Psychological Measurement, 2011
Nonequivalent groups with anchor test (NEAT) equating functions that use a single anchor can have accuracy problems when the groups are extremely different and/or when the anchor weakly correlates with the tests being equated. Proposals have been made to address these issues by incorporating more than one anchor into NEAT equating functions. These…
Descriptors: Equated Scores, Tests, Comparative Analysis, Correlation
Peer reviewed
Moses, Tim – Applied Psychological Measurement, 2009
This study compared the accuracies of nine previously proposed statistical significance tests for selecting identity, linear, and equipercentile equating functions in an equivalent groups equating design. The strategies included likelihood ratio tests for the loglinear models of tests' frequency distributions, regression tests, Kolmogorov-Smirnov…
Descriptors: Statistical Significance, Equated Scores, Comparative Analysis, Tests
Peer reviewed
Garcia-Perez, Miguel A.; Alcala-Quintana, Rocio; Garcia-Cueto, Eduardo – Applied Psychological Measurement, 2010
Current interest in measuring quality of life is generating interest in the construction of computerized adaptive tests (CATs) with Likert-type items. Calibration of an item bank for use in CAT requires collecting responses to a large number of candidate items. However, the number is usually too large to administer to each subject in the…
Descriptors: Comparative Analysis, Test Items, Equated Scores, Item Banks
Peer reviewed
Rijmen, Frank; Manalo, Jonathan R.; von Davier, Alina A. – Applied Psychological Measurement, 2009
This article describes two methods for obtaining the standard errors of two commonly used population invariance measures of equating functions: the root mean square difference of the subpopulation equating functions from the overall equating function and the root expected mean square difference. The delta method relies on an analytical…
Descriptors: Error of Measurement, Sampling, Equated Scores, Statistical Analysis
Peer reviewed
Han, Kyung T. – Applied Psychological Measurement, 2009
This article provides a brief description of a Windows application called IRTEQ. IRTEQ employs an intuitive, user-friendly graphic user interface that can rescale one test form to another by using various item response theory (IRT) scaling methods. It supports various IRT models for test forms. It can also equate test scores on the scale of one…
Descriptors: Item Response Theory, Scaling, True Scores, Equated Scores
Peer reviewed
Wang, Tianyou – Applied Psychological Measurement, 2008
Von Davier, Holland, and Thayer (2004) laid out a five-step framework of test equating that can be applied to various data collection designs and equating methods. In the continuization step, they presented an adjusted Gaussian kernel method that preserves the first two moments. This article proposes an alternative continuization method that…
Descriptors: Equated Scores, Models, Data Collection, Computation
Peer reviewed
Wang, Tianyou; Lee, Won-Chan; Brennan, Robert L.; Kolen, Michael J. – Applied Psychological Measurement, 2008
This article uses simulation to compare two test equating methods under the common-item nonequivalent groups design: the frequency estimation method and the chained equipercentile method. An item response theory model is used to define the true equating criterion, simulate group differences, and generate response data. Three linear equating…
Descriptors: Equated Scores, Item Response Theory, Simulation, Comparative Analysis
Peer reviewed
Dorans, Neil J.; Liu, Jinghua; Hammond, Shelby – Applied Psychological Measurement, 2008
This exploratory study was built on research spanning three decades. Petersen, Marco, and Stewart (1982) conducted a major empirical investigation of the efficacy of different equating methods. The studies reported in Dorans (1990) examined how different equating methods performed across samples selected in different ways. Recent population…
Descriptors: Test Format, Equated Scores, Sampling, Evaluation Methods