Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 3 |
Since 2006 (last 20 years) | 5 |
Descriptor
Difficulty Level | 8 |
Equated Scores | 8 |
Item Response Theory | 5 |
Test Items | 4 |
Error of Measurement | 3 |
Sample Size | 3 |
Computation | 2 |
Multiple Choice Tests | 2 |
Scaling | 2 |
Test Bias | 2 |
Test Construction | 2 |
Source
Applied Measurement in… | 8 |
Author
Antal, Judit | 1 |
Bjermo, Jonas | 1 |
Grabovsky, Irina | 1 |
Green, Donald Ross | 1 |
Haertel, Edward H. | 1 |
Jurich, Daniel | 1 |
Lee, Won-Chan | 1 |
Lim, Euijin | 1 |
Liu, Chunyan | 1 |
Melican, Gerald J. | 1 |
Michaelides, Michalis P. | 1 |
Publication Type
Journal Articles | 8 |
Reports - Research | 7 |
Reports - Evaluative | 1 |
Education Level
Grade 8 | 1 |
Junior High Schools | 1 |
Middle Schools | 1 |
Secondary Education | 1 |
Liu, Chunyan; Jurich, Daniel; Morrison, Carol; Grabovsky, Irina – Applied Measurement in Education, 2021
The existence of outliers in the anchor items can be detrimental to the estimation of examinee ability and undermine the validity of score interpretation across forms. However, in practice, anchor item performance can become distorted due to various reasons. This study compares the performance of modified "INFIT" and "OUTFIT"…
Descriptors: Equated Scores, Test Items, Item Response Theory, Difficulty Level
Bjermo, Jonas; Miller, Frank – Applied Measurement in Education, 2021
In recent years, the interest in measuring growth in student ability in various subjects between different grades in school has increased. Therefore, good precision in the estimated growth is of importance. This paper aims to compare estimation methods and test designs when it comes to precision and bias of the estimated growth of mean ability…
Descriptors: Scaling, Ability, Computation, Test Items
Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020
The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…
Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level
Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014
In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…
Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference

Green, Donald Ross; And Others – Applied Measurement in Education, 1989
Potential benefits of using item response theory in test construction are evaluated using the experience and evidence accumulated during nine years of using a three-parameter model in the development of major achievement batteries. Topics addressed include error of measurement, test equating, item bias, and item difficulty. (TJH)
Descriptors: Achievement Tests, Computer Assisted Testing, Difficulty Level, Equated Scores

Wang, Xiang-bo; And Others – Applied Measurement in Education, 1995
An experiment is reported in which 225 high school students were asked to choose among several multiple-choice items but then were required to answer them all. It is concluded that allowing choice while having fair tests is only possible when choice is irrelevant in terms of difficulty. (SLD)
Descriptors: Adaptive Testing, Difficulty Level, Equated Scores, High School Students

Norcini, John; And Others – Applied Measurement in Education, 1994
Whether anchor item sets varying in difficulty and discrimination affect precision of cutting score equivalents generated through judge rescaling as much as equivalents from score equating was studied with 4 groups of experts and 250 and 1,000 examinees. Results indicate the robustness of judge rescaling and its superiority over equating. (SLD)
Descriptors: Cutting Scores, Decision Making, Difficulty Level, Equated Scores