von Davier, Alina A.; Wilson, Christine – Applied Psychological Measurement, 2008
Dorans and Holland (2000) and von Davier, Holland, and Thayer (2003) introduced measures of the degree to which an observed-score equating function is sensitive to the population on which it is computed. This article extends the findings of Dorans and Holland and of von Davier et al. to item response theory (IRT) true-score equating methods that…
Descriptors: Advanced Placement, Advanced Placement Programs, Equated Scores, Calculus

Kim, Jee-Seon; Hanson, Bradley A. – Applied Psychological Measurement, 2002
Presents a characteristic curve procedure for comparing transformations of the item response theory ability scale assuming the multiple-choice model. Illustrates the use of the method with an example equating American College Testing mathematics tests. (SLD)
Descriptors: Ability, Equated Scores, Item Response Theory, Mathematics Tests

Quenette, Mary A.; Nicewander, W. Alan; Thomasson, Gary L. – Applied Psychological Measurement, 2006
Model-based equating was compared to empirical equating of an Armed Services Vocational Aptitude Battery (ASVAB) test form. The model-based equating was done using item pretest data to derive item response theory (IRT) item parameter estimates for those items that were retained in the final version of the test. The analysis of an ASVAB test form…
Descriptors: Item Response Theory, Multiple Choice Tests, Test Items, Computation

Wilson, Mark; Wang, Wen-chung – Applied Psychological Measurement, 1995
Data from the California Learning Assessment System mathematics assessment were used to examine issues that arise when scores from different assessment modes are combined. Multiple-choice, open-ended, and investigation items were combined in a test across three test forms. Results illustrate the difficulties faced in evaluating combined…
Descriptors: Educational Assessment, Equated Scores, Evaluation Methods, Item Response Theory

Hsu, Louis M. – Applied Psychological Measurement, 1979
A comparison of the relative ordering power of separate-item and grouped-item true-false tests indicated that neither type of test was uniformly superior to the other across all levels of examinee knowledge. Grouped-item tests were found superior for examinees with low levels of knowledge. (Author/CTM)
Descriptors: Academic Ability, Knowledge Level, Multiple Choice Tests, Scores

Birenbaum, Menucha; And Others – Applied Psychological Measurement, 1992
The effect of multiple-choice (MC) or open-ended (OE) response format on diagnostic assessment of algebra test performance was investigated with 231 eighth and ninth graders in Tel Aviv (Israel) using bug or rule space analysis. Both analyses indicated closer similarity between parallel OE subsets than between stem-equivalent OE and MC subsets.…
Descriptors: Algebra, Comparative Testing, Educational Assessment, Educational Diagnosis

Budescu, David V. – Applied Psychological Measurement, 1988
A multiple matching test--a 24-item Hebrew vocabulary test--was examined, in which distractors from several items are pooled into one list at the test's end. Construction of such tests was feasible. Reliability, validity, and reduction of random guessing were satisfactory when applied to data from 717 applicants to Israeli universities. (SLD)
Descriptors: College Applicants, Feasibility Studies, Foreign Countries, Guessing (Tests)