Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 4
Since 2006 (last 20 years): 6
Descriptor
Difficulty Level: 17
Simulation: 17
Test Construction: 17
Test Items: 13
Item Response Theory: 7
Adaptive Testing: 5
Computer Assisted Testing: 5
Psychometrics: 5
Ability: 4
Equated Scores: 4
Item Analysis: 4
Source
Applied Measurement in…: 2
Journal of Educational…: 2
ETS Research Report Series: 1
Journal of Psychoeducational…: 1
Author
Schnipke, Deborah L.: 3
Reese, Lynda M.: 2
Weiss, David J.: 2
Antal, Judit: 1
Berger, Martijn P. F.: 1
Berger, Stéphanie: 1
Betz, Nancy E.: 1
Clauser, Brian E.: 1
Cook, Linda L.: 1
Curry, Allen R.: 1
Eggen, Theo J. H. M.: 1
Publication Type
Reports - Research: 13
Journal Articles: 6
Reports - Evaluative: 5
Speeches/Meeting Papers: 3
Assessments and Surveys
Test of English as a Foreign…: 1
Berger, Stéphanie; Verschoor, Angela J.; Eggen, Theo J. H. M.; Moser, Urs – Journal of Educational Measurement, 2019
Calibration of an item bank for computer adaptive testing requires substantial resources. In this study, we investigated whether the efficiency of calibration under the Rasch model could be enhanced by improving the match between item difficulty and student ability. We introduced targeted multistage calibration designs, a design type that…
Descriptors: Simulation, Computer Assisted Testing, Test Items, Difficulty Level
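The intuition behind targeting item difficulty to student ability can be sketched under the Rasch model: an item contributes the most Fisher information to calibration when its difficulty b matches the examinee's ability theta (p = 0.5). This is a minimal illustrative sketch, not the authors' actual calibration design; all numbers are made up.

```python
import numpy as np

def rasch_prob(theta, b):
    """P(correct) under the Rasch model: logistic in (theta - b)."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

# For a Rasch item, Fisher information at theta is p * (1 - p),
# which peaks at 0.25 when item difficulty equals ability.
theta = 0.0
for b in (-2.0, 0.0, 2.0):
    p = rasch_prob(theta, b)
    info = p * (1 - p)
    print(f"b={b:+.1f}  p={p:.3f}  info={info:.3f}")
```

Off-target items (|theta - b| large) yield near-certain responses that carry little information, which is why matching difficulty to ability makes calibration more efficient.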
Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick – ETS Research Report Series, 2018
For a multiple-choice test under development or redesign, it is important to choose the optimal number of options per item so that the test possesses the desired psychometric properties. On the basis of available data for a multiple-choice assessment with 8 options, we evaluated the effects of changing the number of options on test properties…
Descriptors: Multiple Choice Tests, Test Items, Simulation, Test Construction
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
Fitzpatrick, Joseph; Skorupski, William P. – Journal of Educational Measurement, 2016
The equating performance of two internal anchor test structures--miditests and minitests--is studied for four IRT equating methods using simulated data. Originally proposed by Sinharay and Holland, miditests are anchors that have the same mean difficulty as the overall test but less variance in item difficulties. Four popular IRT equating methods…
Descriptors: Difficulty Level, Test Items, Comparative Analysis, Test Construction
Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014
In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…
Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory
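In common-item (anchor) equating, difficulty estimates from the new form must be placed on the old form's scale. A standard linear method for this is mean-sigma linking; the sketch below uses invented anchor-item difficulties purely for illustration and is not drawn from the study above.

```python
import numpy as np

# Hypothetical anchor-item difficulty estimates from two calibrations.
b_new = np.array([-1.2, -0.4, 0.1, 0.8, 1.5])
b_old = np.array([-1.0, -0.2, 0.3, 1.0, 1.7])

# Mean-sigma linking: choose slope A and intercept B so the linked
# new-form estimates match the old scale's mean and spread.
A = b_old.std() / b_new.std()
B = b_old.mean() - A * b_new.mean()
b_linked = A * b_new + B
print(A, B)
```

Here the two sets of estimates differ by a constant shift, so the linking recovers A = 1 and B = 0.2; with real data the slope also corrects for differences in the spread of anchor difficulties, which is exactly the property the miditest proposal relaxes.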
Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009
In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…
Descriptors: Test Items, Test Content, Testing Programs, Simulation
Clauser, Brian E.; And Others – 1991
Item bias has been a major concern for test developers during recent years. The Mantel-Haenszel statistic has been among the preferred methods for identifying biased items. The statistic's performance in identifying uniform bias in simulated data modeled by producing various levels of difference in the (item difficulty) b-parameter for reference…
Descriptors: Comparative Testing, Difficulty Level, Item Bias, Item Response Theory
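The Mantel-Haenszel approach stratifies examinees by matched score level and pools 2x2 tables (group by right/wrong) into a common odds ratio; a value far from 1 flags potential uniform bias. A minimal sketch with invented counts (not the study's data):

```python
import numpy as np

def mantel_haenszel_alpha(tables):
    """MH common odds ratio across score strata.

    tables: list of 2x2 arrays [[ref_right, ref_wrong],
                                [foc_right, foc_wrong]].
    """
    num = sum(t[0, 0] * t[1, 1] / t.sum() for t in tables)
    den = sum(t[0, 1] * t[1, 0] / t.sum() for t in tables)
    return num / den

# Hypothetical strata: the reference group outperforms the focal
# group at matched score levels, suggesting uniform DIF on the item.
strata = [np.array([[30.0, 10.0], [20.0, 20.0]]),
          np.array([[40.0,  5.0], [30.0, 15.0]])]
alpha = mantel_haenszel_alpha(strata)
print(alpha)
```

An alpha of 1 indicates no differential functioning; operational programs usually transform alpha to the MH D-DIF delta metric before classifying items.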
Schnipke, Deborah L. – 2002
A common practice in some certification fields (e.g., information technology) is to draw items from an item pool randomly and apply a common passing score, regardless of the items administered. Because these tests are commonly used, it is important to determine how accurate the pass/fail decisions are for such tests and whether fairly small,…
Descriptors: Decision Making, Difficulty Level, Item Banks, Pass Fail Grading
Reese, Lynda M.; Schnipke, Deborah L. – 1999
A two-stage design provides a way of roughly adapting item difficulty to test-taker ability. All test takers take a parallel stage-one test, and based on their scores, they are routed to tests of different difficulty levels in the second stage. This design provides some of the benefits of standard computer adaptive testing (CAT), such as increased…
Descriptors: Ability, Adaptive Testing, Computer Assisted Testing, Difficulty Level
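The routing step of a two-stage design reduces to a simple rule: score the common stage-one test, then assign each test taker to a second-stage form of appropriate difficulty. The cut scores and form labels below are hypothetical, not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(1)

def route(stage1_score, cut_low=8, cut_high=12):
    """Route a test taker to an easy, medium, or hard stage-two form
    based on the stage-one number-correct score (hypothetical cuts)."""
    if stage1_score < cut_low:
        return "easy"
    if stage1_score > cut_high:
        return "hard"
    return "medium"

# Simulate number-correct scores on a 20-item stage-one test and
# route each test taker to a stage-two form.
scores = rng.binomial(n=20, p=0.5, size=5)
forms = [route(int(s)) for s in scores]
print(list(zip(scores.tolist(), forms)))
```

Because routing uses only one observed score, misrouting near the cut points is the main cost relative to fully adaptive item-by-item selection.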
Schnipke, Deborah L.; Reese, Lynda M. – 1997
Two-stage and multistage test designs provide a way of roughly adapting item difficulty to test-taker ability. All test takers take a parallel stage-one test, and, based on their scores, they are routed to tests of different difficulty levels in subsequent stages. These designs provide some of the benefits of standard computerized adaptive testing…
Descriptors: Ability, Adaptive Testing, Algorithms, Comparative Analysis
Veerkamp, Wim J. J.; Berger, Martijn P. F. – 1994
Items with the highest discrimination parameter values in a logistic item response theory (IRT) model do not necessarily give maximum information. This paper shows which discrimination parameter values (as a function of the guessing parameter and the distance between person ability and item difficulty) give maximum information for the…
Descriptors: Ability, Adaptive Testing, Algorithms, Computer Assisted Testing
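The point that the most discriminating item is not always the most informative can be checked numerically with the standard 3PL information function, I(theta) = a^2 (q/p) ((p - c)/(1 - c))^2. The parameter values below are illustrative, not from the paper.

```python
import numpy as np

def info_3pl(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta."""
    p = c + (1 - c) / (1 + np.exp(-a * (theta - b)))
    q = 1 - p
    return a**2 * (q / p) * ((p - c) / (1 - c)) ** 2

# With guessing (c > 0) and ability below the item difficulty, a
# moderately discriminating item can beat a highly discriminating one.
theta, b, c = -1.0, 0.0, 0.25
print(info_3pl(theta, a=1.0, b=b, c=c))
print(info_3pl(theta, a=2.5, b=b, c=c))
```

The high-a item's response curve is so steep that, below its difficulty, responses are dominated by guessing and carry little information, which is the interaction between a, c, and theta - b that the paper characterizes.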
Reckase, Mark D. – 1981
One of the major assumptions of latent trait theory is that the items in a test measure a single dimension. This report describes an investigation of procedures for forming a set of items that meet this assumption. Factor analysis, nonmetric multidimensional scaling, cluster analysis and latent trait analysis were applied to simulated and real…
Descriptors: Cluster Analysis, Difficulty Level, Factor Analysis, Guessing (Tests)
Hambleton, Ronald K.; Cook, Linda L. – 1978
The purpose of the present research was to study, systematically, the "goodness-of-fit" of the one-, two-, and three-parameter logistic models. We studied, using computer-simulated test data, the effects of four variables: variation in item discrimination parameters, the average value of the pseudo-chance level parameters, test length,…
Descriptors: Career Development, Difficulty Level, Goodness of Fit, Item Analysis
Betz, Nancy E.; Weiss, David J. – 1974
Monte Carlo simulation procedures were used to study the psychometric characteristics of two two-stage adaptive tests and a conventional "peaked" ability test. Results showed that scores yielded by both two-stage tests better reflected the normal distribution of underlying ability. Ability estimates yielded by one of the two-stage tests…
Descriptors: Ability, Academic Ability, Adaptive Testing, Computers
Curry, Allen R.; And Others – 1978
The efficacy of employing subsets of items from a calibrated item pool to estimate the Rasch model person parameters was investigated. Specifically, the degree of invariance of Rasch model ability-parameter estimates was examined across differing collections of simulated items. The ability-parameter estimates were obtained from a simulation of…
Descriptors: Career Development, Difficulty Level, Equated Scores, Error of Measurement