Burton, Richard F. – Assessment & Evaluation in Higher Education, 2006
Many academic tests (e.g. short-answer and multiple-choice) sample required knowledge with questions scoring 0 or 1 (dichotomous scoring). Few textbooks give useful guidance on the length of test needed to do this reliably. Posey's binomial error model of 1932 provides the best starting point, but allows neither for heterogeneity of question…
Descriptors: Item Sampling, Tests, Test Length, Test Reliability
Harris, Dickie A.; Penell, Roger J. – 1977
This study used a series of simulations to answer questions about the efficacy of adaptive testing raised by empirical studies. The first study showed that for reasonably high entry points, parameters estimated from paper-and-pencil test protocols cross-validated remarkably well to groups actually tested at a computer terminal. This suggested that…
Descriptors: Adaptive Testing, Computer Assisted Testing, Cost Effectiveness, Difficulty Level
Hambleton, Ronald K.; Cook, Linda L. – 1978
The purpose of the present research was to study, systematically, the "goodness-of-fit" of the one-, two-, and three-parameter logistic models. We studied, using computer-simulated test data, the effects of four variables: variation in item discrimination parameters, the average value of the pseudo-chance level parameters, test length,…
Descriptors: Career Development, Difficulty Level, Goodness of Fit, Item Analysis

Plake, Barbara S.; Melican, Gerald J. – Educational and Psychological Measurement, 1989
The impact of overall test length and difficulty on expert judgments of item performance made with the Nedelsky method was studied. Five university-level instructors predicting the performance of minimally competent candidates on a mathematics examination were fairly consistent in their assessments regardless of length or difficulty of the test.…
Descriptors: Difficulty Level, Estimation (Mathematics), Evaluators, Higher Education