Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 1
Since 2016 (last 10 years): 2
Since 2006 (last 20 years): 3
Descriptor
Item Sampling: 10
Test Items: 10
Test Reliability: 10
Test Construction: 6
Test Validity: 4
Item Analysis: 3
Testing Problems: 3
Achievement Tests: 2
Criterion Referenced Tests: 2
Difficulty Level: 2
Evaluation Criteria: 2
Source
Assessment & Evaluation in… | 1 |
College Student Journal | 1 |
Journal of Educational… | 1 |
Physical Review Physics… | 1 |
Practical Assessment,… | 1 |
Publication Type
Reports - Research: 7
Journal Articles: 5
Reports - Evaluative: 3
Speeches/Meeting Papers: 3
Education Level
Higher Education: 2
Postsecondary Education: 1
Audience
Researchers: 1
Location
Bosnia and Herzegovina: 1
Croatia: 1
Netherlands: 1
Slovenia: 1
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Glamocic, Džana Salibašic; Mešic, Vanes; Neumann, Knut; Sušac, Ana; Boone, William J.; Aviani, Ivica; Hasovic, Elvedin; Erceg, Nataša; Repnik, Robert; Grubelnik, Vladimir – Physical Review Physics Education Research, 2021
Item banks are generally considered the basis of a new generation of educational measurement. In combination with specialized software, they can facilitate the computerized assembling of multiple pre-equated test forms. However, for the advantages of item banks to become fully realized, it is important that the item banks store a relatively large…
Descriptors: Item Banks, Test Items, Item Response Theory, Item Sampling
Bashkov, Bozhidar M.; Clauser, Jerome C. – Practical Assessment, Research & Evaluation, 2019
Successful testing programs rely on high-quality test items to produce reliable scores and defensible exams. However, determining what statistical screening criteria are most appropriate to support these goals can be daunting. This study describes and demonstrates cost-benefit analysis as an empirical approach to determining appropriate screening…
Descriptors: Test Items, Test Reliability, Evaluation Criteria, Accuracy
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2006
Many academic tests (e.g. short-answer and multiple-choice) sample required knowledge with questions scoring 0 or 1 (dichotomous scoring). Few textbooks give useful guidance on the length of test needed to do this reliably. Posey's binomial error model of 1932 provides the best starting point, but allows neither for heterogeneity of question…
Descriptors: Item Sampling, Tests, Test Length, Test Reliability
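
As a rough illustration of the binomial reasoning Burton starts from (a simple binomial error model in the spirit of Posey's, not Burton's refined treatment, which must also handle the question heterogeneity the abstract mentions), the sketch below computes the standard error of a proportion-correct score on an n-item dichotomously scored test and the test length needed to reach a target standard error. The knowledge level p and the target values are assumed purely for illustration.

    import math

    def binomial_sem(p: float, n_items: int) -> float:
        """Standard error of the proportion-correct score when each of the
        n_items is an independent 0/1 sample of a domain the examinee knows
        with probability p (simple binomial error model)."""
        return math.sqrt(p * (1.0 - p) / n_items)

    def items_needed(p: float, target_sem: float) -> int:
        """Smallest test length whose binomial SEM does not exceed target_sem."""
        return math.ceil(p * (1.0 - p) / target_sem ** 2)

    # Illustrative values (assumed, not from the article): an examinee who
    # knows 70% of the domain, tested with 40 dichotomously scored items.
    print(binomial_sem(0.7, 40))    # about 0.072
    print(items_needed(0.7, 0.05))  # 84 items for an SEM of 0.05 or less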

Taylor, Annette Kujawski – College Student Journal, 2005
This research examined 2 elements of multiple-choice test construction: balancing the key and the optimal number of options. In Experiment 1 the 3 conditions included a balanced key, overrepresentation of a and b responses, and overrepresentation of c and d responses. The results showed that error patterns were independent of the key, reflecting…
Descriptors: Comparative Analysis, Test Items, Multiple Choice Tests, Test Construction
Linn, Robert – 1978
A series of studies on conceptual and design problems in competency-based measurement is described. The concept of validity within the context of criterion-referenced measurement is reviewed. The authors believe validation should be viewed as a process rather than an end product. It is the process of marshalling evidence to support…
Descriptors: Criterion Referenced Tests, Item Analysis, Item Sampling, Test Bias

Askegaard, Lewis D.; Umila, Benwardo V. – Journal of Educational Measurement, 1982
Multiple matrix sampling of items and examinees was applied to an 18-item rank order instrument administered to a randomly assigned group and compared to the ordering and ranking of all items by control subjects. High correlations between ranks suggest the methodology may viably reduce respondent effort on long rank ordering tasks. (Author/CM)
Descriptors: Evaluation Methods, Item Sampling, Junior High Schools, Student Reaction
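
For readers unfamiliar with the design Askegaard and Umila apply, the sketch below shows one common form of multiple matrix sampling: the item pool is randomly partitioned into non-overlapping subsets (forms) and each examinee is randomly assigned exactly one form. The 18 items echo the abstract; the number of forms and examinees are assumptions made only for this example.

    import random

    def matrix_sample(item_ids, examinee_ids, n_forms, seed=0):
        """Randomly partition the items into n_forms non-overlapping forms
        and randomly assign each examinee exactly one form index."""
        rng = random.Random(seed)
        items = list(item_ids)
        rng.shuffle(items)
        forms = [items[i::n_forms] for i in range(n_forms)]  # near-equal split
        assignment = {e: rng.randrange(n_forms) for e in examinee_ids}
        return forms, assignment

    # Illustrative run: 18 items (as in the abstract) split into 3 forms
    # for 30 hypothetical examinees S1..S30.
    forms, assignment = matrix_sample(range(1, 19), [f"S{i}" for i in range(1, 31)], 3)
    print(forms[0])           # the 6 item ids placed on form 0
    print(assignment["S1"])   # form index assigned to examinee S1
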
Gifford, Janice A.; Hambleton, Ronald K. – 1980
Technical considerations associated with item selection and reliability assessment are discussed in relation to criterion-referenced tests constructed to provide group information. The purpose is to emphasize test building and the evaluation of test scores in program evaluation studies. It is stressed that an evaluator employ a performance or…
Descriptors: Criterion Referenced Tests, Group Testing, Item Sampling, Models
Forster, Fred – 1987
Studies carried out over a 12-year period addressed fundamental questions on the use of Rasch-based item banks. Large field tests of reading, mathematics, and science items administered in grades 3-8, as well as standardized test results, were used to explore the possible effects of many factors on item calibrations. In general, the results…
Descriptors: Achievement Tests, Difficulty Level, Elementary Education, Item Analysis
Wilcox, Rand R. – 1979
Mastery tests are analyzed in terms of the number of skills to be mastered and the number of items per skill, so that correct decisions of mastery or nonmastery will be made with a desired probability. It is assumed that a random sample of skills will be selected for measurement, that each skill will be measured by the same number of…
Descriptors: Achievement Tests, Cutting Scores, Decision Making, Equivalency Tests
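
To make the decision-accuracy question in Wilcox's abstract concrete, here is a generic binomial calculation (not Wilcox's specific procedure): if each sampled skill is measured by m items and mastery of a skill is declared when at least c of them are answered correctly, the probability of that classification follows from the binomial distribution. The values of m, c, and the per-item success probabilities are assumed for illustration only.

    from math import comb

    def prob_declared_master(p: float, m: int, c: int) -> float:
        """P(at least c of m items correct) when each item is answered
        correctly with probability p, i.e. the chance a skill is classified
        as mastered under an "at least c correct" rule."""
        return sum(comb(m, x) * p**x * (1 - p)**(m - x) for x in range(c, m + 1))

    # Illustrative values (assumed): 5 items per skill, cutoff of 4 correct.
    print(prob_declared_master(0.9, 5, 4))  # a true master (p = 0.9): ~0.919
    print(prob_declared_master(0.5, 5, 4))  # a true nonmaster (p = 0.5): ~0.188
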
de Jong, John H. A. L. – 1982
The development and validation of a test of listening comprehension for English as a second language at the Dutch National Institute for Educational Measurement (Cito) is described. The test uses two distinct item formats: true-false items and modified cloze items with two options. Both item formats were found to measure foreign language listening…
Descriptors: Cloze Procedure, English (Second Language), Evaluation Criteria, Foreign Countries