Süleyman Demir; Derya Çobanoglu Aktan; Nese Güler – International Journal of Assessment Tools in Education, 2023
This study has two main purposes. Firstly, to compare the different item selection methods and stopping rules used in Computerized Adaptive Testing (CAT) applications with simulative data generated based on the item parameters of the Vocational Maturity Scale. Secondly, to test the validity of CAT application scores. For the first purpose,…
Descriptors: Computer Assisted Testing, Adaptive Testing, Vocational Maturity, Measures (Individuals)

Hartke, Alan R. – Journal of Educational Measurement, 1978
Latent partition analysis is shown to be useful in determining the conceptual homogeneity of an item population. Such item populations are useful for mastery testing. Applications of latent partition analysis in assessing content validity are suggested. (Author/JKS)
Descriptors: Higher Education, Item Analysis, Item Sampling, Mastery Tests

Hoste, R. – British Journal of Educational Psychology, 1981
In this paper, a proposal is made by which a content validity coefficient can be calculated. An example of the use of the coefficient is given, demonstrating that different question combinations in a CSE biology examination in which a choice of questions was given gave different levels of content validity. (Author)
Descriptors: Achievement Tests, Biology, Content Analysis, Item Sampling
Linn, Robert – 1978
A series of studies on conceptual and design problems in competency-based measurements are explained. The concept of validity within the context of criterion-referenced measurement is reviewed. The authors believe validation should be viewed as a process rather than an end product. It is the process of marshalling evidence to support…
Descriptors: Criterion Referenced Tests, Item Analysis, Item Sampling, Test Bias
Graham, Darol L. – 1974
The adequacy of a test developed for statewide assessment of basic mathematics skills was investigated. The test, comprised of multiple-choice items reflecting a series of behavioral objectives, was compared with a more extensive criterion measure generated from the same objectives by the application of a strict item sampling model. In many…
Descriptors: Comparative Testing, Criterion Referenced Tests, Educational Assessment, Item Sampling
Berk, Ronald A. – 1978
Sixteen item statistics recommended for use in the development of criterion-referenced tests were evaluated. There were two major criteria: (1) practicability in terms of ease of computation and interpretation and (2) meaningfulness in the context of the development process. Most of the statistics were based on a comparison of performance changes…
Descriptors: Achievement Tests, Criterion Referenced Tests, Difficulty Level, Guides
Gifford, Janice A.; Hambleton, Ronald K. – 1980
Technical considerations associated with item selection and reliability assessment are considered in relation to criterion-referenced tests constructed to provide group information. The purpose is to emphasize test building and the evaluation of test scores in program evaluation studies. It is stressed that an evaluator employ a performance or…
Descriptors: Criterion Referenced Tests, Group Testing, Item Sampling, Models
Theunissen, Phiel J. J. M. – 1983
Any systematic approach to the assessment of students' ability implies the use of a model. The more explicit the model is, the more its users know about what they are doing and what the consequences are. The Rasch model is a strong model where measurement is a bonus of the model itself. It is based on four ideas: (1) separation of observable…
Descriptors: Ability Grouping, Difficulty Level, Evaluation Criteria, Item Sampling
Forster, Fred – 1987
Studies carried out over a 12-year period addressed fundamental questions on the use of Rasch-based item banks. Large field tests of reading, mathematics, and science items administered in grades 3-8, as well as standardized test results, were used to explore the possible effects of many factors on item calibrations. In general, the results…
Descriptors: Achievement Tests, Difficulty Level, Elementary Education, Item Analysis
Boyd, Thomas A.; Tramontana, Michael G. – 1984
To examine the validity of short forms of the Wechsler Intelligence Scale for Children-Revised (WISC-R), the WISC-R was first administered to 106 hospitalized psychiatric patients, aged 8-16. No subjects had a primary diagnosis of mental retardation or learning disability, and one-third were receiving psychotropic medication. WISC-R IQ scores…
Descriptors: Adolescents, Children, Correlation, Elementary Secondary Education
Mason, Victor W. – 1986
Reading skills are crucial to students learning and using English as a second language for academic purposes. Teachers can construct valid reading tests if they approach the task with care and focus on the test's ability to measure construct rather than face validity. In reading tests, the crucial elements of test design affecting validity are (1)…
Descriptors: Communicative Competence (Languages), English for Academic Purposes, English (Second Language), Higher Education
de Jong, John H. A. L. – 1982
The development and validation of a test of listening comprehension for English as a second language at the Dutch National Institute for Educational Measurement (Cito) is described. The test uses two distinct item formats: true-false items and modified cloze items with two options. Both item formats were found to measure foreign language listening…
Descriptors: Cloze Procedure, English (Second Language), Evaluation Criteria, Foreign Countries