Hambleton, Ronald K.; Simon, Robert A. – 1980
The subject of constructing criterion-referenced tests is often researched, but many technical problems remain to be satisfactorily resolved. Foremost, criterion-referenced test developers need a comprehensive set of steps for construction. In this paper, 14 logical steps for building criterion-referenced tests that refer to several different…
Descriptors: Criterion Referenced Tests, Cutting Scores, Guidelines, Scoring

Jeffrey, I. M. W.; Grieve, A. R. – Medical Teacher, 1987
The format of the final examinations in Conservative Dentistry in the Dental Schools of Great Britain and Ireland was investigated by means of a questionnaire sent to each of the dental schools in each country. Results are reported and the value of different parts of the examinations is discussed. (Author/RH)
Descriptors: Dentistry, Foreign Countries, Higher Education, Medical Education

Price, James H.; And Others – Journal of School Health, 1985
This study examined the validity and reliability of a short obesity knowledge scale. A 12-item test was developed covering etiology of obesity, diseases related to obesity, weight loss techniques, and general information on obesity. Four test formats were compared, revealing that the scale needs further validation. (Author/MT)
Descriptors: Dietetics, Health Education, Higher Education, Norm Referenced Tests

Van Gendt, Kitty; Verhagen, Plon – 2001
An experiment was conducted to investigate the influence of the variables "realism" and "context" on the performance of biology students on a visual test about the anatomy of a rat. The instruction was primarily visual with additional verbal information like Latin names and practical information about the learning task: dissecting a rat to gain…
Descriptors: Anatomy, Biology, Context Clues, Guidelines

Wang, Shudong; Wang, Ning; Hoadley, David – 2003
This study examined the comparability of scores on the National Nurses Aides Assessment Program (NNAAP) test across language and administration condition groups for calibration and validation samples that were randomly drawn from the same population. A sample of 20,568 candidate responses to 1 test form was used. This examination is given in…
Descriptors: Audio Equipment, Certification, Construct Validity, English

Koretz, Daniel; Hamilton, Laura – 1999
An earlier study (D. Koretz, 1997) found that Kentucky had been unusually successful in testing most students with disabilities, but it also found numerous signs of poor measurement, including differential item functioning (DIF) in mathematics, apparently excessive use of accommodations, and implausibly high mean scores for some groups of students…
Descriptors: Disabilities, Elementary Secondary Education, Item Bias, Scores

Habick, Timothy – 1999
With the advent of computer-based testing (CBT) and the need to increase the number of items available in computer adaptive test pools, the idea of item variants was conceived. An item variant can be defined as an item with content based on an existing item to a greater or lesser degree. Item variants were first proposed as a way to enhance test…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Test Construction

Pommerich, Mary; Nicewander, W. Alan – 1998
A simulation study was performed to determine whether a group's average percent correct in a content domain could be accurately estimated for groups taking a single test form and not the entire domain of items. Six Item Response Theory (IRT)-based domain score estimation methods were evaluated, under conditions of few items per content area per…
Descriptors: Ability, Estimation (Mathematics), Groups, Item Response Theory

Adler, Nurit; Guttman, Ruth – Educational and Psychological Measurement, 1982
Thirteen ability tests were administered as defined within a mapping sentence containing four content facets: rule type, expression mode, language of communication and dimensionality of portrayed object. Smallest Space Analysis of intercorrelations among test scores showed the radex structure of the two-dimensional space conformed to the…
Descriptors: Content Analysis, Factor Structure, Intelligence Tests, Scores

Simon, Alan J.; Joiner, Lee M. – Journal of Educational Measurement, 1976
The purpose of this study was to determine whether a Mexican version of the Peabody Picture Vocabulary Test could be improved by directly translating both forms of the American test, then using decision procedures to select the better item of each pair. The reliability of the simple translations suffered. (Author/BW)
Descriptors: Early Childhood Education, Spanish, Test Construction, Test Format

Bardo, J. W.; Yeager, S. J. – Perceptual and Motor Skills, 1982
In examining response style effects on various commonly used fixed-response formats, Likert-type formats were relatively consistently affected regardless of the number of format categories. Nonanchored numbers were less affected. Across types, strong correlations for the linear formats and human faces made their use problematic. (Author/CM)
Descriptors: Higher Education, Objective Tests, Response Style (Tests), Student Reaction

Schriesheim, Chester A. – Educational and Psychological Measurement, 1981
Effects of item presentation mode on degree of leniency bias in responses to field research questionnaires were studied. Two modes were examined: first with items measuring the same dimensions grouped together and second with such items distributed randomly. The random mode showed substantially less leniency response bias. (Author/BW)
Descriptors: Adults, Leadership Qualities, Questionnaires, Response Style (Tests)

Silverstein, A. B. – Journal of Consulting and Clinical Psychology, 1982
Assessed the validity of short forms that reduce the number of items within subtests rather than the number of subtests. Used data from the standardization samples for the Wechsler Intelligence Scale for Children, Wechsler Adult Intelligence Scale, Wechsler Preschool and Primary Scale of Intelligence, WISC-Revised, and WAIS-Revised. (Author)
Descriptors: Correlation, Intelligence Tests, Mathematical Formulas, Test Format

Masters, Geoff N. – Psychometrika, 1982
An extension of the Rasch model for partial credit scoring of test items is presented. An unconditional maximum likelihood procedure for estimating the model parameters is developed. The relationship of this model to Andrich's Rating Scale model and Samejima's Graded Response model is discussed. (Author/JKS)
Descriptors: Item Analysis, Latent Trait Theory, Maximum Likelihood Statistics, Measurement Techniques

Schriesheim, Chester A.; Hill, Kenneth D. – Educational and Psychological Measurement, 1981
The empirical evidence does not support the prevailing conventional wisdom that it is advisable to mix positively and negatively worded items in psychological measures to counteract acquiescence response bias. An experiment, evaluating subjects' ability to respond accurately to both positive and reversed items on a questionnaire, analyzed post-hoc…
Descriptors: Bias, Higher Education, Questionnaires, Response Style (Tests)