Munby, Hugh – Journal of Research in Science Teaching, 1983
Assesses the Scientific Attitude Inventory (SAI) by examining the instrument and 30 studies in which it has been used. In addition, shows how conceptual analysis may be used to investigate the validity of a research instrument. Indicates that the SAI needs reworking because it is less than certain what the instrument measures. (Author/JN)
Descriptors: Attitude Measures, College Science, Elementary School Science, Elementary Secondary Education

Summers, Edward G.; Lukasevich, Ann – Reading Research Quarterly, 1983
Used paired comparison to construct a reading preference inventory based on 14 reading themes drawn from previous research. Tested it using intermediate-grade children in three Canadian communities. (FL)
Descriptors: Developmental Stages, Differences, Females, Foreign Countries

Marsh, Herbert W. – American Educational Research Journal, 1982
Differences in ratings of the same instructor teaching the same course on two different occasions are related to differences in background characteristics: (1) higher levels of workload/difficulty, (2) higher expected grades, and (3) instructor experience. (PN)
Descriptors: Analysis of Variance, Difficulty Level, Educational Environment, Higher Education

Duffelmeyer, Frederick A. – Journal of Reading, 1983
Looks at cloze scores for students at two different schooling stages--grade 5 and grade 10--on materials at their own grade level. Concludes that the cloze score of approximately 40 percent currently used to judge reading ability at all grade levels is not stringent enough for higher grade levels. (FL)
Descriptors: Cloze Procedure, Elementary Secondary Education, Grade 10, Grade 5

Lang, Harry G. – Journal of Research in Science Teaching, 1982
Reliability, validity, and standards-setting procedure for a criterion-referenced test (Test of Metric Skills) were examined for use in science curricula. Results indicate a number of factors influencing test reliability/validity and that science teachers need to be aware of these factors to enhance accuracy of their judgments. (Author/JN)
Descriptors: College Science, Criterion Referenced Tests, Higher Education, Science Education

Gipps, Caroline – Child Psychology and Psychiatry and Allied Disciplines, 1982
Investigates differences in attitudes toward children and parents among nursery nurses and teachers in three combined nursery centers, three nursery schools, and six day nurseries. Differences between the professional groups were found. (RH)
Descriptors: Age Differences, Attitude Measures, Child Caregivers, Comparative Analysis

Wilcox, Rand R. – Journal of Experimental Education, 1982
A closed sequential procedure for estimating true score is proposed for use with answer-until-correct tests. The accuracy of determining true score is the same as in conventional sequential solutions, but the possibility of using an unnecessarily large number of items is eliminated. (Author/CM)
Descriptors: Answer Sheets, Guessing (Tests), Item Banks, Measurement Techniques

Kahn, Paul; Ribner, Sol – Psychology in the Schools, 1982
Developed a brief behavior rating scale consisting of 28 items divided into seven categories for use in a school setting. Test validity was based upon the successful discrimination between neurologically impaired, socially maladjusted, emotionally handicapped, and normal children. (Author)
Descriptors: Behavior Patterns, Behavior Rating Scales, Children, Classification

Mann, Irene T.; And Others – Applied Psychological Measurement, 1979
Several methodological problems (particularly the assumed bipolarity of scales, instructions regarding use of the midpoint, and concept-scale interaction) which may contribute to a lack of precision in the semantic differential technique were investigated. Results generally supported the use of the semantic differential. (Author/JKS)
Descriptors: Analysis of Variance, Computer Assisted Testing, Higher Education, Rating Scales

Boyd, Marcia A.; And Others – Journal of Dental Education, 1980
The Dental Aptitude Test (DAT) in Canada and the Dental Admission Test (DAT) in the United States are discussed. A history of the DAT, its composition, underlying concepts, and the problems of validation and guidelines for its uses are presented. A review of the literature is provided. (Author/MLW)
Descriptors: Admission (School), Admission Criteria, Aptitude Tests, College Entrance Examinations

Vandivier, Phillip L.; Vandivier, Stella Sue – Educational Forum, 1979
Arguments and prejudices against the use of individually administered intelligence tests are considered and compared with possible values that may be obtained. Cautions about test score interpretation are discussed. Implications of abolishing intelligence testing are considered and recommendations for effective testing policies are presented. (CTM)
Descriptors: Academic Achievement, Diagnostic Tests, Elementary Secondary Education, Intelligence

Harasym, P. H.; And Others – Evaluation and the Health Professions, 1980
Coded items, as opposed to free-response items, in a multiple-choice physiology test had a cueing effect that raised students' scores, especially for lower achievers. Reliability of coded items was also lower. Item format and scoring method had an effect on test results. (GDC)
Descriptors: Achievement Tests, Comparative Testing, Cues, Higher Education

Albanese, Mark A. – Journal of Medical Education, 1979
Results of a study involving pathology students suggest that there is significant cluing in multiple-true-false test questions that use secondary responses to represent combinations of the primary response (e.g., "Mark B if only 1 and 3 are correct"). Thus test scores are artificially inflated and test reliability is lowered. (JMD)
Descriptors: Allied Health Occupations Education, Cues, Higher Education, Medical Education

Tinari, Frank D. – Improving College and University Teaching, 1979
Computerized analysis of multiple choice test items is explained. Examples of item analysis applications in the introductory economics course are discussed with respect to three objectives: to evaluate learning; to improve test items; and to help improve classroom instruction. Problems, costs and benefits of the procedures are identified. (JMD)
Descriptors: College Instruction, Computer Programs, Discriminant Analysis, Economics Education

Linn, Marcia C.; Rice, Marian – Journal of Educational Measurement, 1979
The Springs task, an individually administered measure of ability to criticize and control experiments, is described. The task has characteristics similar to Inhelder and Piaget's Bending Rods task; it yields scores on naming variables, controlling variables, and analyzing experiments. (See also RIE: ED 163 092.) (JKS)
Descriptors: Cognitive Processes, Cognitive Tests, Critical Thinking, Developmental Stages