Peer reviewed: Strohmer, Douglas C.; And Others – Journal of Counseling Psychology, 1988
Studied how individuals test hypotheses about themselves. Examined extent to which Snyder's bias toward confirmation persists when negative or nonconsistent personal hypothesis is tested. Found negativity or positivity did not affect hypothesis testing directly, though hypothesis consistency did. Found cognitive schematic variable (vulnerability…
Descriptors: Attitudes, Bias, College Students, Depression (Psychology)
Peer reviewed: Norcini, John J.; And Others – Evaluation and the Health Professions, 1986
This study compares physician performance on the Computer-Aided Simulation of the Clinical Encounter with peer ratings and performance on multiple choice questions and patient management problems. Results indicate that all formats are equally valid, although multiple choice is the most reliable method of assessment per unit of testing time.…
Descriptors: Certification, Competence, Computer Assisted Testing, Computer Simulation
Peer reviewed: Mitchell, Karen J.; Molidor, John B. – Educational and Psychological Measurement, 1986
Research reported in this paper considered the construct validity of a trial essay administered in the 1985-87 Medical College Admission Test (MCAT). The addition of the essay caused the non-science factor observed in previous MCAT research to be more strongly defined. (Author/LMO)
Descriptors: College Entrance Examinations, Construct Validity, Correlation, Essay Tests
Peer reviewed: Greenan, James P. – Career Development for Exceptional Individuals, 1986
Field testing of the Generalizable Mathematics Skills Student Self-Ratings (SSR), Teacher Ratings (TR), and Performance Test (PT) assessment instruments for reliability and validity, with 138 handicapped secondary students in vocational programs and their vocational teachers (N=5), found sufficient content and face validity and relatively…
Descriptors: Disabilities, Mathematics Tests, Performance Tests, Secondary Education
Peer reviewed: Baldauf, Richard B., Jr.; And Others – Educational and Psychological Measurement, 1985
The reliability and factorial validity of the Self Concept as a Learner Scale was studied, using 12-year-old Anglo-Australians. Reliability was acceptable for total scale and three subscales (task orientation, problem solving, and class membership), but not motivation. The validity of the factorial subscales was not confirmed. (GDC)
Descriptors: Factor Structure, Foreign Countries, Junior High Schools, Learning
Peer reviewed: Pierson, Dorothy; And Others – Educational and Psychological Measurement, 1985
The construct validity and reliability of the Porter Needs Satisfaction Questionnaire (adapted) for educators were examined. Results did not support its use as suggested by Porter. Suggestions for its revision and alternate use are presented. (Author/GDC)
Descriptors: Attitude Measures, Elementary Secondary Education, Factor Structure, Job Satisfaction
Peer reviewed: Diserens, Deborah; And Others – Journal of Medical Education, 1986
A computer program developed at the University of Pennsylvania School of Medicine presents simulated patient cases and then scores participants' clinical problem-solving in the cases by comparing their performances with those of faculty members. The validity and reliability of this evaluation system were investigated. (Author/MLW)
Descriptors: Clinical Diagnosis, Evaluation Methods, Graduate Medical Students, Higher Education
Peer reviewed: Eaves, Ronald C.; Simpson, Robert G. – Psychology in the Schools, 1986
Contends that erroneous conclusions concerning intraindividual strengths may result when comparing scaled scores on subtests of The Test of Reading Comprehension. Examination of scaled scores may seem to indicate that a given student has performed better on one subtest than on another when the difference between the two scores is not statistically…
Descriptors: Academic Ability, Comparative Analysis, Elementary Education, Elementary School Students
Peer reviewed: Hanania, Edith; Shikhani, May – TESOL Quarterly, 1986
Describes a study of the interrelationships among three tests (a standardized English-as-a-second-language test, a cloze test, and a written composition test) which sought to determine whether adding the cloze test to the ESL test would improve the predictability of students' communicative proficiency, as reflected in their writing test…
Descriptors: Cloze Procedure, English (Second Language), Higher Education, Language Proficiency
Peer reviewed: Peterson, Donovan – Educational Research Quarterly, 1986
This article describes procedures to be followed in developing a system for observation of teachers in the classroom and use of the observation to evaluate teachers. A list of criteria is presented, including various types of validity, measurement characteristics, and practicality characteristics for observation systems. (Author/LMO)
Descriptors: Achievement Gains, Classroom Observation Techniques, Educational Assessment, Elementary Secondary Education
Peer reviewed: Harrington, Robert G.; Jennings, Valerie – Contemporary Educational Psychology, 1986
Three short forms of the McCarthy Scales of Children's Abilities (MSCA) have been developed to screen the cognitive skills of young children suspected of learning disorders and developmental delays. Correlations were obtained between scores on the full form of the MSCA and the Kaufman, Taylor, and McCarthy Screening Test short forms. (Author/LMO)
Descriptors: Cognitive Tests, Comparative Testing, Correlation, Early Childhood Education
Peer reviewed: Murphy, Kevin R.; And Others – Journal of Educational Psychology, 1984
Using 45 undergraduate evaluations of videotaped lectures, this study examined the effects of the purposes of rating on measures of accuracy in observing teacher behavior and in evaluating teacher performance. Results suggest that the purpose affects the way raters process behavioral information without necessarily affecting the general level of…
Descriptors: Behavior Rating Scales, Decision Making, Evaluation Utilization, Higher Education
Peer reviewed: Koballa, Thomas R., Jr. – Journal of Research in Science Teaching, 1984
Reports on the development of a 19-item, Likert-type instrument that measures the attitudes of preservice and inservice teachers toward energy conservation. Focuses on the nine-step process used in developing the instrument. (JN)
Descriptors: Attitude Measures, Elementary Education, Energy Conservation, Inservice Teacher Education
Peer reviewed: Edwards, Patricia K.; And Others – Evaluation Review, 1985
This study addresses four issues in the evaluation of nutrition education programs: (1) the reliability of knowledge, belief, and behavior scales; (2) the effectiveness of programs targeted to the general public; (3) the longitudinal effects of nutrition education interventions; and (4) the relationship between changes in the cognitive, belief,…
Descriptors: Behavior Change, Beliefs, Evaluation Methods, Federal Programs
Peer reviewed: Edwards, Dee; Williams, David – British Journal of Educational Technology, 1985
Discusses the process of continuous assessment that involves many tutors grading papers at Great Britain's Open University and the problem of grade unreliability within such a system. Describes a longitudinal experiment in which all tutors graded papers in one course and their grades were compared to determine grading reliability. (MBR)
Descriptors: College Faculty, Correspondence Study, Distance Education, Evaluation Methods


