Publication Date
| In 2026 | 3 |
| Since 2025 | 656 |
| Since 2022 (last 5 years) | 3157 |
| Since 2017 (last 10 years) | 7398 |
| Since 2007 (last 20 years) | 15036 |
Descriptor
| Test Reliability | 15028 |
| Test Validity | 10265 |
| Reliability | 9757 |
| Foreign Countries | 7137 |
| Test Construction | 4821 |
| Validity | 4191 |
| Measures (Individuals) | 3876 |
| Factor Analysis | 3822 |
| Psychometrics | 3520 |
| Interrater Reliability | 3124 |
| Correlation | 3039 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1326 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 251 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 216 |
| California | 214 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Peer reviewed: Cheong, George S. C. – Canadian Journal of Higher Education, 1979
Results of the study reported include indications that student evaluations of college instructors tend to be higher when not anonymous and that undergraduate students' evaluations of instructors are lower after course marks are received. Problems associated with the use of student evaluations of instructors are discussed. (JMD)
Descriptors: College Faculty, College Students, Educational Problems, Higher Education
Peer reviewed: Marsh, Herbert W.; Overall, J. U. – Research in Higher Education, 1979
Results of a follow-up study of the same students used in the Feldman study indicate that individual student evaluations are remarkably stable over time and more reliable than previously assumed. There was systematic information in individual student ratings that internal consistency approaches have ignored or assumed to be nonexistent.…
Descriptors: College Faculty, College Students, Comparative Analysis, Course Evaluation
Peer reviewed: Dolmans, Diana H. J. M.; And Others – Academic Medicine, 1996
Examined the extent to which tutor ratings remained stable in the long term by evaluating 291 ratings of 140 tutors at Maastricht University in the Netherlands between 1992 and 1995. The results indicated that, if the aggregated score and overall judgement are used to interpret the precision of individual scores, four and two occasions,…
Descriptors: Faculty Evaluation, Foreign Countries, Generalizability Theory, Higher Education
Peer reviewed: Patterson, Patricia; And Others – Research Quarterly for Exercise and Sport, 1996
This study examined the validity and reliability of the Back Saver Sit-and-Reach test for middle school students. Students completed the test during physical education class. Results indicated that the test was moderately related to hamstring flexibility, but its relationship to lower back flexibility was quite low for both sexes. (SM)
Descriptors: Intermediate Grades, Junior High School Students, Junior High Schools, Middle School Students
Peer reviewed: Orsmond, Paul; Merry, Stephen; Reiling, Kevin – Assessment & Evaluation in Higher Education, 1997
Reports on a study of a student self-assessment method in college biology, comparing students' self-evaluation, students' peer evaluation, and the teacher's evaluation criteria. Results illustrate potential problems in making assumptions about student ability to self-evaluate but also support previous findings about the instructional usefulness of…
Descriptors: Biology, College Faculty, College Instruction, College Students
Peer reviewed: Gavin, William J.; Giles, Lisa – Journal of Speech and Hearing Research, 1996
This study examined the temporal reliability of four quantitative measurements of linguistic behaviors in 20 preschool children observed in a naturalistic setting. Although inadequate reliability was found for the measure which used total number of words, very high reliability coefficients were obtained for the measures which used number of…
Descriptors: Clinical Diagnosis, Diagnostic Tests, Educational Diagnosis, Evaluation Methods
Llewellyn, Nick – Aspects of Educational and Training Technology Series, 1992
Capability is an expert system designed to support the process of compiling portfolios for competence-based assessment, particularly accreditation of prior learning. The system was designed to provide cost-effective management, consistency of rigor, support for tutors in accreditation workshops, and a structured approach to counseling. Outlines…
Descriptors: Accreditation (Institutions), Competence, Cost Effectiveness, Decision Support Systems
Peer reviewed: Gullone, Eleonora; And Others – Research in Developmental Disabilities, 1996
This study compared psychometric results on the Fear Survey Schedule for Children-II for 187 children and adolescents with mental retardation and 372 intellectually average students. The schedule demonstrated sound psychometric properties for both samples. Mentally retarded subjects scored significantly higher than the comparison sample, and their…
Descriptors: Adolescents, Age Differences, Children, Elementary Secondary Education
Peer reviewed: Gurp, S. van – B.C. Journal of Special Education, 1996
This study evaluated the internal reliability and face validity of a linguistically modified Self-Description Questionnaire and a sign language video presentation of the questionnaire items with 10 deaf students (ages 8 to 13). Results suggest that the modified measure and video presentation are appropriate for use with deaf students without…
Descriptors: Deafness, Elementary Secondary Education, Measures (Individuals), Questioning Techniques
Peer reviewed: Summers, Patricia A.; And Others – Language, Speech, and Hearing Services in Schools, 1996
Kindergarten children (n=101) were tested on the Bankson Language Test, Second Edition, and the Clinical Evaluation of Language Fundamentals-Revised Screening Test and were given the tests again 7 months later. Results showed that the children scored higher on both tests at the second administration, without intervention from a speech-language…
Descriptors: Diagnostic Tests, Evaluation Methods, Kindergarten, Kindergarten Children
Peer reviewed: Hambleton, Ronald K.; Slater, Sharon C. – Applied Measurement in Education, 1997
A brief history of developments in the assessment of the reliability of credentialing examinations is presented, and some new results are outlined that highlight the interactions among scoring, standard setting, and the reliability and validity of pass-fail decisions. Decision consistency is an important concept in evaluating credentialing…
Descriptors: Certification, Credentials, Decision Making, Interaction
Peer reviewed: Serrano, Elena; Anderson, Jennifer – Hispanic Journal of Behavioral Sciences, 2003
The Short Acculturation Scale for Hispanic Youth (SASH-Y) was used to assess acculturation among 137 fourth- and fifth-grade children in rural southern Colorado, including 11 Mexican, 33 Mexican American, and 93 Euro-American children. The SASH-Y, especially questions related to language use, was found to be robust with a young, rural Latino…
Descriptors: Acculturation, Elementary School Students, Ethnicity, Hispanic American Students
Peer reviewed: Beail, Nigel – Mental Retardation, 2003
This article discusses the advantages and disadvantages of the Vineland Adaptive Behavior Scales for measuring adaptive behavior in adults with mental retardation. It concludes that the advantages of the coverage of the main domains of adaptive behavior, their standardization, impressive psychometrics, and brevity are becoming outweighed by…
Descriptors: Adaptive Behavior (of Disabled), Adult Education, Adults, Behavior Rating Scales
Peer reviewed: Larson, Reed W.; Moneta, Giovanni; Richards, Maryse H.; Wilson, Suzanne – Child Development, 2002
This longitudinal study examined change in 220 adolescents' daily range of emotional states between early and late adolescence. Findings showed that emotional states became less positive across early adolescence; this downward change in average emotions ceased in grade 10. The greatest relative instability was during early adolescence; stability…
Descriptors: Adolescent Attitudes, Adolescent Development, Adolescents, Affective Behavior
Peer reviewed: Orsmond, Paul; And Others – Assessment & Evaluation in Higher Education, 1996
A study comparing peer and teacher evaluations of British university biology students' (n=39) performance found such comparison misleading as a guide to the validity of peer assessment. When individual criteria were analyzed, agreement between peers and teacher ranged from 31% to 62%, with specific areas of the criteria prone to over- and undervaluation.…
Descriptors: Bias, Biology, College Students, Comparative Analysis


