Peer reviewed. Goodstein, H. A. – Journal of Special Education, 1982
A review of alternative methodologies and a conceptual framework for the study of reliability of criterion-referenced tests are presented. The possibility of aptitude-x-assessment interactions is considered and implications are discussed. (Author)
Descriptors: Criterion Referenced Tests, Disabilities, Elementary Secondary Education, Research Methodology
Peer reviewed. Persson-Blennow, Inger; McNeil, Thomas F. – Journal of Child Psychology and Psychiatry and Allied Disciplines, 1982
New data on retest reliabilities for three parental questionnaires, designed to measure children's temperaments at six months and at one and two years of age, were obtained with independent samples. The results showed a general level of retest reliability which was comparable to that of most temperament studies. (Author/RH)
Descriptors: Adolescents, Children, Foreign Countries, Males
Peer reviewed. Lubin, Bernard – Journal of Clinical Psychology, 1981
Obtained additional data on the reliability and validity of brief versions of the Depression Adjective Checklist by reanalyzing the data for Forms E, F, and G. Suggests that the brevity of the lists and their relatively high intercorrelation are assets when repeated measures are needed or many variables are measured. (Author/JAC)
Descriptors: Adults, Depression (Psychology), Personality Measures, Psychological Evaluation
Peer reviewed. Keenan, Donna – Reading Horizons, 1982
Suggests that the popular and simple readability formulas may not be accurate enough to predict the instructional materials best suited to the reading abilities of secondary school students. (FL)
Descriptors: Readability Formulas, Reading Research, Secondary Education, Test Reliability
Peer reviewed. Weber, Ronald L. – Journal of Learning Disabilities, 1982
Three measures often used with handicapped children (the Berry-Talbott Comprehension of Grammar, the Grammatic Closure subtest of the Illinois Test of Psycholinguistic Abilities, and the Grammatic Completion subtest of the Test of Language Development) are discussed in terms of test reliability, scoring procedures, format, and types of scores.…
Descriptors: Disabilities, Language Tests, Morphology (Languages), Nonstandard Dialects
Peer reviewed. Conger, Anthony J.; And Others – Applied Psychological Measurement, 1979
The WISC-R was investigated by using measures of profile (multivariate) reliability to determine its most reliable dimensions and the precision and similarity of the multivariate structure across age groups. The structure of the WISC-R subscales was stable across age groups. Two strategies for the interpretation of WISC-R profiles are offered.…
Descriptors: Age Differences, Elementary Secondary Education, Factor Structure, Intelligence
Peer reviewed. Naglieri, Jack A.; Maxwell, Susanna – Perceptual and Motor Skills, 1981
Inter-rater reliability of the Goodenough-Harris and McCarthy Draw-A-Child scoring systems was examined for a sample of 60 children, including 20 school-labeled learning disabled, 20 mentally retarded, and 20 normal children between the ages of six and eight-and-one-half years. (Author)
Descriptors: Correlation, Intelligence Tests, Learning Disabilities, Mental Retardation
Peer reviewed. Walters, Lynda Henley; Klein, Alice E. – Educational and Psychological Measurement, 1980
Underlying constructs of the Nowicki-Strickland Locus of Control Scale for Children (NSLOCSC) were identified through factor analysis and cross-validated with two similar samples of high school students. The two resultant dimensions appeared to measure Social Control (six items) and Self Control (two items). (Author/GK)
Descriptors: Factor Analysis, Factor Structure, High Schools, Locus of Control
Peer reviewed. Licata, Joseph W.; Norman, Reuben L. – Education, 1979
To test the general reliability and validity of the Triangulation Interview Form (TIF), 52 observers viewed a videotape simulation of an interview situation. Agreement among observers for each of 16 TIF questions ranged from 85 to 98 percent. Observers significantly discriminated between eight behaviors judged complete and eight behaviors judged…
Descriptors: Administrators, Behavioral Objectives, Competency Based Education, Field Experience Programs
Peer reviewed. Brown, James Dean – Modern Language Journal, 1980
Describes study comparing merits of exact answer, acceptable answer, clozentropy and multiple choice methods for scoring tests. Results show differences among reliability, mean item facility, discrimination and usability, but not validity. (BK)
Descriptors: Cloze Procedure, English (Second Language), Scoring, Second Language Learning
Peer reviewed. Hattie, J.; Watkins, D. – British Journal of Educational Psychology, 1981
The internal structure of the new Study Processes Questionnaire (Biggs, 1979) was investigated with samples of Australian and Filipino university students. The internal consistency reliabilities and the item and subscale factor analyses were quite favorable for the Australian sample; however, results indicated that the instrument may not be suitable for…
Descriptors: College Students, Cross Cultural Studies, Factor Structure, Student Motivation
Peer reviewed. Mardell-Czudnowski, Carol D. – Journal for Special Educators, 1980
The article describes the preschool screening test, "Developmental Indicators for the Assessment of Learning" (DIAL), and reviews research findings on the test from 1973 through 1978. It is concluded that the DIAL is maintaining relatively high levels of criterion-related validity (both concurrent and predictive) when compared to other…
Descriptors: Disabilities, Predictive Validity, Preschool Education, Preschool Tests
Peer reviewed. Sandoval, Jonathan – Journal of Abnormal Child Psychology, 1981
The object of the study was to investigate the effect of differences in format on the precision of teacher ratings and thus on the reliability and validity of two teacher rating scales of children's hyperactive behavior. Attributes assessed were motor restlessness, inattentiveness, impulsivity, and aggressiveness/emotional stability. (Author/DB)
Descriptors: Behavior Rating Scales, Elementary Secondary Education, Hyperactivity, Test Format
Peer reviewed. Reynolds, William M.; Gould, Jonathan W. – Journal of Consulting and Clinical Psychology, 1981
Investigated the reliability, validity, and factor structure of the standard 21-item and short 13-item forms of the Beck Depression Inventory. The sample consisted of 163 participants in a methadone maintenance program. Results support the use of the short form as a reliable and valid brief screening measure of depression. (Author)
Descriptors: Adults, Depression (Psychology), Factor Analysis, Measures (Individuals)
Peer reviewed. Murphy, R. J. L. – British Journal of Educational Psychology, 1979
Two senior GCE examiners re-marked photocopies of the same 200 GCE examination scripts, half still containing the marks and comments of the original examiners and half with these markings removed. Removing previous markings made a considerable difference to the extent of agreement between these sets of marks. (Editor/SJL)
Descriptors: Essay Tests, Examiners, Grading, Reliability


