Ibe, Milagros D. – RELC Journal, 1975
This investigation seeks to determine: the validity and reliability of cloze tests for measuring reading comprehension; the relation of cloze scores to difficulty levels of reading passages; the relation of cloze test performance to length of English training; and the merits of judgmental vs. random word deletion in test construction. (DB)
Descriptors: Cloze Procedure, English (Second Language), Indochinese, Language Teachers
[Peer reviewed] Reker, Gary T. – Journal of Clinical Psychology, 1977
This research assessed the reliability and validity of the Purpose in Life test in an inmate population, investigated the relationship between the PIL test and attitudes, locus of control, personality factors, and several demographic variables, and compared the PIL scores of inmates with scores of normal samples. (Author/RK)
Descriptors: Demography, Individual Characteristics, Locus of Control, Measurement Instruments
[Peer reviewed] Schmidt, Frank L.; And Others – Personnel Psychology, 1977
The adverse impact of a content-valid job sample test of metal trades skills was compared to that of a well-constructed content-valid written achievement test for the same technical area. The adverse impact of the former was considerably less. Suggests that industrial psychologists should explore more fully the potential of performance testing.…
Descriptors: Majority Attitudes, Minority Groups, Performance Tests, Research Design
[Peer reviewed] Hirshoren, Alfred; And Others – Psychology in the Schools, 1977
The Performance Scale of the WISC-R was administered to 59 prelingually deaf children attending a state-supported day school program. The results compare favorably with those found by Wechsler with the standardization sample. (Author)
Descriptors: Deafness, Exceptional Child Research, Group Testing, Intelligence Tests
[Peer reviewed] Norcini, John J.; And Others – Journal of Medical Education, 1987
A study of the correlation between certification test results and ratings of clinical competence for graduate medical students in internal medicine during a six-year period found strong correlations on both individual and general indicators of competence. (MSE)
Descriptors: Certification, Competence, Graduate Medical Students, Higher Education
[Peer reviewed] Woodburn, Mary Stuart – Reading Teacher, 1986
Concludes that the test has a well-designed reading booklet and a carefully constructed manual, but that it has a narrow applicability. (FL)
Descriptors: Elementary Secondary Education, Oral Reading, Reading Achievement, Reading Diagnosis
[Peer reviewed] Aleamoni, Lawrence M. – New Directions for Teaching and Learning, 1987
Eight of the most common faculty concerns about student evaluations of instruction are discussed: inconsistent student judgments, the perception that only colleagues are qualified to evaluate peers' instruction, student-rating schemes as popularity contests, unreliable and invalid student-rating forms, etc. Research shows that faculty concerns are…
Descriptors: College Faculty, College Instruction, Educational Research, Faculty Evaluation
[Peer reviewed] Root, Lawrence S. – Research in Higher Education, 1987
The assessments of faculty performance for the determination of salary increases are analyzed to estimate interrater reliability. Using the independent ratings by six elected members of the faculty, correlations between the ratings were calculated and estimates of the reliability of the composite ratings were generated. (Author/MLW)
Descriptors: College Faculty, College Instruction, Committees, Faculty Evaluation
A Longitudinal Study of the Wechsler Intelligence Scale for Children-Revised over a Six-Year Period.
[Peer reviewed] Vance, Booney; And Others – Psychology in the Schools, 1987
Investigated the stability of Wechsler Intelligence Scale for Children-Revised (WISC-R) intelligence quotient scores of 32 exceptional students over a six-year interval. Used 20 learning disabled and 12 mentally disabled students aged 6 to 16. Test-retest findings indicated a median reliability value of .74. Discusses implications for clinicians and…
Descriptors: Adolescents, Children, Elementary Secondary Education, Emotional Disturbances
[Peer reviewed] Bonzi, Susan – Journal of Documentation, 1984
Tested the hypothesis that the vocabulary of a discipline emphasizing concrete phenomena will have fewer synonyms per concept than vocabulary of a discipline emphasizing abstract phenomena. Although concreteness and abstractness of a discipline were found to be contributing factors in terminological consistency, at least one other factor exerts…
Descriptors: Abstracts, Behavioral Sciences, Biological Sciences, Intellectual Disciplines
[Peer reviewed] Frary, Robert B. – Journal of Educational Measurement, 1985
Responses to a sample test were simulated for examinees under free-response and multiple-choice formats. Test score sets were correlated with randomly generated sets of unit-normal measures. The extent of superiority of free response tests was sufficiently small so that other considerations might justifiably dictate format choice. (Author/DWH)
Descriptors: Comparative Analysis, Computer Simulation, Essay Tests, Guessing (Tests)
[Peer reviewed] Sijtsma, Klaas; Molenaar, Ivo W. – Psychometrika, 1987
Three methods for estimating reliability are studied within the context of nonparametric item response theory. Two were proposed originally by Mokken and a third is developed in this paper. Using a Monte Carlo strategy, these three estimation methods are compared with four "classical" lower bounds to reliability. (Author/JAZ)
Descriptors: Estimation (Mathematics), Latent Trait Theory, Measurement Techniques, Monte Carlo Methods
[Peer reviewed] Simonson, Michael R.; And Others – Journal of Educational Computing Research, 1987
Describes the process used to develop two examinations, an achievement test of computer literacy and a computer anxiety index. Highlights include a definition of computer literacy, determination of the validity and reliability of the tests, and a study to evaluate the final versions of the tests. (Author/LRW)
Descriptors: Achievement Tests, Computer Assisted Instruction, Computer Literacy, Correlation
[Peer reviewed] Reynolds, William M.; Baker, Jean A. – American Journal of Mental Retardation, 1988
The Self-Report Depression Questionnaire (SRDQ), a measure of depressive symptomatology in persons with mental retardation, was administered to 89 mentally retarded adults living in community-based settings. The SRDQ demonstrated high internal consistency reliability, as well as moderate stability over an 11-week period. Content validity and…
Descriptors: Adults, Community Programs, Depression (Psychology), Evaluation Methods
[Peer reviewed] Hildebrand, Myrene; Hoover, H. D. – Educational and Psychological Measurement, 1987
The reliability and validity of the "Degrees of Reading Power" test and the "Iowa Tests of Basic Skills" reading comprehension and vocabulary tests were compared. Test scores, grades, and assigned reading levels of 191 fifth and sixth graders and 186 sixth and seventh graders in an eastern Iowa school district were used. Reliability and…
Descriptors: Comparative Analysis, Elementary School Students, Grade 5, Grade 6


