Peer reviewed: Root, Lawrence S. – Research in Higher Education, 1987
The assessments of faculty performance for the determination of salary increases are analyzed to estimate interrater reliability. Using the independent ratings by six elected members of the faculty, correlations between the ratings were calculated and estimates of the reliability of the composite ratings were generated. (Author/MLW)
Descriptors: College Faculty, College Instruction, Committees, Faculty Evaluation
A Longitudinal Study of the Wechsler Intelligence Scale for Children-Revised over a Six-Year Period.
Peer reviewed: Vance, Booney; And Others – Psychology in the Schools, 1987
Investigated stability of the Wechsler Intelligence Scale for Children-Revised (WISC-R) intelligence quotient scores of 32 exceptional students over a six-year interval. Used 20 learning disabled and 12 mentally disabled students aged 6 to 16. Test-retest findings indicated a median reliability value of .74. Discusses implications for clinicians and…
Descriptors: Adolescents, Children, Elementary Secondary Education, Emotional Disturbances
Peer reviewed: Bonzi, Susan – Journal of Documentation, 1984
Tested the hypothesis that the vocabulary of a discipline emphasizing concrete phenomena will have fewer synonyms per concept than the vocabulary of a discipline emphasizing abstract phenomena. Although concreteness and abstractness of a discipline were found to be contributing factors in terminological consistency, at least one other factor exerts…
Descriptors: Abstracts, Behavioral Sciences, Biological Sciences, Intellectual Disciplines
Peer reviewed: Frary, Robert B. – Journal of Educational Measurement, 1985
Responses to a sample test were simulated for examinees under free-response and multiple-choice formats. Test score sets were correlated with randomly generated sets of unit-normal measures. The superiority of free-response tests was small enough that other considerations might justifiably dictate format choice. (Author/DWH)
Descriptors: Comparative Analysis, Computer Simulation, Essay Tests, Guessing (Tests)
Peer reviewed: Sijtsma, Klaas; Molenaar, Ivo W. – Psychometrika, 1987
Three methods for estimating reliability are studied within the context of nonparametric item response theory. Two were proposed originally by Mokken and a third is developed in this paper. Using a Monte Carlo strategy, these three estimation methods are compared with four "classical" lower bounds to reliability. (Author/JAZ)
Descriptors: Estimation (Mathematics), Latent Trait Theory, Measurement Techniques, Monte Carlo Methods
Peer reviewed: Simonson, Michael R.; And Others – Journal of Educational Computing Research, 1987
Describes the process used to develop two examinations, an achievement test of computer literacy and a computer anxiety index. Highlights include a definition of computer literacy, determination of the validity and reliability of the tests, and a study to evaluate the final versions of the tests. (Author/LRW)
Descriptors: Achievement Tests, Computer Assisted Instruction, Computer Literacy, Correlation
Peer reviewed: Reynolds, William M.; Baker, Jean A. – American Journal of Mental Retardation, 1988
The Self-Report Depression Questionnaire (SRDQ), a measure of depressive symptomatology in persons with mental retardation, was administered to 89 mentally retarded adults living in community-based settings. The SRDQ demonstrated high internal consistency reliability, as well as moderate stability over an 11-week period. Content validity and…
Descriptors: Adults, Community Programs, Depression (Psychology), Evaluation Methods
Peer reviewed: Hildebrand, Myrene; Hoover, H. D. – Educational and Psychological Measurement, 1987
The reliability and validity of the "Degrees of Reading Power" test and the "Iowa Tests of Basic Skills" reading comprehension and vocabulary tests were compared. Test scores, grades, and assigned reading levels of 191 fifth and sixth graders and 186 sixth and seventh graders in an eastern Iowa school district were used. Reliability and…
Descriptors: Comparative Analysis, Elementary School Students, Grade 5, Grade 6
Peer reviewed: Valencia, Sheila; Pearson, P. David – Reading Teacher, 1987
Argues that the tests used to measure reading achievement do not reflect recent advances in the understanding of the reading process, and that effective instruction can best be fostered by resolving the discrepancy between what is known and what is measured. (FL)
Descriptors: Elementary Education, Reading Achievement, Reading Comprehension, Reading Instruction
Peer reviewed: Bannister, Brendan D.; And Others – Educational and Psychological Measurement, 1987
To control for response bias in student ratings of college teachers, an index of rater error was used that was theoretically independent of actual performance. Partialing out the effects of this extraneous response bias enhanced validity, but partialing out overall effectiveness resulted in reduced convergent and discriminant validities.…
Descriptors: Error of Measurement, Higher Education, Interrater Reliability, Response Style (Tests)
Peer reviewed: Cooke, Robert A.; And Others – Educational and Psychological Measurement, 1987
Lafferty's Life Styles Inventory was completed by 556 managers (Level I, Self-Description) and by 2,922 peers, subordinates, and supervisors (Level II, Description by Others). Factor analysis revealed the same three factors in both ratings. Coworkers generally agreed with each other's ratings, but correlations between self and coworker ratings…
Descriptors: Administrator Evaluation, Adults, Behavior Rating Scales, Cognitive Style
Peer reviewed: Cruse, Daniel B. – Higher Education, 1987
A discussion of student ratings of faculty looks at problems in measurement and interpretation, including such factors as the ability of raters to evaluate complex behaviors, the salience of personal characteristics in judging task performance, and the emphasis on students as consumers. (Author/MSE)
Descriptors: College Faculty, Faculty Evaluation, Higher Education, Personality Traits
Peer reviewed: Bliss, Leonard B.; Mueller, Richard J. – Journal of Developmental Education, 1987
Describes the properties of the Study Behaviors Inventory, a multiple-response instrument designed to assess college students' study behaviors. Reviews the findings of a validation study, considering reliability and psychometric test properties. Discusses the difference between study skills and behaviors and the implications of the test for…
Descriptors: Behavior Rating Scales, College Students, Higher Education, Remedial Instruction
Peer reviewed: Wineburg, Samuel S. – Educational Researcher, 1987
Little evidence supports the notion of the self-fulfilling prophecy as applied to teachers' expectations of students. The "Pygmalion" study is often cited by courts and in the media, but it is highly criticized in scholarly circles, and its findings could not be replicated. (VM)
Descriptors: Behavioral Science Research, Educational Research, Expectation, Interpersonal Relationship
Markham, Paul Leon – IRAL, 1988
A study of three different applications of cloze procedure to test reading comprehension in college students of German as a second language suggests that the procedure may not yield a valid or reliable assessment of global comprehension in second-language learning. (Author/MSE)
Descriptors: Cloze Procedure, College Students, German, Higher Education


