Publication Date
| In 2026 | 7 |
| Since 2025 | 690 |
| Since 2022 (last 5 years) | 3191 |
| Since 2017 (last 10 years) | 7432 |
| Since 2007 (last 20 years) | 15070 |
Descriptor
| Test Reliability | 15055 |
| Test Validity | 10290 |
| Reliability | 9763 |
| Foreign Countries | 7150 |
| Test Construction | 4828 |
| Validity | 4192 |
| Measures (Individuals) | 3880 |
| Factor Analysis | 3826 |
| Psychometrics | 3532 |
| Interrater Reliability | 3126 |
| Correlation | 3040 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1329 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 253 |
| Taiwan | 234 |
| Netherlands | 224 |
| Spain | 218 |
| California | 215 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Zimmerman, Donald W. – J Exp Educ, 1969
Research supported by the National Research Council of Canada, Grant APA-252-2057-B.
Descriptors: Analysis of Variance, Mathematical Models, Scoring, Test Reliability
Oltman, Philip K. – Percept Mot Skills, 1969
Descriptors: Arousal Patterns, Auditory Stimuli, Test Reliability, Test Validity
Goldsamt, Milton G. – Percept Mot Skills, 1969
Descriptors: Adults, Intelligence, Psychological Testing, Test Reliability
Robertson, Gary J. – 1981
Some fundamental concepts of criterion referenced test (CRT) reliability are highlighted. Emphasis is given to the procedures for determining reliability of scores for individual pupils because this is an area requiring increased awareness by classroom teachers and practitioners. Reliability issues encountered in the evaluation of instructional…
Descriptors: Criterion Referenced Tests, Reading Tests, Scores, Test Reliability
Robinson, Lora; Seligman, Richard – 1968
Items for a morale scale were selected from Pace's College and University Environment Scales (CUES). The initial morale scale of 55 items was reduced to 22 items without substantially changing the dimension being measured. The scale discriminates among the 100 colleges in Pace's national sample, and its reliability is acceptable. The items-scale…
Descriptors: College Students, Measurement Instruments, Test Reliability, Test Validity
Collet, LeVerne S. – 1970
A critical review of systems of scoring multiple choice tests is presented and the superiority of a system based upon elimination method over one based upon the best answer mode is hypothesized. This is discussed in terms of the capacity of the mode to reveal the relationships among decoy options and the effects of partial information,…
Descriptors: Multiple Choice Tests, Scoring, Test Reliability, Test Validity
Kissel, Mary Ann – 1970
The problem of this study was to determine whether Method A is a more efficient observational method for obtaining activity type behaviors in an individualized classroom than Method B. Method A requires the observer to record the activities of the entire class at given intervals while Method B requires only the activities of selected individuals…
Descriptors: Classroom Observation Techniques, Individualized Instruction, Individualized Programs, Reliability
Hayes, Robert B. – 1968
This paper reports results of efforts over a 7-year period (1960-67) to determine if the Hayes Pupil-Teacher Reaction Scale is a reliable, valid unidimensional instrument which may be used to measure the attitude of students toward the teaching effectiveness of their teachers. Criteria used were 1) each respondent's total score describes with at…
Descriptors: Measurement Instruments, Reliability, Student Attitudes, Teacher Evaluation
Whalen, Thomas E. – 1971
Smith (1969) reported the results of an instrument for measuring teacher judgment of written composition. His test was first administered to a group of "experts" whose ratings were in high agreement. Then the test was given to a sample of over 200 teachers and lay readers. Among Smith's conclusions was that over half of the teachers have judgment…
Descriptors: Essay Tests, Reliability, Scoring, Test Validity
Wright, Lindsay G. – 1971
This paper presents an argument against traditional evaluation of students by examination and offers proposals for reform of the present system. Strengths and weaknesses of evaluation methods such as objective tests, use of the year's work, essay examinations, practical examinations, and oral examinations are discussed as well as the need for…
Descriptors: Evaluation, Higher Education, Student Evaluation, Test Reliability
Behm, Robert J.; Schill, William J.
A technique for assessing the agreement between the Q-sorts of two or more groups of subjects is presented which relies on the relationship between the Kendall coefficient of concordance (W) and the Spearman rank order correlation (rho). The proposed statistical treatment of Q-sort data involves the use of a number of intercorrelations rather than…
Descriptors: Correlation, Matrices, Q Methodology, Statistical Analysis
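The relationship between W and rho mentioned in the abstract above can be illustrated with the standard identity for untied rankings: the mean of all pairwise Spearman correlations among m rankers equals (mW - 1)/(m - 1). The sketch below is not the authors' procedure (the abstract is truncated); the function names and the sample rankings are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def kendall_w(rankings):
    """Kendall's coefficient of concordance for m untied rankings of n items."""
    m = len(rankings)
    n = len(rankings[0])
    # Sum of ranks each item received across all rankers.
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def spearman_rho(a, b):
    """Spearman rank-order correlation for two untied rankings."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def mean_pairwise_rho(rankings):
    """Average Spearman rho over every pair of rankers."""
    pairs = list(combinations(rankings, 2))
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: three rankers (or group composites) ordering five items.
rankings = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [1, 3, 2, 4, 5],
]

m = len(rankings)
w = kendall_w(rankings)
rho_from_w = (m * w - 1) / (m - 1)   # identity linking W to the mean rho
rho_direct = mean_pairwise_rho(rankings)
print(round(w, 4), round(rho_from_w, 4), round(rho_direct, 4))
```

The two routes to the average rho agree only when the rankings contain no ties; tied ranks require corrected formulas for both W and rho.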
Peer reviewed: Burns, Edward – Educational and Psychological Measurement, 1976
A computer program, written in Fortran IV, is described which assesses reliability by using analysis of variance. It produces a complete analysis of variance table in addition to reliability coefficients for unadjusted and adjusted data as well as the intraclass correlation for m subjects and n items. (Author)
Descriptors: Analysis of Variance, Computer Programs, Correlation, Test Reliability
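As a rough illustration of reliability estimated through analysis of variance, the sketch below decomposes a subjects-by-items score matrix and derives Hoyt's coefficient and a single-item intraclass correlation. It is written in Python rather than Fortran IV and is not a reconstruction of the program described above; the function name, the returned keys, and the sample scores are assumptions, and the "adjusted data" coefficients mentioned in the abstract are not attempted.

```python
def anova_reliability(scores):
    """Two-way ANOVA (subjects x items, no replication) reliability estimates.

    scores: list of m rows (subjects), each with n item scores.
    """
    m = len(scores)
    n = len(scores[0])
    grand = sum(sum(row) for row in scores) / (m * n)
    row_means = [sum(row) / n for row in scores]
    col_means = [sum(row[j] for row in scores) / m for j in range(n)]

    # Sums of squares for subjects, items, and the residual.
    ss_subjects = n * sum((r - grand) ** 2 for r in row_means)
    ss_items = m * sum((c - grand) ** 2 for c in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_residual = ss_total - ss_subjects - ss_items

    ms_subjects = ss_subjects / (m - 1)
    ms_residual = ss_residual / ((m - 1) * (n - 1))

    # Hoyt's coefficient: reliability of the n-item total score.
    hoyt = 1 - ms_residual / ms_subjects
    # Intraclass correlation: reliability of a single item.
    icc_single = (ms_subjects - ms_residual) / (
        ms_subjects + (n - 1) * ms_residual
    )
    return {
        "ms_subjects": ms_subjects,
        "ms_residual": ms_residual,
        "hoyt": hoyt,
        "icc_single": icc_single,
    }


# Hypothetical data: five subjects answering four dichotomous items.
scores = [
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 1],
]
result = anova_reliability(scores)
print(result)
```

Stepping the single-item intraclass correlation up to n items with the Spearman-Brown formula reproduces Hoyt's coefficient, which is a convenient internal check on the arithmetic.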
Peer reviewed: Larrabee, Marva J.; Froehle, Thomas C. – Counselor Education and Supervision, 1979
Demonstrates that differences occur in role fidelity and in the performance consistency of a coached client over a series of simulated interviews. Illustrates that such differences can be quantitatively described, and that the results of the frequency tabulation procedure are affected by the training of raters in component observation. (Author)
Descriptors: Modeling (Psychology), Observation, Performance Factors, Reliability
Peer reviewed: Showalter, Stuart W. – Journalism Quarterly, 1978
Reports that the "Readers' Guide to Periodical Literature" provides quick access to popular magazine content, although the titles are not drawn randomly from a universe of publications; that the indexers take an inclusive approach to cataloging; and that the indexers demonstrate high reliability in locating and cataloging full-length…
Descriptors: Cataloging, Indexes, Indexing, Periodicals
Reliability and Mean Length of Utterance as a Function of Sample Size in Early Language Development.
Peer reviewed: Rondal, J. A.; DeFays, D. – Journal of Genetic Psychology, 1978
Recommends criteria for determining adequate sample size for the use of Mean Length of Utterance (MLU) as an indicator of early language development. (BD)
Descriptors: Infants, Language Acquisition, Reliability, Research Criteria


