Publication Date

| Range | Count |
|---|---|
| In 2026 | 3 |
| Since 2025 | 656 |
| Since 2022 (last 5 years) | 3157 |
| Since 2017 (last 10 years) | 7398 |
| Since 2007 (last 20 years) | 15036 |
Descriptor

| Descriptor | Count |
|---|---|
| Test Reliability | 15028 |
| Test Validity | 10265 |
| Reliability | 9757 |
| Foreign Countries | 7137 |
| Test Construction | 4821 |
| Validity | 4191 |
| Measures (Individuals) | 3876 |
| Factor Analysis | 3822 |
| Psychometrics | 3520 |
| Interrater Reliability | 3124 |
| Correlation | 3039 |
Audience

| Audience | Count |
|---|---|
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location

| Location | Count |
|---|---|
| Turkey | 1326 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 251 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 216 |
| California | 214 |
What Works Clearinghouse Rating

| Rating | Count |
|---|---|
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Peer reviewed: Cuenot, Randall G.; Darbes, Alex – Educational and Psychological Measurement, 1982
Thirty-one clinical psychologists scored Comprehension, Similarities, and Vocabulary subtest items common to the Wechsler Intelligence Scale for Children (WISC) and the Wechsler Intelligence Scale for Children, Revised (WISC-R). The results on interrater scoring agreement suggest that the scoring of these subtests may be less subjective than…
Descriptors: Clinical Psychology, Intelligence Tests, Psychologists, Scoring
Peer reviewed: Shapiro, Alexander – Psychometrika, 1982
The extent to which one can reduce the rank of a symmetric matrix by only changing its diagonal entries is discussed. Extension of this work to minimum trace factor analysis is presented. (Author/JKS)
Descriptors: Data Analysis, Factor Analysis, Mathematical Models, Matrices
Peer reviewed: Green, Samuel B. – Educational and Psychological Measurement, 1981
The proportion of agreement, G, and kappa indexes are shown to differ in how they correct for chance agreements between two observers. On the basis of the findings, it is suggested that no single agreement index is appropriate for all sets of data. (Author/BW)
Descriptors: Comparative Analysis, Measurement Techniques, Test Reliability, Testing Problems
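Green's distinction can be illustrated with a minimal sketch (hypothetical ratings, standard library only) computing the raw proportion of agreement, G, and Cohen's kappa, which corrects G for the agreement expected by chance from each observer's marginal category proportions:

```python
from collections import Counter

def agreement_indexes(r1, r2):
    """Return (G, kappa) for two observers' category labels.

    G is the raw proportion of agreement; kappa subtracts the
    chance agreement implied by each rater's marginal proportions.
    """
    n = len(r1)
    g = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    # chance agreement: sum over categories of the product of marginal proportions
    pe = sum((c1[cat] / n) * (c2[cat] / n) for cat in set(r1) | set(r2))
    kappa = (g - pe) / (1 - pe)
    return g, kappa

# hypothetical judgments from two observers (not data from the article)
a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes"]
g, k = agreement_indexes(a, b)  # g = 0.75, kappa = 0.5
```

The two indexes diverge exactly as Green notes: here both raters use each category half the time, so half of the raw agreement is attributable to chance and kappa is well below G.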
Decker, Robert L. – Personnel Administrator, 1981
The sole objective of the employment interview should be to obtain and evaluate factual and verifiable information. The greater the discrepancy between the tasks of the job and the experience of the interviewee, the more critical will be the influence of the intuitive judgment of the interviewer. (Author/MLF)
Descriptors: Employment Interviews, Employment Practices, Employment Qualifications, Reliability
Peer reviewed: Cardinet, Jean; And Others – Journal of Educational Measurement, 1981
Since fixed and random facets may exist in objects of study as well as in conditions of observation, various modifications of the generalizability theory estimation formulas are required for different types of measurement designs. Various design modifications are proposed to improve reliability by reducing error variance. (Author/BW)
Descriptors: Analysis of Variance, Reliability, Research Design, Statistical Analysis
Peer reviewed: Bergan, John R. – Journal of Educational Statistics, 1980
The use of a quasi-equiprobability model in the measurement of observer agreement involving dichotomous coding categories is described. A measure of agreement is presented which gives the probability of agreement under the assumption that observation pairs reflecting disagreement will be equally probable. (Author/JKS)
Descriptors: Judges, Mathematical Models, Observation, Probability
Peer reviewed: Gorsuch, Richard L. – Educational and Psychological Measurement, 1980
Kaiser and Michael reported a formula for factor scores giving an internal consistency reliability and its square root, the domain validity. Using this formula is inappropriate if variables are included which have trivial rather than salient weights for the factor for which the score is being computed. (Author/RL)
Descriptors: Factor Analysis, Factor Structure, Scoring Formulas, Test Reliability
Peer reviewed: Norris, Marylee; And Others – Journal of Speech and Hearing Disorders, 1980
The study reported differences in agreement among four experienced listeners who analyzed the articulation skills of 97 four- and five-year-old children. Place and manner of articulation revealed differences of agreement, whereas voicing and syllabic function contributed little to agreement or disagreement. (Author)
Descriptors: Articulation Impairments, Informal Assessment, Listening, Preschool Education
Peer reviewed: Fitzgerald, Gisela G. – Journal of Reading, 1981
Research indicates that three samples may not give a good indication of a workbook's narrative readability level. (MKM)
Descriptors: Elementary Secondary Education, Readability Formulas, Reading Research, Reliability
Bardo, John W.; Graney, Marshall J. – Southern Journal of Educational Research, 1979
Investigating the use of maximum versus averaged scores as performance indicators in physical and motor multiple-trial tests, this article concludes that the mean remains the most appropriate scientific estimate of true performance when multiple fallible empirical measures are available. (JC)
Descriptors: Performance, Psychomotor Skills, Reliability, Scores
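Bardo and Graney's conclusion can be illustrated with a small simulation (all parameters below are illustrative assumptions, not from the article): averaging fallible trials cancels measurement error, while taking the maximum both inflates the score and tracks the true performance level less closely:

```python
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

random.seed(42)

# 200 simulated examinees with known "true" performance levels
true_scores = [random.gauss(50, 10) for _ in range(200)]

# three fallible trials per examinee: true score plus random measurement error
observed = [[t + random.gauss(0, 5) for _ in range(3)] for t in true_scores]

means = [sum(trials) / len(trials) for trials in observed]
maxes = [max(trials) for trials in observed]

# the mean correlates more strongly with true performance than the maximum
r_mean = pearson_r(true_scores, means)
r_max = pearson_r(true_scores, maxes)
```

Averaging shrinks the error standard deviation by a factor of the square root of the number of trials, whereas the maximum of several noisy trials retains comparable error variance plus an upward bias.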
Peer reviewed: Bradley, John M.; And Others – Journal of Reading Behavior, 1978
The present study was designed to determine if maze tests constructed over the same passages by different teachers were comparable. In addition, maze test parallel form reliability was investigated. (HOD)
Descriptors: Educational Research, Reading Comprehension, Reading Tests, Test Reliability
Peer reviewed: Henggeler, Scott W.; Tavormina, Joseph B. – Hispanic Journal of Behavioral Sciences, 1979
The one-year stabilities of several well-standardized intellectual, educational, and personality tests were evaluated for 15 children of Mexican American migrant workers. Most of the stability coefficients observed for these tests were statistically significant and similar to those reported for their normative samples. However, the stability…
Descriptors: Mexican Americans, Migrant Children, Psychological Testing, Test Reliability
Peer reviewed: Jackson, Paul H. – Psychometrika, 1979
Use of the same term "split-half" for division of an n-item test into two subtests containing equal (Cronbach), and possibly unequal (Guttman), numbers of items sometimes leads to a misunderstanding about the relation between Guttman's maximum split-half bound and Cronbach's coefficient alpha. This distinction is clarified. (Author/JKS)
Descriptors: Item Analysis, Mathematical Formulas, Technical Reports, Test Reliability
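The quantities Jackson contrasts can be sketched for hypothetical item scores (illustrative data, standard library only): coefficient alpha, and Guttman's split-half coefficient (his lambda-4) for a given, possibly unequal, division of the items — his maximum split-half bound is the largest lambda-4 over all such divisions:

```python
def variance(xs):
    """Population variance of a sequence."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(items):
    """items: one list of scores per item, all over the same respondents."""
    k = len(items)
    totals = [sum(col) for col in zip(*items)]
    return k / (k - 1) * (1 - sum(variance(it) for it in items) / variance(totals))

def guttman_split_half(items, half_a, half_b):
    """Guttman's lambda-4 for a split of the items into two index sets.

    The halves may contain unequal numbers of items; together they
    must cover every item exactly once.
    """
    n = len(items[0])
    ha = [sum(items[i][j] for i in half_a) for j in range(n)]
    hb = [sum(items[i][j] for i in half_b) for j in range(n)]
    totals = [a + b for a, b in zip(ha, hb)]
    return 2 * (1 - (variance(ha) + variance(hb)) / variance(totals))

# hypothetical scores: 3 items x 5 respondents (not data from the article)
items = [[2, 4, 3, 5, 1],
         [3, 5, 2, 4, 2],
         [2, 5, 3, 4, 1]]
alpha = cronbach_alpha(items)
l4 = guttman_split_half(items, [0, 2], [1])  # an unequal 2-vs-1 split
```

Nothing here requires the two halves to have the same number of items, which is precisely the source of the terminological confusion the abstract describes.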
Peer reviewed: Reynolds, Cecil R. – Psychology in the Schools, 1979
Two doctoral-level school psychologists independently scored 50 McCarthy drawing booklets produced by children aged 5 to 11. Interscorer reliability was .93 for Draw-A-Design and .96 for Draw-A-Child. No significant differences occurred in mean scores for either test across scorers. (Author)
Descriptors: Children, Elementary Education, Scoring, Test Reliability
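Interscorer reliability of the kind Reynolds reports is conventionally a correlation between the two scorers' totals for the same set of protocols; a minimal sketch with hypothetical booklet scores (not the study's data):

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length score sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

# hypothetical totals assigned by two independent scorers to ten booklets
scorer1 = [12, 15, 9, 18, 11, 14, 16, 10, 13, 17]
scorer2 = [11, 15, 10, 17, 11, 13, 16, 9, 14, 17]
r = pearson_r(scorer1, scorer2)
```

Note that a high correlation alone does not rule out one scorer being systematically harsher; hence the abstract's separate check that mean scores did not differ across scorers.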
Peer reviewed: Pitts, Steven C.; And Others – Evaluation and Program Planning, 1996
An introduction is provided to the use of confirmatory factor analysis to test measurement invariance and stability in longitudinal research. The approach is illustrated through examples representing one or two constructs in one to three measurement waves. Basic issues in establishing measurement invariance are discussed. (SLD)
Descriptors: Evaluation Research, Longitudinal Studies, Measurement Techniques, Models


