Publication Date
| Date Range | Count |
| In 2026 | 3 |
| Since 2025 | 636 |
| Since 2022 (last 5 years) | 3137 |
| Since 2017 (last 10 years) | 7378 |
| Since 2007 (last 20 years) | 15016 |
Descriptor
| Descriptor | Count |
| Test Reliability | 15015 |
| Test Validity | 10252 |
| Reliability | 9751 |
| Foreign Countries | 7126 |
| Test Construction | 4811 |
| Validity | 4189 |
| Measures (Individuals) | 3875 |
| Factor Analysis | 3821 |
| Psychometrics | 3515 |
| Interrater Reliability | 3122 |
| Correlation | 3037 |
Audience
| Audience | Count |
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Location | Count |
| Turkey | 1320 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 251 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 216 |
| California | 214 |
What Works Clearinghouse Rating
| Rating | Count |
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Peer reviewed: Bergan, John R. – Journal of Educational Statistics, 1980
The use of a quasi-equiprobability model in the measurement of observer agreement involving dichotomous coding categories is described. A measure of agreement is presented which gives the probability of agreement under the assumption that observation pairs reflecting disagreement will be equally probable. (Author/JKS)
Descriptors: Judges, Mathematical Models, Observation, Probability
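For context on chance-corrected observer agreement with dichotomous codes, a minimal sketch follows. This is a generic observed-agreement and Cohen's kappa computation, not the quasi-equiprobability model described in the record above, and the coder data are invented:

```python
# Generic two-observer agreement for dichotomous (0/1) coding categories.
# Illustrates observed agreement and Cohen's kappa as standard baselines;
# this is NOT the quasi-equiprobability measure from the article above.

def observed_agreement(a, b):
    """Proportion of observation pairs on which the two coders agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = observed_agreement(a, b)
    # Expected chance agreement from each coder's marginal rates of coding "1"
    p1a, p1b = sum(a) / n, sum(b) / n
    p_e = p1a * p1b + (1 - p1a) * (1 - p1b)
    return (p_o - p_e) / (1 - p_e)

coder_a = [1, 1, 0, 1, 0, 0, 1, 0]  # invented example codes
coder_b = [1, 0, 0, 1, 0, 1, 1, 0]
print(observed_agreement(coder_a, coder_b))  # 0.75
print(cohens_kappa(coder_a, coder_b))        # 0.5
```

Kappa discounts the agreement expected by chance alone, which is why it is lower than raw agreement here.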
Peer reviewed: Gorsuch, Richard L. – Educational and Psychological Measurement, 1980
Kaiser and Michael reported a formula for factor scores giving an internal consistency reliability and its square root, the domain validity. Using this formula is inappropriate if variables are included which have trivial weights rather than salient weights for the factor for which the score is being computed. (Author/RL)
Descriptors: Factor Analysis, Factor Structure, Scoring Formulas, Test Reliability
Peer reviewed: Norris, Marylee; And Others – Journal of Speech and Hearing Disorders, 1980
The study reported differences in agreement among four experienced listeners who analyzed the articulation skills of 97 four- and five-year-old children. Place and manner of articulation revealed differences of agreement, whereas voicing and syllabic function contributed little to agreement or disagreement. (Author)
Descriptors: Articulation Impairments, Informal Assessment, Listening, Preschool Education
Peer reviewed: Fitzgerald, Gisela G. – Journal of Reading, 1981
Research indicates that three samples may not give a good indication of a workbook's narrative readability level. (MKM)
Descriptors: Elementary Secondary Education, Readability Formulas, Reading Research, Reliability
Bardo, John W.; Graney, Marshall J. – Southern Journal of Educational Research, 1979
Investigating the use of maximum versus averaged scores in physical and motor multiple-trial tests as indicators of performance, this article concludes that mean scores remain most appropriate as scientific estimates of true performance given multiple fallible empirical measures. (JC)
Descriptors: Performance, Psychomotor Skills, Reliability, Scores
Peer reviewed: Bradley, John M.; And Others – Journal of Reading Behavior, 1978
The present study was designed to determine if maze tests constructed over the same passages by different teachers were comparable. In addition, maze test parallel form reliability was investigated. (HOD)
Descriptors: Educational Research, Reading Comprehension, Reading Tests, Test Reliability
Peer reviewed: Henggeler, Scott W.; Tavormina, Joseph B. – Hispanic Journal of Behavioral Sciences, 1979
The one-year stabilities of several well-standardized intellectual, educational, and personality tests were evaluated for 15 children of Mexican American migrant workers. Most of the stability coefficients observed for these tests were statistically significant and similar to those reported for their normative samples. However, the stability…
Descriptors: Mexican Americans, Migrant Children, Psychological Testing, Test Reliability
Peer reviewed: Jackson, Paul H. – Psychometrika, 1979
Use of the same term "split-half" for division of an n-item test into two subtests containing equal (Cronbach), and possibly unequal (Guttman), numbers of items sometimes leads to a misunderstanding about the relation between Guttman's maximum split-half bound and Cronbach's coefficient alpha. This distinction is clarified. (Author/JKS)
Descriptors: Item Analysis, Mathematical Formulas, Technical Reports, Test Reliability
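The distinction in the record above can be made concrete with the standard textbook formulas: coefficient alpha, and the split-half coefficient maximized over all two-part splits (Guttman allows unequal part sizes). A minimal sketch, using assumed standard formulations rather than anything taken from the article:

```python
# Cronbach's alpha and Guttman's maximum split-half bound (lambda-4),
# sketched from the standard formulas; invented toy data, not from the article.
from itertools import combinations

def _var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(items):
    """items: list of k item-score vectors, each of length n persons.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    k, n = len(items), len(items[0])
    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(_var(it) for it in items) / _var(totals))

def guttman_max_split_half(items):
    """Maximum of the split-half coefficient 2*(1 - (var_A + var_B)/var_total)
    over all splits of the k items into two non-empty parts (parts may be
    unequal in size, per Guttman)."""
    k, n = len(items), len(items[0])
    totals = [sum(item[i] for item in items) for i in range(n)]
    vt = _var(totals)
    best = float("-inf")
    for r in range(1, k // 2 + 1):
        for half in combinations(range(k), r):
            a = [sum(items[j][i] for j in half) for i in range(n)]
            b = [totals[i] - a[i] for i in range(n)]
            best = max(best, 2 * (1 - (_var(a) + _var(b)) / vt))
    return best

toy = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 2, 2, 4]]  # 3 items, 4 persons
print(cronbach_alpha(toy))
print(guttman_max_split_half(toy))
```

With only two items the single possible split makes the two coefficients coincide; with more items they generally differ, which is the relation the article clarifies.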
Peer reviewed: Reynolds, Cecil R. – Psychology in the Schools, 1979
Two doctoral level school psychologists independently scored 50 McCarthy drawing booklets. The children producing the drawings ranged in age from 5 to 11. Interscorer reliability for Draw-A-Design was .93 and for Draw-A-Child was .96. No significant differences occurred in the mean score for either test across scorers. (Author)
Descriptors: Children, Elementary Education, Scoring, Test Reliability
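Interscorer reliability coefficients like the .93 and .96 above are commonly Pearson correlations between two scorers' totals (an assumption here; the record does not name the index). A minimal sketch with invented scores:

```python
# Pearson correlation between two scorers' score vectors, a common
# interscorer reliability index. Scorer data below are invented.
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

scorer_1 = [10, 12, 15, 9, 14]  # hypothetical booklet scores, scorer 1
scorer_2 = [11, 12, 14, 9, 15]  # hypothetical booklet scores, scorer 2
print(pearson_r(scorer_1, scorer_2))  # ≈ 0.94, high interscorer agreement
```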
Peer reviewed: Pitts, Steven C.; And Others – Evaluation and Program Planning, 1996
An introduction is provided to the use of confirmatory factor analysis to test measurement invariance and stability in longitudinal research. The approach is illustrated through examples representing one or two constructs in one to three measurement waves. Basic issues in establishing measurement invariance are discussed. (SLD)
Descriptors: Evaluation Research, Longitudinal Studies, Measurement Techniques, Models
Peer reviewed: Trimble, Douglas E. – Educational and Psychological Measurement, 1997
Studies of the reliability and validity of scores on the Religious Orientation Scale (G. Allport and J. Ross, 1967) were reviewed with respect to social desirability. Meta analysis shows that one scale correlates with social desirability, but another does not, suggesting that partialing out this variance is not recommended. (SLD)
Descriptors: Correlation, Meta Analysis, Reliability, Scores
Peer reviewed: Wolfe, Edward W.; Nogle, Sally – Journal of Applied Measurement, 2002
Developed and validated an instrument designed to measure the perceived measurability and importance of the National Athletic Trainers' Association Athletic Training Educational Competencies. Data from 931 athletic trainers and sport medicine physicians support 6 constructs, each of which demonstrates high reliability. (SLD)
Descriptors: Athletics, Competence, Criteria, Measurement Techniques
Peer reviewed: Reese, Robert J.; Kieffer, Kevin M.; Briggs, Barbara K. – Educational and Psychological Measurement, 2002
Conducted a reliability generalization study of five of the most prominent adult attachment style measures. Results from this investigation of 154 previously published studies indicate that the average score reliabilities varied considerably across instruments and subscales. (SLD)
Descriptors: Adults, Attachment Behavior, Generalization, Meta Analysis
Peer reviewed: Henson, Robin K.; Hwang, Dae-Yeop – Educational and Psychological Measurement, 2002
Conducted a reliability generalization study of Kolb's Learning Style Inventory (LSI; D. Kolb, 1976). Results for 34 studies indicate that internal consistency and test-retest reliabilities for LSI scores fluctuate considerably and contribute to deleterious cumulative measurement error. (SLD)
Descriptors: Error of Measurement, Generalization, Meta Analysis, Reliability
Peer reviewed: Dimitrov, Dimiter M. – Educational and Psychological Measurement, 2002
Discusses reliability issues in light of recent studies and debates focused on psychometrics versus datametrics terminology and reliability generalization. Discusses the way multiple perspectives on score reliability may affect research practice, editorial policies, and reliability generalization across studies. (SLD)
Descriptors: Generalization, Meta Analysis, Psychometrics, Reliability


