Publication Date
| In 2026 | 6 |
| Since 2025 | 481 |
| Since 2022 (last 5 years) | 1960 |
| Since 2017 (last 10 years) | 4532 |
| Since 2007 (last 20 years) | 7017 |
Descriptor
| Test Reliability | 15055 |
| Test Validity | 10022 |
| Test Construction | 4374 |
| Foreign Countries | 3840 |
| Psychometrics | 2435 |
| Factor Analysis | 2302 |
| Measures (Individuals) | 1787 |
| Evaluation Methods | 1410 |
| Higher Education | 1391 |
| Questionnaires | 1264 |
| Factor Structure | 1249 |
Audience
| Researchers | 454 |
| Practitioners | 319 |
| Teachers | 128 |
| Administrators | 73 |
| Policymakers | 33 |
| Counselors | 31 |
| Students | 17 |
| Parents | 10 |
| Community | 6 |
| Support Staff | 5 |
Location
| Turkey | 840 |
| Australia | 239 |
| China | 211 |
| Canada | 207 |
| Indonesia | 163 |
| Spain | 131 |
| United States | 123 |
| United Kingdom | 121 |
| Germany | 112 |
| Taiwan | 108 |
| Netherlands | 103 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 2 |
| Meets WWC Standards with or without Reservations | 2 |
| Does not meet standards | 1 |
Jackson, Christine; Levine, Douglas W. – 1983
This study assessed the interchangeability of the Matthews Youth Test for Health (MYTH) and Hunter-Wolf A-B Rating Scale. Data from 25 elementary teachers and 300 of their students showed these scales to be weakly correlated, and the concordance of their A-B classifications to be only slightly above that expected by chance. Weak agreement was…
Descriptors: Behavior Rating Scales, Correlation, Elementary Education, High Schools
Christine, Charles T.; And Others – 1982
Thirty-two children aged 7 to 12 participated in a study to determine the reliability of the Ekwall Reading Inventory (ERI) and the Classroom Reading Inventory (CRI). The children were randomly assigned to take one of the two inventories, which were administered by four different specially trained teachers. The study used a test-retest design, in…
Descriptors: Comparative Analysis, Elementary Secondary Education, Informal Reading Inventories, Interrater Reliability
Deno, Stanley L.; And Others – 1983
Using instructional variables identified by the literature as important in predicting classroom achievement, a bi-polar rating scale was designed to assess the structure of instruction in resource rooms. The data for 158 elementary school children in four school districts were analyzed. The scale evidenced good reliability, both in terms of…
Descriptors: Academic Achievement, Classroom Environment, Elementary Education, Factor Structure
Szapocznik, Jose; And Others – 1987
Research showing psychodynamic child therapy to be less effective than other forms of child treatment has used outcome measures focusing on symptomatic and behavioral change rather than on psychodynamic processes. A child therapy assessment procedure that measures the psychological functioning of the child in a psychodynamically meaningful way is…
Descriptors: Child Development, Children, Counseling Effectiveness, Evaluation Methods
Peer reviewed. Marsh, Herbert W. – International Journal of Educational Research, 1987
The reliability, long-term stability, and generalizability of student ratings of teacher effectiveness are discussed. The Students' Evaluation of Educational Quality (SEEQ) instrument is examined from these perspectives. The multidimensionality of student response to such evaluation instruments must be recognized. (SLD)
Descriptors: College Students, Generalizability Theory, Interrater Reliability, Postsecondary Education
Peer reviewed. Hogan, Andrew – Evaluation and the Health Professions, 1986
This study derives the economic costs of misclassification in nursing home patient classification systems. These costs are then used as weights to estimate the reliability of a functional assessment instrument. Results suggest that reliability must be redefined and remeasured with each substantively new application of an assessment instrument.…
Descriptors: Classification, Correlation, Cost Effectiveness, Diagnostic Tests
Peer reviewed. Epstein, Michael H.; Nieminen, Gayla S. – School Psychology Review, 1983
Teachers and classroom aides of learning disabled pupils were asked to complete the Conners Abbreviated Teacher Rating Scale (CATRS) on two separate occasions, one month apart. Inter-rater reliability for teachers (.866) and for aides (.602), and reliability across time for teachers (.866) and aides (.603) achieved acceptable levels. (Author/BW)
Descriptors: Elementary Education, Elementary School Teachers, Hyperactivity, Interrater Reliability
Peer reviewed. Hansford, B. C.; Hattie, J. A. – Review of Educational Research, 1982
A meta-analysis of 128 studies of the relationship between self and achievement/performance measures reported correlations in the range of -.77 to .96 with an "average" correlation of .21. This average relationship was modified by several variables, including, among others, grade level of subjects, socioeconomic status, ethnicity, and…
Descriptors: Achievement Tests, Correlation, Educational Attainment, Elementary Secondary Education
Peer reviewed. Vrancic, Daniela; Nanclares, Valeria; Soares, Delfina; Kulesz, Analia; Mordzinski, Claudia; Plebst, Christian; Starkstein, Sergio – Journal of Autism and Developmental Disorders, 2002
A study involving 30 Argentineans with autism evaluated the validity of the Autism Diagnostic Inventory-Telephone Screening in Spanish (ADI-TSS). The final version of the ADI-TSS could be assessed in 20 to 40 minutes and demonstrated a high validity, high interrater reliability, and high internal consistency. (Contains references.) (Author/CR)
Descriptors: Adults, Autism, Disability Identification, Foreign Countries
Peer reviewed. Anderson, Stephen A. – Michigan Reading Journal, 2002
Considers the development of an inter-rater reliability correlation comparing the judgments, or scores, of each judge to see if their observations are similar. Presents a case study of the Northville Public Schools' data for the 2000 MEAP (Michigan Educational Assessment Program) Writing Test. Concludes that in this case study the state fails both…
Descriptors: Case Studies, Elementary Education, Evaluation Research, Interrater Reliability
Peer reviewed. Ryser, Gail R. – Journal of Secondary Gifted Education, 1994
The meanings of reliability and validity as they apply to standardized measures are used as a framework for applying the concepts of reliability and validity to authentic assessments. This article sees reliability as scorability and stability, whereas validity is seen as students' ability to use knowledge authentically in the field. (DB)
Descriptors: Elementary Secondary Education, Evaluation Methods, Performance Based Assessment, Reliability
Peer reviewed. Lewis, Kerry E. – American Journal of Speech-Language Pathology, 1995
An examination of the extent to which scores on the Stuttering Severity Instrument (SSI) for Children and Adults, Third Edition, accurately reflect 10 judges' observations of stuttering behaviors found that SSI scores obscured the wide range of judges' raw counts and did not accurately reflect the observational data from which they were derived.…
Descriptors: Adults, Children, Evaluation Methods, Interrater Reliability
Peer reviewed. Simpson, Robert G. – Behavioral Disorders, 1991
The behavior of each of 120 students in grades 9-12 was rated by 2 of the student's teachers using the Revised Behavior Problem Checklist. Results indicated a generally low to moderate degree of relationship among teacher ratings. It is recommended that clinicians collect behavioral ratings from many raters before reaching diagnostic conclusions.…
Descriptors: Behavior Problems, Check Lists, Clinical Diagnosis, Interrater Reliability
Mabry, Linda – Phi Delta Kappan, 1999
Education remains heavily shackled by punitive, test-driven reform. Despite reasonable alternatives, testing increasingly drives educational accountability and reform. Standardization of direct writing assessments promotes scoring reliability and facilitates educational comparisons and rankings. However, standardized writing is not good writing,…
Descriptors: Elementary Secondary Education, Interrater Reliability, Performance Based Assessment, Scoring Rubrics
Peer reviewed. Nordin, Viviann; Gillberg, Christopher; Nyden, Agneta – Journal of Autism and Developmental Disorders, 1998
This study assessed the interrater reliability of a Swedish version of the Childhood Autism Rating Scale (CARS), an instrument for screening and diagnosis of autism. The CARS was used for rating autistic behavior by two investigators in 25 children. Results indicated fair to excellent agreement. Aspects of validity and reliability are discussed.…
Descriptors: Autism, Behavior Rating Scales, Clinical Diagnosis, Disability Identification


