Laenen, Annouschka; Alonso, Ariel; Molenberghs, Geert – Psychometrika, 2007
A new measure for reliability of a rating scale is introduced, based on the classical definition of reliability as the ratio of the true score variance to the total variance. Clinical trial data can be employed to estimate the reliability of the scale in use whenever repeated measurements are taken. The reliability is estimated from the…
Descriptors: Schizophrenia, Rating Scales, Likert Scales, True Scores
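The classical definition mentioned in the abstract above can be written out as a short worked equation. This is the standard classical-test-theory ratio only, not the new measure the article proposes; the symbols X, T, E and the numbers in the example are assumptions chosen for illustration.

```latex
% Classical test theory: observed score = true score + error,
% with T and E assumed uncorrelated.
X = T + E, \qquad \sigma_X^2 = \sigma_T^2 + \sigma_E^2

% Reliability: ratio of true score variance to total variance.
\rho = \frac{\sigma_T^2}{\sigma_X^2} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}

% Illustrative (made-up) values: \sigma_T^2 = 8, \sigma_E^2 = 2
\rho = \frac{8}{8 + 2} = 0.8
```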
Klein, Britt; McCall, Louise; Austin, David; Piterman, Leon – British Journal of Educational Technology, 2007
Sixty-six English-speaking postgraduate distance-education medical students completed the Learning Styles Questionnaire (LSQ: 40-item version) while attending a residential workshop at the beginning of the semester, and 44 of these students completed the same questionnaire 5 months later, at the end of the semester.…
Descriptors: Questionnaires, Psychometrics, Medical Students, Factor Analysis
Colom, Roberto; Abad, Francisco J. – Intelligence, 2007
Mackintosh and Bennett's [Mackintosh, N. J. and Bennett, E. S. (2005). "What do Raven's Matrices measure? An analysis in terms of sex differences." Intelligence 33: 663-674.] study shows that males outperform females in some APM items but not in others, implying that these items measure discriminable mental processes. The present…
Descriptors: Test Bias, Gender Differences, Cognitive Processes, Measures (Individuals)
Campbell, Peter – Phi Delta Kappan, 2007
In this rejoinder to John Chubb's reply to "Edison Is the Symptom, NCLB Is the Disease," the author argues that Edison offers feel-good measures without really solving any of the problems of schools in poverty. Defending his original argument, the author cites a RAND study that questions the results Chubb claims. The study indicates the…
Descriptors: Reader Response, Academic Achievement, Educationally Disadvantaged, Data Interpretation
Tasse, Marc J.; And Others – 1994
The Quebec Adaptive Behavior Scale (QABS) is widely used in Quebec (Canada) to assess behavior of people with mental retardation in educational, vocational, residential or hospital settings. This study estimated the interrater agreement and test-retest reliability of the QABS. To determine test-retest reliability, the QABS was completed by 27…
Descriptors: Adaptive Behavior (of Disabled), Behavior Rating Scales, Elementary Secondary Education, Foreign Countries
Ahadi, Stephan A.; And Others – 1990
The reliability and validity of teacher ratings, the relationship between teacher ratings and principal self-reports of instructional leadership, and the degree to which they are influenced by demographic factors are examined in this study. Methodology involved completion of the Instructional Leadership Inventory, a self-report measure, by 81…
Descriptors: Educational Environment, Elementary Secondary Education, Institutional Characteristics, Instructional Leadership
Aydin, Selami – Turkish Online Journal of Educational Technology - TOJET, 2006
This research investigated the effect of computers on the test reliability and inter-rater reliability of writing test scores of ESL learners. Writing samples from 20 students in a pen-and-paper group and 20 students in a computer group were scored by two scorers using an analytic scoring method, and the scores were then analyzed using Cronbach's alpha. The results showed that the…
Descriptors: Foreign Countries, College Students, Computer Assisted Testing, English (Second Language)
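As a rough illustration of the Cronbach's alpha analysis mentioned in the abstract above, here is a minimal sketch that treats each scorer as an "item" and each essay as a case. The function name `cronbach_alpha` and the scores are hypothetical; they are not taken from the study, and the study's actual procedure may have differed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a table of scores.

    rows: one list per essay, holding one score per scorer (or item).
    Uses sample variances (n - 1 denominator) throughout.
    """
    if len(rows) < 2:
        raise ValueError("need at least two essays")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two scorers or items")
    if any(len(r) != k for r in rows):
        raise ValueError("every essay needs the same number of scores")

    # Variance of each scorer's column of scores.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Variance of the per-essay total score.
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: four essays, each rated by two scorers.
example_scores = [[3, 4], [2, 2], [5, 4], [1, 2]]
alpha = cronbach_alpha(example_scores)
print(round(alpha, 3))
```

Values near 1 indicate that the scorers rank the essays consistently; with only two scorers, a direct correlation or an agreement index is a common alternative.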
Bunch, Michael B.; Littlefair, Wendy – 1988
A total of 2,000 essays written by 1,000 students was submitted to generalizability analyses for domain-referenced tests. Each student had written one essay on each of two prompts representing two models of discourse. Each essay was read by six readers and judged on a scale from 1 to 4. No reader read essays from both prompts. Reader agreement…
Descriptors: Cutting Scores, Essay Tests, Generalizability Theory, Interrater Reliability
Primoff, Ernest S. – 1971
This report shows how Beta weights for the J-Coefficient may be easily developed without a formal validity study, and how indicators of ability other than tests can be used to measure the same abilities that are measured by tests. See also TM 001 163-64,166 for further information on job elements (J-Scale) procedures. (Author/DLG)
Descriptors: Achievement Rating, Correlation, Evaluation Criteria, Occupational Tests
Love, Judith A.; And Others – 1977
Perhaps more than ever before, college teaching is being studied and evaluated. This paper describes the development of a simple descriptive instrument used to focus observers' classifications and ratings of college teachers' instructional behaviors as recorded on video tape. The need for such an instrument is reviewed, the methodology for testing…
Descriptors: Classroom Observation Techniques, College Instruction, Correlation, Factor Analysis
Gilbert, Sharon L. – 1997
This study examined whether variations in the Developmental Observation Checklist (DC) format influence congruence of scores among both parents and the child's teacher. The DC was varied by adding pictorial illustrations and examples and having three response categories instead of two. Results from 100 sets of participants were evaluated with…
Descriptors: Check Lists, Developmental Delays, Early Intervention, Fathers
Lee, Steven W.; And Others – Behavioral Disorders, 1994 (peer reviewed)
The Child Behavior Checklist and related forms were completed for 171 boys referred for school-based assessment resulting from academic and/or behavioral problems. Adolescents consistently underreported behavioral problems relative to parents and teachers regardless of subsequent diagnosis. Implications of these discrepancies in school-based…
Descriptors: Adolescents, Behavior Problems, Disability Identification, Educational Diagnosis
McCrae, Robert R. – Multivariate Behavioral Research, 1993 (peer reviewed)
To assess cross-observer agreement on personality profiles, an Index of Profile Agreement and an associated coefficient are proposed that take into account both the difference between the ratings and the extremes of their mean. Data from the Revised NEO Personality Inventory for 250 peer ratings/self-reports and 68 spouse ratings/self-reports…
Descriptors: Adults, Comparative Analysis, Equations (Mathematics), Evaluation Methods
Oren, Thomas A.; Ruhl, Kathy L. – Early Childhood Education Journal, 2000 (peer reviewed)
Investigated the reliability and item appropriateness of the Caregiver-Environment Scale (CES), as judged by adults affiliated with an infant center. Found the CES to be an easy-to-use, reliable instrument for evaluation. (Author/SD)
Descriptors: Caregiver Child Relationship, Child Caregivers, Child Development, Day Care
Rockwell, Pam; Dunham, Mardis – Art Therapy: Journal of the American Art Therapy Association, 2006
This study explored the use of the Formal Elements Art Therapy Scale (FEATS) with a population of persons with a DSM-IV diagnosis of Substance Use Disorder who were court ordered for treatment. Two groups of adults (N = 40) were closely matched on age, gender, race, socioeconomic status and education level, and were administered the Person Picking…
Descriptors: Measures (Individuals), Interrater Reliability, Group Membership, Art Therapy

