Benbassat, Jochanan; Baumal, Reuben – Advances in Health Sciences Education, 2007
Decisions about admissions to medical school are based on assessments of the applicants' cognitive achievements and non-cognitive traits. Admission criteria are expected to be fair, transparent, evidence-based and legally defensible. However, unlike cognitive criteria, which are highly reliable and moderately valid, the reliability and validity of…
Descriptors: Medical Students, College Applicants, Medical Schools, Validity
Pyles, Loretta – Research on Social Work Practice, 2007
Objective: This article reports on the reliability and validity of a new instrument called the Resource Generating Strategies (RGS) Scale, which was created to measure participation in the informal economy. Method: Researchers interviewed 285 adult women who had received domestic violence services, were currently incarcerated, or were residing in…
Descriptors: Validity, Reliability, Measures (Individuals), Adults
Orme, John G.; Cuddeback, Gary S.; Buehler, Cheryl; Cox, Mary Ellen; Le Prohn, Nicole S. – Research on Social Work Practice, 2007
Objective: The Casey Foster Applicant Inventory-Applicant Version (CFAI-A) is a new standardized self-report measure designed to assess the potential to foster parent successfully. The CFAI-A is described, and results concerning its psychometric properties are presented. Method: Data from a sample of 304 foster mothers from 35 states are analyzed.…
Descriptors: Foster Care, Parents, Measurement, Questionnaires
Lefforge, Noelle L.; Donohue, Brad; Strada, Marilyn J. – Behavior Therapy, 2007
Patient nonattendance at scheduled sessions results in excessive costs to mental health and substance abuse providers and compromises the care of clients. This paper presents a comprehensive review of interventions that have been shown to increase session attendance rates in these settings. Unlike other review papers, reliability estimates were…
Descriptors: Substance Abuse, Attendance, Mental Health, Reliability
Hoyt, William T. – Psychological Methods, 2007
Rater biases are of interest to behavior genetic researchers, who often use ratings data as a basis for studying heritability. Inclusion of multiple raters for each sibling pair (M. Bartels, D. I. Boomsma, J. J. Hudziak, T. C. E. M. van Beijsterveldt, & E. J. C. G. van den Oord, 2007) is a promising strategy for controlling bias variance and may…
Descriptors: Research Design, Research Methodology, Genetics, Validity
Childs, Ruth A.; Dunn, Jennifer L.; van Barneveld, Christina; Jaciw, Andrew P. – International Journal of Testing, 2007
This study compares five scoring approaches for a test of clinical reasoning skills. All of the approaches incorporate information about the correct item responses selected and the errors, such as selecting too many responses or selecting a response that is inappropriate and/or harmful to the patient. The approaches are combinations of theoretical…
Descriptors: Scoring, Clinical Diagnosis, Thinking Skills, Reliability
Lee, Young-Sun; Douglas, Jeffrey; Chewning, Betty – Social Indicators Research, 2007
Clinical and health policy research frequently involves health status measurement using generic or disease specific instruments. These instruments are generally developed to arrive at several scales, each measuring a distinct domain of health quality of life (HQOL). Clinical settings are starting to explore how to integrate patient perspectives of…
Descriptors: Health Conditions, Quality of Life, Validity, Measures (Individuals)
Schwandt, Thomas A.; Lincoln, Yvonna S.; Guba, Egon G. – New Directions for Evaluation, 2007
Among the most knotty problems faced by investigators committed to interpretive practices in disciplines and fields such as sociocultural anthropology, jurisprudence, literary criticism, historiography, feminist studies, public administration, policy analysis, planning, educational research, and evaluation are deciding whether an interpretation is…
Descriptors: Data Interpretation, Validity, Credibility, Evaluative Thinking
Bell, Sherry Mee; McCallum, R. Steve; Kirk, Emily R.; Fuller, Emily J.; McCane-Bowling, Sara – Assessment for Effective Intervention, 2007
The purpose of this study was to examine the psychometric integrity of the "Test of Silent Contextual Reading Fluency" (TOSCRF) by D. D. Hammill, J. L. Wiederholt, and E. A. Allen (2006). The TOSCRF is a recently published assessment of reading fluency for ages 7 years 0 months to 18 years 11 months that can be administered individually or in a…
Descriptors: Reading Fluency, Reading Achievement, Validity, Reliability
Carney, Megan Strawsine – ProQuest LLC, 2012
This paper describes the confirmatory factor analysis, validity, and reliability data collection stage of the development of a scale to measure mainstream teachers' self-efficacy beliefs for teaching ELL (English Language Learner) students. Data were collected from 708 K through 12 teachers and pre-service teachers with varying degrees of…
Descriptors: English Language Learners, English Instruction, Rating Scales, Test Reliability
Davies, Dan; Collier, Chris; Howe, Alan – International Journal of Technology and Design Education, 2012
This article reports on the outcomes from the "e-scape Primary Scientific and Technological Understanding Assessment Project" (2009-2010), which aimed to support primary teachers in developing valid portfolio-based tasks to assess pupils' scientific and technological enquiry skills at age 11. This was part of the wider…
Descriptors: Foreign Countries, Evidence, Video Technology, Portfolios (Background Materials)
Boldt, R. F. – 1992
The Test of Spoken English (TSE) is an internationally administered instrument for assessing nonnative speakers' proficiency in speaking English. The research foundation of the TSE examination described in its manual refers to two sources of variation other than the achievement being measured: interrater reliability and internal consistency.…
Descriptors: Adults, Analysis of Variance, Interrater Reliability, Language Proficiency
Weare, Jane; And Others – 1987
This annotated bibliography was developed upon noting a deficiency of information in the literature regarding the training of raters for establishing agreement. The ERIC descriptor, "Interrater Reliability", was used to locate journal articles. Some of the 33 resulting articles focus on mathematical concepts and present formulas for computing…
Descriptors: Annotated Bibliographies, Cloze Procedure, Correlation, Essay Tests
Livingston, Samuel A. – 1976
A distinction is made between reliability of measurement and reliability of classification; the "criterion-referenced reliability coefficient" describes the former. Application of this coefficient to the probability distribution of possible scores for a single student yields a meaningful way to describe the reliability of a single score. (Author)
Descriptors: Classification, Criterion Referenced Tests, Error of Measurement, Measurement
Newtson, Darren; And Others – 1976
Two five-week test-retest reliability studies of a measure of the unit of perception of ongoing behavior were conducted. In the first, 25 females and 23 males segmented a 7-minute action sequence under fine-unit or gross-unit instructional sets. Number of units marked at first viewing correlated .87 with number of units at retest. Correlations…
Descriptors: Attribution Theory, Behavior Patterns, Behavior Rating Scales, Cognitive Processes

