Atkinson, Nancy L. – American Journal of Health Behavior, 2007
Objectives: To design a valid and reliable questionnaire to assess perceived attributes of technology-based health education innovations. Methods: College students in 12 personal health courses reviewed a prototype eHealth intervention using a 30-item instrument based upon diffusion theory's perceived attributes of an innovation. Results:…
Descriptors: Health Education, Reliability, Hygiene, Measures (Individuals)
Alexander, Jennifer K.; Scherer, Robert F.; Lecoutre, Marc – Journal of Education for Business, 2007
The authors compared business journal ranking systems from 6 countries. Results revealed a low degree of agreement among the systems, and a low to moderate relationship between pairs of systems. In addition, the French and United Kingdom ranking systems were different from each other and from the systems in Australia, Germany, Hong Kong, and the…
Descriptors: Foreign Countries, Comparative Education, Journal Articles, Business
Erford, Bradley T.; Klein, Lauren – Educational and Psychological Measurement, 2007
The Slosson-Diagnostic Math Screener (S-DMS) was designed to help identify students in Grades 1 to 8 at risk for mathematics failure. Internal consistency, test-retest reliability, item analysis, decision efficiency, convergent validity, and factorial validity of all five levels of the S-DMS were studied using 20 independent samples of students…
Descriptors: Grade 1, Test Validity, Item Analysis, Test Reliability
Pfordresher, Peter Q.; Palmer, Caroline; Jungers, Melissa K. – Cognitive Science, 2007
The production of complex sequences like music or speech requires the rapid and temporally precise production of events (e.g., notes and chords), often at fast rates. Memory retrieval in these circumstances may rely on the simultaneous activation of both the current event and the surrounding context (Lashley, 1951). We describe an extension to a…
Descriptors: Memory, Music, Serial Ordering, Sequential Learning
Gardner-Kitt, Donna L.; Worrell, Frank C. – Journal of Adolescence, 2007
In this study, we examined the reliability and validity of Cross Racial Identity Scale (CRIS; Vandiver, B. J., Cross Jr., W. E., Fhagen-Smith, P. E., Worrell, F. C., Swim, J. K., & Caldwell, L. D. (2000). "The Cross Racial Identity Scale." Unpublished scale; Worrell, F. C., Vandiver, B. J., & Cross Jr., W. E., (2004). "The Cross Racial Identity…
Descriptors: Measures (Individuals), High School Students, Reliability, Racial Identification
Erford, Bradley T.; Balcom, Lindsey C.; Moore-Thomas, Cheryl – Measurement and Evaluation in Counseling and Development, 2007
This study provides preliminary analysis of reliability and validity of scores on the Screening Test for Emotional Problems, which was designed to identify students ages 5 to 18 years who are referred for wide-ranging emotional disturbances categorized under the Individuals With Disabilities Education Improvement Act (U.S. Department of Education,…
Descriptors: Emotional Problems, Disabilities, Test Validity, Screening Tests
Scherer, Marcia J.; McKee, Barbara G. – 1992
Validity and reliability data are presented for two instruments for assessing the predispositions that people have toward the use of assistive and educational technologies. The two instruments, the Assistive Technology Device Predisposition Assessment (ATDPA) and the Educational Technology Predisposition Assessment (ETPA), are self-report…
Descriptors: Assistive Devices (for Disabled), Attitude Measures, Check Lists, College Students
Kaplan, Bruce A.; Johnson, Eugene G. – 1992
Across the field of educational assessment the case has been made for alternatives to the multiple-choice item type. Most of the alternative types of items require a subjective evaluation by a rater. The reliability of this subjective rating is a key component of these types of alternative items. In this paper, measures of reliability are…
Descriptors: Educational Assessment, Elementary Secondary Education, Estimation (Mathematics), Evaluators
Aghbar, Ali-Asghar – 1986
The effectiveness of the "read-comp" technique in assessing writing ability and the usefulness of a rubric and procedure devised for scoring read-comp samples and essays were evaluated. Subjects were 100 freshman students enrolled in general and remedial English classes in a 6-week summer session at Indiana University of Pennsylvania.…
Descriptors: College Freshmen, Essay Tests, Evaluation Methods, Grading
Goldstein, Harvey; Wolf, Alison – 1986
Locally developed occupational tests were administered to 16- and 17-year-olds in a government-sponsored vocational education program in the United Kingdom over a six-month period in 1984. Job skills were tested in two occupational areas: use of a micrometer and invoice completion. Some performance tests were designed by researchers and some by…
Descriptors: Comparative Testing, Criterion Referenced Tests, Evaluation Criteria, Foreign Countries
Cronin, Linda; Capie, William – 1986
The influence of day-to-day variation in teacher performance on the reliability and validity of teacher assessment was examined. An attempt was made to identify and quantify sources of score variation attributable to differences in teacher performance, day of observation, observers, and test subscales; and to determine their effects on reliability…
Descriptors: Behavior Change, Behavior Rating Scales, Classroom Observation Techniques, Evaluation Methods
Schwager, Sidney – 1967
In this report the United Federation of Teachers (UFT) analyzes specific data from the Center for Urban Education's (CUE) negative evaluation of New York City's More Effective Schools (MES) program and charges that CUE's conclusions are invalid. The UFT maintains that since 18 of the 21 MES were former special service (SS) schools, CUE should have…
Descriptors: Achievement Gains, Arithmetic, Comparative Analysis, Control Groups
Gillmore, Gerald M. – 1979
It is argued in this paper that generalizability theory provides a uniquely useful framework for defining and quantifying the dependability of data for decision making. It does so by requiring careful specification of the conditions of measurement and the anticipated sources of variation in the results of the measurement procedure. A distinction…
Descriptors: Analysis of Variance, Criterion Referenced Tests, Decision Making, Educational Assessment
Peer reviewed
Bradley, Robert H.; Corwyn, Robert F.; Caldwell, Betty M.; Whiteside-Mansell, Leanne; Mink, Iris T. – Journal of Research on Adolescence, 2000
Describes the development of the Early Adolescent version of the Home Observation for Measurement of the Environment (EA-HOME) Inventory. Presents information on its usefulness with African Americans, Chinese Americans, European Americans, Mexican Americans, and Dominican Americans. Notes findings indicating high interobserver agreement, with…
Descriptors: Black Youth, Child Development, Chinese Americans, Cultural Differences
Peer reviewed
Hollenbeck, Keith; Tindal, Gerald; Almond, Patricia – Educational Assessment, 1999
Studied the amount of measurement error in a state's performance-based writing task as it relates to high-stakes decision reproducibility. Using 175 eighth-grade writing samples, the study finds moderate correlations between the two raters' scores, with significant differences for the rates for the handwritten, but not the typed, essays. (SLD)
Descriptors: Decision Making, Error of Measurement, Essay Tests, Grade 8

