Torr, J.; Iacono, T.; Graham, M. J.; Galea, J. – Journal of Intellectual Disability Research, 2008
Background: In Australia, diagnosis and management of depression in adults with intellectual disability (ID) often occurs within the primary care setting. Few tools are available to assist general practitioners (GPs) in the diagnostic process. The study aim was to assess properties of carer and GP checklists developed to address this problem.…
Descriptors: Check Lists, Mental Retardation, Test Validity, Identification
Vivo, Juana-Maria; Franco, Manuel – International Journal of Mathematical Education in Science and Technology, 2008
This article attempts to present a novel application of a method of measuring accuracy for academic success predictors that could be used as a standard. This procedure is known as the receiver operating characteristic (ROC) curve, which comes from statistical decision techniques. The statistical prediction techniques provide predictor models and…
Descriptors: Academic Achievement, Item Response Theory, Criterion Referenced Tests, Predictor Variables
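As a rough illustration of the ROC idea mentioned in this abstract, the area under the ROC curve (AUC) can be computed as the share of (successful, unsuccessful) student pairs that a predictor ranks correctly. This is a minimal sketch using the standard rank-based AUC, not the authors' procedure; the function name `roc_auc` and the sample scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predictor values (higher is assumed to mean "more likely to succeed").
    labels: 1 for academic success, 0 otherwise (hypothetical coding).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Each positive/negative pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative data only: four students, two of whom succeeded.
print(roc_auc([0.9, 0.4, 0.8, 0.3], [1, 1, 0, 0]))
```

An AUC of 0.5 corresponds to a predictor no better than chance, and 1.0 to one that ranks every successful student above every unsuccessful one.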
Tekin-Iftar, Elif – Education and Training in Developmental Disabilities, 2008
The present study was designed to determine whether parents (three mothers and one grandmother) could reliably implement community-based instruction (CBI) with simultaneous prompting (SP) to teach community skills to their children, and to assess the effects of this parent-delivered intervention on the children's acquisition of those skills. Maintenance and generalization effects of the intervention were also analyzed in the…
Descriptors: Community Based Instruction (Disabilities), Parents as Teachers, Children, Developmental Disabilities
Kulikowich, Jonna M.; Mason, Linda H.; Brown, Scott W. – Reading and Writing: An Interdisciplinary Journal, 2008
Drawing from multiple theoretical frameworks representing cognitive and educational psychology, we present a writing task and scoring system for measurement of students' informative writing. Participants in this study were 72 fifth- and sixth-grade students who wrote compositions describing real-world problems and how mathematics, science, and…
Descriptors: World Problems, Expository Writing, Educational Psychology, Validity
Alderson, J. Charles; And Others – 1995
The guide is intended for teachers who must construct language tests and for other professionals who may need to construct, evaluate, or use the results of language tests. Most examples are drawn from the field of English-as-a-Second-Language instruction in the United Kingdom, but the principles and practices described may be applied to the…
Descriptors: Educational Trends, English (Second Language), Interrater Reliability, Language Tests
Morgan, George A.; Bartholomew, Sheridan – 1998
This study examined the reliability and construct validity of two types of measures of mastery motivation for elementary school children: a new version of the Dimensions of Mastery Questionnaires (DMQ) and behavioral mastery tasks. Participating were 64 mostly middle class and Caucasian 7- and 10-year-olds living in a middle-sized western city.…
Descriptors: Childhood Attitudes, Construct Validity, Elementary Education, Elementary School Students
Rothman, M. L.; And Others – 1982
A practical application of generalizability theory, demonstrating how the variance components contribute to understanding and interpreting the data collected to evaluate a program, is described. The evaluation concerned 120 learning modules developed for the Dental Auxiliary Education Project. The goals of the project were to design, implement,…
Descriptors: Correlation, Data Collection, Dental Schools, Educational Research
Reed, Donald B.; And Others – 1988
An instrument was developed to assess principal leadership. Two studies were then conducted to assess the reliability, validity, and utility of the instrument. Leadership style is the relative intensity of the presence of four modes of authority (traditional, charismatic, legal, and expert authority) and four modes of power (moral, psychological,…
Descriptors: Administrator Evaluation, Administrators, Construct Validity, Educational Assessment
Cronin, Linda L.; Capie, William – 1985
The purpose of this study was to compare the scoring of Teacher Performance Assessment Instruments (TPAI) indicators using discrete descriptors when some are considered "essential" with the scoring of these same indicators when no descriptors are considered essential. The two questions addressed in this study were: (1) To what…
Descriptors: Analysis of Variance, Behavior Rating Scales, Classroom Observation Techniques, Data Collection
Fuchs, Douglas; And Others – 1985
The present investigation represents a systematic effort to determine whether handicapped children have been included in the development of test norms, items, and indices of reliability and validity. It analysed up-to-date user manuals and technical supplements of 27 well known and widely used aptitude and achievement tests. Study procedure…
Descriptors: Achievement Tests, Aptitude Tests, Disabilities, Elementary Secondary Education
Peer reviewed: Meisels, Samuel J.; And Others – Early Childhood Research Quarterly, 1995
Examined the reliability and validity of the Work Sampling System (WSS) for evaluating the schoolwork of 100 kindergarten children. Results indicated that the WSS checklist and summary report had very high internal and moderately high interrater reliability. The WSS accurately predicted the performance of the children on a norm-referenced…
Descriptors: Academic Achievement, Achievement Tests, Check Lists, Early Childhood Education
Peer reviewed: Dunbar, Stephen B.; And Others – Applied Measurement in Education, 1991
Issues pertaining to the quality of performance assessments, including reliability and validity, are discussed. The relatively limited generalizability of performance across tasks is indicative of the care needed to evaluate performance assessments. Quality control is an empirical matter when measurement is intended to inform public policy. (SLD)
Descriptors: Educational Assessment, Generalization, Interrater Reliability, Measurement Techniques
Helms, LuAnn Sherbeck – 1999
This paper discusses the fact that reliability is about scores and not tests and how reliability limits effect sizes. The paper also explores the classical reliability coefficients of stability, equivalence, and internal consistency. Stability is concerned with how stable test scores will be over time, while equivalence addresses the relationship…
Descriptors: Effect Size, Meta Analysis, Reliability, Scores
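The claim that reliability limits effect sizes can be illustrated with the classical test theory attenuation relation. The following is a standard textbook result offered as a sketch, not necessarily the derivation used in the paper; the symbols $r_{xx}$ and $r_{yy}$ (score reliabilities of the two measures) and $\rho_{T_x T_y}$ (correlation between true scores) are notation assumed here.

```latex
% Observed correlation as an attenuated true-score correlation
r_{xy} = \rho_{T_x T_y}\,\sqrt{r_{xx}\, r_{yy}}
\quad\Longrightarrow\quad
|r_{xy}| \le \sqrt{r_{xx}\, r_{yy}}

% Illustrative numbers only: with r_{xx} = 0.70 and r_{yy} = 0.80,
% the largest observable correlation is \sqrt{0.56} \approx 0.75.
```

Under this relation, even a perfect true-score relationship cannot produce an observed correlation above the square root of the product of the two score reliabilities.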
Pepin, Michel – 1983
This paper presents three different ways of computing the internal consistency coefficient alpha for the same set of data. The main objective of the paper is the illustration of a method for maximizing coefficient alpha. The maximization of alpha can be achieved with the aid of a principal component analysis. The relation between alpha max. and the…
Descriptors: Research Methodology, Research Problems, Statistical Analysis, Test Items
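For readers unfamiliar with coefficient alpha, the sketch below computes it with the usual variance-based formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). It is only one common way of computing alpha and is not claimed to be any of the three methods in the paper, nor its maximization procedure; the function name `cronbach_alpha` and the sample responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha from a respondents-by-items score table.

    scores: list of rows, one per respondent, each a list of k item scores.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose rows into per-item columns.
    items = list(zip(*scores))
    item_var_sum = sum(variance(col) for col in items)
    # Variance of each respondent's total score across items.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Illustrative data only: five respondents, three items.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
print(round(cronbach_alpha(responses), 3))
```

Sample variances are used for both the items and the totals; using population variances throughout would give the same ratio and therefore the same alpha.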
Peer reviewed: Schulman, Robert S.; Haden, Richard L. – Psychometrika, 1975
A model is proposed for the description of ordinal test scores based on the definition of true score as expected rank; its deviations are compared with results from classical test theory. An unbiased estimator of population true score from sample data is calculated. Score variance and population reliability are examined. (Author/BJG)
Descriptors: Career Development, Mathematical Models, Test Reliability, Test Theory

