National Center on Improving Literacy, 2022
There are many available screeners for reading and other education or social-emotional outcomes. This brief outlines important things to consider when choosing and using a screener.
Descriptors: Screening Tests, Literacy, Social Emotional Learning, Decision Making
Ramsey Lee Cardwell – ProQuest LLC, 2022
The emergence of digital-first assessments is prompting reconsideration of, and innovation in, aspects of psychometrics, test validation, and test use. Using the Duolingo English Test (DET) as an example, this three-paper series seeks to address issues concerning the estimation of classification consistency and the reporting of results for such…
Descriptors: Classification, Reliability, Language Proficiency, Computer Assisted Testing
Kane, Michael T. – Journal of Educational Measurement, 2013
To validate an interpretation or use of test scores is to evaluate the plausibility of the claims based on the scores. An argument-based approach to validation suggests that the claims based on the test scores be outlined as an argument that specifies the inferences and supporting assumptions needed to get from test responses to score-based…
Descriptors: Test Interpretation, Validity, Scores, Test Use
Bennett, Jessica G.; Gardner, Ralph, III; Rizzi, Gleides Lopes – American Annals of the Deaf, 2013
Strong correlations exist between signed and/or spoken English and the literacy skills of deaf and hard of hearing students. Assessments that are both valid and reliable are key for researchers and practitioners investigating the signed and/or spoken English skills of signing populations. The authors conducted a literature review to explore which…
Descriptors: Deafness, Hearing Impairments, Sign Language, Language Skills
Kolen, Michael J.; Lee, Won-Chan – Educational Measurement: Issues and Practice, 2011
This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…
Descriptors: Test Use, Test Format, Error of Measurement, Raw Scores
Douglas, Karen M.; Mislevy, Robert J. – Journal of Educational and Behavioral Statistics, 2010
Important decisions about students are made by combining multiple measures using complex decision rules. Although methods for characterizing the accuracy of decisions based on a single measure have been suggested by numerous researchers, such methods are not useful for estimating the accuracy of decisions based on multiple measures. This study…
Descriptors: Educational Development, Test Use, Classification, Computation
Daniels, Stephanie K.; Schroeder, Mae Fern; DeGeorge, Pamela C.; Corey, David M.; Foundas, Anne L.; Rosenbek, John C. – American Journal of Speech-Language Pathology, 2009
Purpose: To continue the development of a quantified, standard method to differentiate individuals with stroke and dysphagia from individuals without dysphagia. Method: Videofluoroscopic swallowing studies (VFSS) were completed on a group of participants with acute stroke (n = 42) and healthy age-matched individuals (n = 25). Calibrated liquid…
Descriptors: Control Groups, Test Use, Neurological Impairments, Evaluation Methods
Herman, Joan L.; Osmundson, Ellen; Dietel, Ronald – Assessment and Accountability Comprehensive Center, 2010
This report describes the purposes of benchmark assessments and provides recommendations for selecting and using benchmark assessments--addressing validity, alignment, reliability, fairness and bias and accessibility, instructional sensitivity, utility, and reporting issues. We also present recommendations on building capacity to support schools'…
Descriptors: Multiple Choice Tests, Test Items, Benchmarking, Educational Assessment
Musante, Linda; Treiber, Frank A.; Davis, Harry C.; Thompson, William O.; Waller, Jennifer L. – Assessment, 1999 (Peer reviewed)
Findings related to internal consistency, temporal stability, and principal components structures suggest that the Anger Expression Scale (C. Spielberger and others, 1985) and the Pediatric Anger Expression Scale (G. Jacobs and others, 1989), studied with a sample of 415 youth with a mean age of 14.7 years, are acceptably reliable. (SLD)
Descriptors: Adolescents, Anger, Factor Structure, Reliability
Krus, David J.; Helmstadter, Gerald C. – Educational and Psychological Measurement, 1993 (Peer reviewed)
Negative coefficients of reliability, sometimes returned by the standard formula for estimation of the internal-consistency reliability, are neither theoretically nor numerically correct. Alternative strategies for test development in this special case are suggested. (Author)
Descriptors: Estimation (Mathematics), Reliability, Test Construction, Test Use
Feldt, Leonard S. – Applied Measurement in Education, 1997 (Peer reviewed)
It has often been asserted that the reliability of a measure places an upper limit on its validity. This article demonstrates in theory that validity can rise when reliability declines, even when validity evidence is a correlation with an acceptable criterion. Whether empirical examples can actually be found is an open question. (SLD)
Descriptors: Correlation, Criteria, Reliability, Test Construction
Strahan, Robert F. – Journal of Vocational Behavior, 1987 (Peer reviewed)
Describes two new measures of consistency which refer to the extent to which more closely related scale types are found together in Holland's Self-Directed Search sort. One measure is based on the hexagonal model for use with three-point codes. The other is based on conditional probabilities for use with two-point codes. (Author/ABL)
Descriptors: Data Analysis, Data Interpretation, Personality Measures, Reliability
Hwang, Dae-Yeop; Henson, Robin K. – 2002
The Learning Style Inventory (LSI; Kolb, 1976; 1985) is a commonly used measure of learning styles based on Kolb's Experiential Learning Model. The psychometric soundness of LSI scores has been critiqued historically. This study reviewed the literature on the LSI and evaluated the psychometric properties of Kolb's original and revised versions of…
Descriptors: Cognitive Style, Meta Analysis, Psychometrics, Reliability
Caruso, John C.; Witkiewitz, Katie – Journal of Educational Measurement, 2002 (Peer reviewed)
As an alternative to equally weighted difference scores, examined an orthogonal reliable component analysis (RCA) solution and an oblique principal components analysis (PCA) solution for the standardization sample of the Kaufman Assessment Battery for Children (KABC; A. Kaufman and N. Kaufman, 1983). Discusses the practical implications of the…
Descriptors: Ability, Academic Achievement, Children, Factor Analysis
Fisher, Anne G.; Bryze, Kimberly; Atchison, Bradley T. – Journal of Outcome Measurement, 2000 (Peer reviewed)
Studied rater reliability, internal scale validity, and person response validity of the School Assessment of Motor and Process Skills (School AMPS) using results for 208 elementary school students, some with educationally related disabilities. Results support rater reliability, scale validity, and person response validity of the School AMPS as a…
Descriptors: Disabilities, Elementary Education, Elementary School Students, Reliability

