Publication Date
In 2025 | 9 |
Since 2024 | 31 |
Since 2021 (last 5 years) | 88 |
Since 2016 (last 10 years) | 197 |
Since 2006 (last 20 years) | 401 |
Descriptor
Test Interpretation | 3974 |
Test Validity | 958 |
Test Construction | 688 |
Elementary Secondary Education | 677 |
Scores | 650 |
Test Results | 623 |
Test Reliability | 622 |
Testing | 549 |
Achievement Tests | 510 |
Standardized Tests | 490 |
Testing Problems | 488 |
Audience
Practitioners | 274 |
Researchers | 122 |
Teachers | 102 |
Administrators | 63 |
Counselors | 28 |
Parents | 21 |
Policymakers | 21 |
Students | 15 |
Community | 8 |
Location
Canada | 44 |
California | 33 |
Australia | 32 |
United Kingdom | 23 |
United States | 19 |
Pennsylvania | 18 |
United Kingdom (England) | 16 |
New York | 15 |
Michigan | 14 |
Japan | 13 |
New Jersey | 12 |
Hana Svobodová; Petr Trahorsch – International Research in Geographical and Environmental Education, 2025
Geographical Olympiads are disciplinary competitions that can be a tool for assessing geographical knowledge and skills in different countries of the world. This article aims to analyse the results of the national and international geography Olympiads and to identify their conditionality and interrelationship. The secondary aim is to find out…
Descriptors: Foreign Countries, Geography, Geography Instruction, Evaluation Methods
Kuhn, Melissa Gayle – ProQuest LLC, 2022
Validity in psychometrics refers to the degree to which evidence and theory support the interpretations drawn from a test, and Messick's Contemporary Validity Theory (1994) includes several facets with well-established evidence collection methods. However, there is a lack of consensus on appropriate methods of evaluating the facet of…
Descriptors: Test Validity, Psychometrics, Test Interpretation, Scores
Frank Goldhammer; Ulf Kroehne; Carolin Hahnel; Johannes Naumann; Paul De Boeck – Journal of Educational Measurement, 2024
The efficiency of cognitive component skills is typically assessed with speeded performance tests. Interpreting only effective ability or effective speed as efficiency may be challenging because of the within-person dependency between both variables (speed-ability tradeoff, SAT). The present study measures efficiency as effective ability…
Descriptors: Timed Tests, Efficiency, Scores, Test Interpretation
Puttaswamy, Ash; Barone, Anjelica; Viezel, Kathleen D.; Willis, John O.; Dumont, Ron – Journal of Psychoeducational Assessment, 2020
An area of particular importance when examining index scores on the Wechsler Intelligence Scale for Children--Fifth Edition (WISC-V) is the utilization and interpretation of critical values and base rates associated with differences between an individual's subtest scaled score and the individual's mean scaled score for an index. For the WISC-V,…
Descriptors: Children, Intelligence Tests, Scores, Differences
Danielle R. Blazek; Jason T. Siegel – International Journal of Social Research Methodology, 2024
Social scientists have long agreed that satisficing behavior increases error and reduces the validity of survey data. There have been numerous reviews on detecting satisficing behavior, but preventing this behavior has received less attention. The current narrative review provides empirically supported guidance on preventing satisficing by…
Descriptors: Response Style (Tests), Responses, Reaction Time, Test Interpretation
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
Farmer, Ryan L.; Kim, Samuel Y. – Psychology in the Schools, 2020
Many prominent intelligence tests (e.g., Wechsler Intelligence Scale for Children, Fifth Edition [WISC-V] and Reynolds Intellectual Abilities Scale, Second Edition [RIAS-2]) offer methods for computing subtest- and composite-level difference scores. This study uses data provided in the technical manual of the WISC-V and RIAS-2 to calculate…
Descriptors: Children, Intelligence Tests, Scores, Test Reliability
Chao Han; Binghan Zheng; Mingqing Xie; Shirong Chen – Interpreter and Translator Trainer, 2024
Human raters' assessment of interpreting is a complex process. Previous researchers have mainly relied on verbal reports to examine this process. To advance our understanding, we conducted an empirical study, collecting raters' eye-movement and retrospection data in a computerised interpreting assessment in which three groups of raters (n = 35)…
Descriptors: Foreign Countries, College Students, College Graduates, Interrater Reliability
Benton, Tom; Williamson, Joanna – Research Matters, 2022
Equating methods are designed to adjust between alternate versions of assessments targeting the same content at the same level, with the aim that scores from the different versions can be used interchangeably. The statistical processes used in equating have, however, been extended to statistically "link" assessments that differ, such as…
Descriptors: Statistical Analysis, Equated Scores, Definitions, Alternative Assessment
Barnes, Amy C. – New Directions for Student Leadership, 2021
This article explores the ethical use of assessments in leadership training, education, and development. From the importance of having well-trained facilitators to the consideration of power and social identity in the interpretation of individual results, this article advocates for approaching the use of leadership assessments and inventories with…
Descriptors: Leadership, Measures (Individuals), Ethics, Test Use
Ing, Marsha; Chinen, Starlie; Jackson, Kara; Smith, Thomas M. – Educational Measurement: Issues and Practice, 2021
Despite the ease of accessing a wide range of measures, little attention is given to validity arguments when considering whether to use the measure for a new purpose or in a different context. Making a validity argument has historically focused on the intended interpretation and use. There has been a press to consider both the intended and actual…
Descriptors: Instructional Improvement, Measures (Individuals), Test Validity, Test Interpretation
Gorney, Kylie – ProQuest LLC, 2023
Aberrant behavior refers to any type of unusual behavior that would not be expected under normal circumstances. In educational and psychological testing, such behaviors have the potential to severely bias the aberrant examinee's test score while also jeopardizing the test scores of countless others. It is therefore crucial that aberrant examinees…
Descriptors: Behavior Problems, Educational Testing, Psychological Testing, Test Bias
Hannah E. Luce – ProQuest LLC, 2023
Young children are assessed to meet federal mandates and inform policy decisions, provide teachers with useful information to make instructional decisions and set reasonable learning goals, and facilitate communication with families. While young children are frequently assessed using whole-child assessments which often yield criterion-referenced…
Descriptors: Scores, Norm Referenced Tests, Test Interpretation, Student Evaluation
Marta Godoy-Giménez; Ángel García-Pérez; Fernando Cañadas; Angeles F. Estévez; Pablo Sayans-Jiménez – Autism: The International Journal of Research and Practice, 2024
The broad autism phenotype is the phenotypic expression of the primary characteristics of autism. However, currently available tests do not agree with the two-domain operationalization of broad autism phenotype or autism, and their internal structure has shown instability across applications. This study presents the Broad Autism…
Descriptors: Autism Spectrum Disorders, Genetics, Diagnostic Tests, Foreign Countries
Kseniia Marcq; Johan Braeken – Educational Assessment, Evaluation and Accountability, 2024
Gender differences in item nonresponse are well-documented in high-stakes achievement tests, where female students are shown to omit more items than male students. These gender differences in item nonresponse are often linked to differential risk-taking strategies, with females being risk-averse and unwilling to guess on an item, even if it could…
Descriptors: Secondary School Students, International Assessment, Gender Differences, Response Rates (Questionnaires)