Publication Date
| In 2026 | 0 |
| Since 2025 | 17 |
| Since 2022 (last 5 years) | 74 |
| Since 2017 (last 10 years) | 189 |
| Since 2007 (last 20 years) | 384 |
Audience
| Practitioners | 274 |
| Researchers | 122 |
| Teachers | 102 |
| Administrators | 63 |
| Counselors | 28 |
| Parents | 21 |
| Policymakers | 21 |
| Students | 15 |
| Community | 8 |
Location
| Canada | 45 |
| Australia | 33 |
| California | 33 |
| United Kingdom | 23 |
| United States | 20 |
| Pennsylvania | 18 |
| United Kingdom (England) | 17 |
| New York | 15 |
| Japan | 14 |
| Michigan | 14 |
| New Jersey | 12 |
Benton, Tom; Williamson, Joanna – Research Matters, 2022
Equating methods are designed to adjust between alternate versions of assessments targeting the same content at the same level, with the aim that scores from the different versions can be used interchangeably. The statistical processes used in equating have, however, been extended to statistically "link" assessments that differ, such as…
Descriptors: Statistical Analysis, Equated Scores, Definitions, Alternative Assessment
Gorney, Kylie – ProQuest LLC, 2023
Aberrant behavior refers to any type of unusual behavior that would not be expected under normal circumstances. In educational and psychological testing, such behaviors have the potential to severely bias the aberrant examinee's test score while also jeopardizing the test scores of countless others. It is therefore crucial that aberrant examinees…
Descriptors: Behavior Problems, Educational Testing, Psychological Testing, Test Bias
Hannah E. Luce – ProQuest LLC, 2023
Young children are assessed to meet federal mandates and inform policy decisions, provide teachers with useful information to make instructional decisions and set reasonable learning goals, and facilitate communication with families. While young children are frequently assessed using whole-child assessments which often yield criterion-referenced…
Descriptors: Scores, Norm Referenced Tests, Test Interpretation, Student Evaluation
Barnes, Amy C. – New Directions for Student Leadership, 2021
This article explores the ethical use of assessments in leadership training, education, and development. From the importance of having well-trained facilitators to the consideration of power and social identity in the interpretation of individual results, this article advocates for approaching the use of leadership assessments and inventories with…
Descriptors: Leadership, Measures (Individuals), Ethics, Test Use
Ing, Marsha; Chinen, Starlie; Jackson, Kara; Smith, Thomas M. – Educational Measurement: Issues and Practice, 2021
Despite the ease of accessing a wide range of measures, little attention is given to validity arguments when considering whether to use the measure for a new purpose or in a different context. Making a validity argument has historically focused on the intended interpretation and use. There has been a press to consider both the intended and actual…
Descriptors: Instructional Improvement, Measures (Individuals), Test Validity, Test Interpretation
Marta Godoy-Giménez; Ángel García-Pérez; Fernando Cañadas; Angeles F. Estévez; Pablo Sayans-Jiménez – Autism: The International Journal of Research and Practice, 2024
The broad autism phenotype is the phenotypic expression of the primary characteristics of autism. However, currently available tests do not agree with the two-domain operationalization of broad autism phenotype or autism, and their internal structure has shown instability across applications. This study presents the Broad Autism…
Descriptors: Autism Spectrum Disorders, Genetics, Diagnostic Tests, Foreign Countries
Kseniia Marcq; Johan Braeken – Educational Assessment, Evaluation and Accountability, 2024
Gender differences in item nonresponse are well-documented in high-stakes achievement tests, where female students are shown to omit more items than male students. These gender differences in item nonresponse are often linked to differential risk-taking strategies, with females being risk-averse and unwilling to guess on an item, even if it could…
Descriptors: Secondary School Students, International Assessment, Gender Differences, Response Rates (Questionnaires)
Tri Sedya Febrianti; Siti Fatimah; Yuni Fitriyah; Hanifah Nurhayati – International Journal of Education in Mathematics, Science and Technology, 2024
Assessing students' understanding of circle-related material through subjective tests is effective, though grading these tests can be challenging and often requires technological support. ChatGPT has shown promise in providing reliable and objective evaluations. Many teachers in Indonesia, however, continue to face difficulties integrating…
Descriptors: Artificial Intelligence, Computer Assisted Testing, Scoring, Tests
Lyrica Lucas; Anum Khushal; Robert Mayes; Brian A. Couch; Joseph Dauer – International Journal of Science Education, 2025
Educational reform priorities such as emphasis on quantitative modelling (QM) have positioned undergraduate biology instructors as designers of QM experiences to engage students in authentic science practices that support the development of data-driven and evidence-based reasoning. Yet, little is known about how biology instructors adapt to the…
Descriptors: Undergraduate Students, College Science, Biology, Classroom Observation Techniques
Baldwin, Peter; Clauser, Brian E. – Journal of Educational Measurement, 2022
While score comparability across test forms typically relies on common (or randomly equivalent) examinees or items, innovations in item formats, test delivery, and efforts to extend the range of score interpretation may require a special data collection before examinees or items can be used in this way--or may be incompatible with common examinee…
Descriptors: Scoring, Testing, Test Items, Test Format
Edward Karl Schultz; Emily Smith; Stephanie Zamora-Robles – Journal of the American Academy of Special Education Professionals, 2024
Evaluating students from culturally and linguistically diverse backgrounds (i.e., emergent bilinguals) presents challenges to evaluation teams, as distinguishing between a language disorder and typical second language development is more complex. The skills and knowledge required to do this task often exceed the level of training that evaluators…
Descriptors: Emergent Literacy, Bilingualism, Bilingual Students, Learning Disabilities
Goldhammer, Frank; Hahnel, Carolin; Kroehne, Ulf; Zehner, Fabian – Large-scale Assessments in Education, 2021
International large-scale assessments such as PISA or PIAAC have started to provide public or scientific use files for log data; that is, events, event-related attributes and timestamps of test-takers' interactions with the assessment system. Log data and the process indicators derived from it can be used for many purposes. However, the intended…
Descriptors: International Assessment, Data, Computer Assisted Testing, Validity
Colvin, Kimberly F.; Gorgun, Guher; Zhang, Sijun – Journal of Psychoeducational Assessment, 2020
The Rosenberg Self-Esteem Scale was administered with a 1-4, 1-5, or 0-100 scale to 819 participants, to compare score interpretations across the different versions. A rating scale utility analysis revealed that the categories in the 101-point scale were used inconsistently; based on the analysis, adjacent categories were collapsed resulting in a…
Descriptors: Self Concept Measures, Self Esteem, Test Interpretation, Scores
B. Goecke; S. Weiss; B. Barbot – Journal of Creative Behavior, 2025
The present paper questions the content validity of the eight creativity-related self-report scales available in PISA 2022's context questionnaire and provides a set of considerations for researchers interested in using these indexes. Specifically, we point out some threats to the content validity of these scales (e.g., "creative thinking…
Descriptors: Creativity, Creativity Tests, Questionnaires, Content Validity
Edward J. Golob; Ricardo C. Olayo; Denver M. Y. Brown; Jeffrey R. Mock – Journal of Speech, Language, and Hearing Research, 2024
Purpose: Listening effort is a broad construct, and there is no consensus on how to subdivide listening effort into dimensions. This project focuses on the subjective experience of effortful listening and tests if cognitive workload, mental fatigue, and mood are interrelated dimensions. Method: Two online studies tested young adults (n = 74 and n…
Descriptors: Adults, Psychomotor Skills, Psychomotor Objectives, Listening Comprehension Tests

