Daniel González-Devesa; José Carlos Diz-Gómez; Miguel Adriano Sanchez-Lastra; Aroa Otero Rodríguez; Carlos Ayán-Pérez – Measurement in Physical Education and Exercise Science, 2025
The aim of this study is to examine the available scientific evidence on the reliability and criterion validity of the 6-minute run walk field-based test when administered to children and adolescents. Systematic searches were performed in three electronic databases (MEDLINE/PubMed, SPORTDiscus and Scopus) from their inception until February 2024,…
Descriptors: Child Health, Health Related Fitness, Literature Reviews, Meta Analysis
Joshua B. Gilbert; James G. Soland; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2025
Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) may affect VA estimates is less studied. We examine the…
Descriptors: Value Added Models, Tests, Testing, Scoring
Sümeyye Arkan; Sema Tan – International Journal of Assessment Tools in Education, 2025
Teachers' perceptions, attitudes, and opinions about students, curricula, or evaluation methods contribute to the development of students' talents. Thus, researchers often collect data from teachers to identify gifted students, determine educational practices to meet the students' needs and assess gifted education programs. Researchers often…
Descriptors: Talent Identification, Academically Gifted, Evaluation Methods, Measurement Techniques
Yücel Makaraci; Kazim Nas; Kerem Gündüz; Abdullah Uysal; Samuel T. Orange; Juan D. Ruiz-Cárdenas – Measurement in Physical Education and Exercise Science, 2024
The aim was to determine the validity and test-retest reliability of the Sit to Stand App variables (rising time, vertical velocity, and power) for measuring the single-leg sit-to-stand (STS) test compared to those derived from ground reaction force data. Twenty-seven female athletes performed the single-leg STS test over three consecutive sessions…
Descriptors: Computer Simulation, Measurement Techniques, Athletics, Physical Fitness
Tugra Karademir Coskun; Ayfer Alper – Digital Education Review, 2024
This study aims to examine the potential differences between teacher evaluations and artificial intelligence (AI) tool-based assessment systems in university examinations. The research has evaluated a wide spectrum of exams including numerical and verbal course exams, exams with different assessment styles (project, test exam, traditional exam),…
Descriptors: Artificial Intelligence, Visual Aids, Video Technology, Tests
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
Anne Wicks; Robin Berkley – George W. Bush Institute, 2025
Assessments are one of the most important--and often misunderstood--elements of education. In most cases, tests are administered by the state as well as by districts and schools. Assessments at each of these levels have distinct purposes, yield different information, and are part of a powerful, coordinated approach to improving student outcomes.…
Descriptors: Student Evaluation, Testing, Tests, Standardized Tests
Christopher M. Claude – ProQuest LLC, 2024
This dissertation comprises three complementary studies that aim to advance the understanding and practice of Individualized Education Programs (IEP) and Present Levels of Academic Achievement and Functional Performance (PLAAFP) development in special education. In the first study, we systematically reviewed empirical research measuring IEP…
Descriptors: Individualized Education Programs, Academic Achievement, Special Education, Measurement
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction