Steven Holtzman; Jonathan Steinberg; Jonathan Weeks; Christopher Robertson; Jessica Findley; David Klieger – ETS Research Report Series, 2024
At a time when institutions of higher education are exploring alternatives to traditional admissions testing, institutions are also seeking to better support students and prepare them for academic success. Under such an engaged model, one may seek to measure not just the accumulated knowledge and skills that students would bring to a new academic…
Descriptors: Law Schools, College Applicants, Legal Education (Professions), College Entrance Examinations
Alexandra Jackson; Elise Barrella; Cheryl Bodnar – Journal of Engineering Education, 2024
Background: Concept maps are a valid assessment tool to explore student understanding of diverse topics. Many types of academic programs have integrated concept mapping into their courses, resulting in various activities and scoring methods to understand student perceptions. Purpose: Few prior reviews of concept mapping have addressed their use…
Descriptors: Engineering Education, Concept Mapping, Scoring Rubrics, Evaluation Methods
Michael D. Wray; Matthew R. Reynolds – Journal of Psychoeducational Assessment, 2025
The KeyMath-3 Diagnostic Assessment (KM-3) is an individually-administered math assessment used in educational placement and diagnostic decisions. It includes 10 subtests making up Basic Concepts, Operations, and Applications indexes and a "Total Test" composite that measures overall math ability. Here, covariances among subtests from…
Descriptors: Diagnostic Tests, Mathematics Tests, Arithmetic, Factor Analysis
National Institute for Excellence in Teaching, 2023
Aspiring teachers must develop an in-depth understanding of high-quality instructional practices. In order to prepare, instruct, and coach aspiring teachers, the National Institute for Excellence in Teaching (NIET) has developed the NIET Aspiring Teacher Rubric (ATR) based on principles of excellence in instruction. This research brief…
Descriptors: Scoring Rubrics, Preservice Teachers, Test Construction, Test Validity
Großmann, Leroy; Krüger, Dirk – Science Education, 2024
Lesson planning is a core part of teachers' professional competence. Written lesson plans play a significant role in science teacher education as a preparation for demonstration lessons during the final teacher certification exam. However, the few existing scoring rubrics on lesson plans are not particularly theoretically sound and are barely…
Descriptors: Science Instruction, Lesson Plans, Planning, Scoring Rubrics
Aloisi, Cesare – European Journal of Education, 2023
This article considers the challenges of using artificial intelligence (AI) and machine learning (ML) to assist high-stakes standardised assessment. It focuses on the detrimental effect that even state-of-the-art AI and ML systems could have on the validity of national exams of secondary education, and how lower validity would negatively affect…
Descriptors: Standardized Tests, Test Validity, Credibility, Algorithms
Marcos Jiménez; María Zapata-Cáceres; Marcos Román-González; Gregorio Robles; Jesús Moreno-León; Estefanía Martín-Barroso – Journal of Science Education and Technology, 2024
Computational thinking (CT) is a multidimensional term that encompasses a wide variety of problem-solving skills related to the field of computer science. Unfortunately, standardized, valid, and reliable methods to assess CT skills in preschool children are lacking, compromising the reliability of the results reported in CT interventions. To…
Descriptors: Computation, Thinking Skills, Student Evaluation, Preschool Children
Huawei, Shi; Aryadoust, Vahid – Education and Information Technologies, 2023
Automated writing evaluation (AWE) systems are developed based on interdisciplinary research and technological advances such as natural language processing, computer sciences, and latent semantic analysis. Despite a steady increase in research publications in this area, the results of AWE investigations are often mixed, and their validity may be…
Descriptors: Writing Evaluation, Writing Tests, Computer Assisted Testing, Automation
Tiffany Wu; Christina Weiland; Meghan McCormick; JoAnn Hsueh; Catherine Snow; Jason Sachs – Grantee Submission, 2024
The Hearts and Flowers (H&F) task is a computerized executive functioning (EF) assessment that has been used to measure EF from early childhood to adulthood. It provides data on accuracy and reaction time (RT) across three different task blocks (hearts, flowers, and mixed). However, there is a lack of consensus in the field on how to score the…
Descriptors: Scoring, Executive Function, Kindergarten, Young Children
Susan K. Johnsen – Gifted Child Today, 2024
The author provides a checklist for educators who are selecting technically adequate tests for identifying and referring students for gifted education services and programs. The checklist includes questions related to how the test was normed, reliability and validity studies as well as questions related to types of scores, administration, and…
Descriptors: Test Selection, Academically Gifted, Gifted Education, Test Validity
Marcelo Fernando Rauber; Christiane Gresse von Wangenheim; Pedro Alberto Barbetta; Adriano Ferreti Borgatto; Ramon Mayor Martins; Jean Carlo Rossa Hauck – Informatics in Education, 2024
The insertion of Machine Learning (ML) in everyday life demonstrates the importance of popularizing an understanding of ML already in school. Accompanying this trend arises the need to assess the students' learning. Yet, so far, few assessments have been proposed, most lacking an evaluation. Therefore, we evaluate the reliability and validity of…
Descriptors: Artificial Intelligence, Measures (Individuals), Test Reliability, Test Validity
Han, Chao – Language Testing, 2022
Over the past decade, testing and assessing spoken-language interpreting has garnered an increasing amount of attention from stakeholders in interpreter education, professional certification, and interpreting research. This is because in these fields assessment results provide a critical evidential basis for high-stakes decisions, such as the…
Descriptors: Translation, Language Tests, Testing, Evaluation Methods
Reuben S. Asempapa; Doris Lee – Discover Education, 2025
Across the world, standards and practices for preparing teachers of mathematics emphasize the importance of math modeling (MM) in developing students' mathematical thinking. The aim of this research study was to develop the Mathematical Modeling Knowledge Scale (MAMKS), capable of determining preservice teachers' (PSTs') knowledge of MM. The study…
Descriptors: Preservice Teachers, Preservice Teacher Education, Mathematics Education, Mathematics Curriculum
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
Darling-Aduana, Jennifer – Educational Technology Research and Development, 2021
Researchers tout digital learning as a tool that can increase the authenticity of student learning and assessment tasks but lack a psychometrically valid instrument to test this hypothesis. Further, there are several complementary definitions of authentic work, versus a single agreed-upon definition, presented in academic literature. I synthesized…
Descriptors: Test Construction, Test Validity, Authentic Learning, Online Courses