Publication Date
| In 2026 | 0 |
| Since 2025 | 2142 |
| Since 2022 (last 5 years) | 12652 |
| Since 2017 (last 10 years) | 33777 |
| Since 2007 (last 20 years) | 68268 |
Descriptor
| Foreign Countries | 30502 |
| Test Validity | 21718 |
| Scores | 18245 |
| Academic Achievement | 16904 |
| Test Construction | 16724 |
| Test Reliability | 15006 |
| Achievement Tests | 14836 |
| Standardized Tests | 14707 |
| Comparative Analysis | 14429 |
| Elementary Secondary Education | 13033 |
| Language Tests | 12545 |
Audience
| Practitioners | 5033 |
| Teachers | 3390 |
| Researchers | 2630 |
| Policymakers | 1229 |
| Administrators | 976 |
| Students | 687 |
| Parents | 325 |
| Counselors | 216 |
| Community | 162 |
| Support Staff | 50 |
| Media Staff | 34 |
Location
| Turkey | 2813 |
| Australia | 2425 |
| Canada | 2269 |
| California | 1851 |
| United States | 1725 |
| Texas | 1613 |
| China | 1577 |
| United Kingdom | 1315 |
| Florida | 1312 |
| United Kingdom (England) | 1202 |
| Germany | 1120 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 121 |
| Meets WWC Standards with or without Reservations | 189 |
| Does not meet standards | 174 |
Dong, Yixiao; Dumas, Denis; Clements, Douglas H.; Day-Hess, Crystal A.; Sarama, Julie – Journal of Psychoeducational Assessment, 2023
Consequential validity (often referred to as "test fairness" in practice) is an essential aspect of educational measurement. This study evaluated the consequential validity of the Research-Based Early Mathematics Assessment (REMA). A sample of 627 children from PreK to second grade was collected using the short form of the REMA. We…
Descriptors: Mathematics Instruction, Mathematics Tests, Item Analysis, Test Items
Ayfer Sayin; Sabiha Bozdag; Mark J. Gierl – International Journal of Assessment Tools in Education, 2023
The purpose of this study is to generate non-verbal items for a visual reasoning test using template-based automatic item generation (AIG). The fundamental research method involved following the three stages of template-based AIG. An item from the 2016 4th-grade entrance exam of the Science and Art Center (known as BILSEM) was chosen as the…
Descriptors: Test Items, Test Format, Nonverbal Tests, Visual Measures
Walter M. Stroup; Anthony Petrosino; Corey Brady; Karen Duseau – North American Chapter of the International Group for the Psychology of Mathematics Education, 2023
Tests of statistical significance often play a decisive role in establishing the empirical warrant of evidence-based research in education. The results from pattern-based assessment items, as introduced in this paper, are categorical and multimodal and do not immediately support the use of measures of central tendency as typically related to…
Descriptors: Statistical Significance, Comparative Analysis, Research Methodology, Evaluation Methods
Scott E. Grapin; Courtney Plumley; Eric Banilower; Alycia J. Sterenberg Mahon; Laura Craven; Kristen Malzahn; Joan Pasley; Abigail Schwenger; Alison Haas; Okhee Lee – Science Education, 2025
The limited availability of research instruments that reflect the vision of the Next Generation Science Standards (NGSS) restricts the field's understanding of whether and how teachers are making instructional shifts called for by the standards. The need for such instruments is particularly urgent with teachers of multilingual learners (MLs), who…
Descriptors: Test Construction, Questionnaires, Teacher Attitudes, Beliefs
Jerin Kim; Kent McIntosh – Journal of Positive Behavior Interventions, 2025
We aimed to identify empirically valid cut scores on the positive behavioral interventions and supports (PBIS) Tiered Fidelity Inventory (TFI) through an expert panel process known as bookmarking. The TFI is a measurement tool to evaluate the fidelity of implementation of PBIS. In the bookmark method, experts reviewed all TFI items and item scores…
Descriptors: Positive Behavior Supports, Cutting Scores, Fidelity, Program Evaluation
Enrico Gandolfi; Richard E. Ferdig – Educational Technology Research and Development, 2025
Augmented Reality (AR) is increasingly being adopted in education to foster engagement and interest in a variety of subjects and content areas. However, there is a scarcity of instruments to measure the instructional impact of this innovation. This article addresses this gap in two unique ways. First, it presents validation results of the…
Descriptors: Simulated Environment, Measures (Individuals), Rating Scales, Item Response Theory
Jenae D. Thompson; Walter L. Frazier – Journal of Teaching and Learning, 2025
In this study, an instrument was developed to measure an instructor's value and incorporation of intersectionality theory in the classroom. Through a Delphi study, a list of items was devised, and then a pilot study was conducted to collect responses from 161 participants. The result is the development of the Intersectionality Pedagogy Scale, a…
Descriptors: Intersectionality, Measures (Individuals), Test Construction, Educational Practices
Hanif Akhtar; Retno Firdiyanti – Journal of Psychoeducational Assessment, 2025
Psychometric properties of the Scale of Positive and Negative Experience (SPANE) have been extensively evaluated in numerous countries, but not in Indonesia. This study investigated factor structure, reliability, measurement invariance, and validity of SPANE scores among a sample of Indonesian university students (N = 405). Multiple measurement…
Descriptors: Foreign Countries, Affective Measures, Psychometrics, Factor Structure
Feifei Wang; Alan C. K. Cheung; Ching Sing Chai; Jin Liu – Education and Information Technologies, 2025
Just as learners perceive interactivity when interacting with instructors or peer learners in traditional learning environments, they can perceive interactivity when interacting with artificial intelligence (AI) in AI-supported learning environments. Advancements in AI, such as generative AI including ChatGPT and…
Descriptors: Test Construction, Test Validity, Interaction, Artificial Intelligence
Nesreen Fathi Mahmoud; Zeinab Mohammed; Hassnaa Othman Mohammed; Alshimaa Mohsen Mohamed Lotfy – Journal of Autism and Developmental Disorders, 2025
Children with developmental disabilities have different feeding and swallowing problems. The purposes of the present study were to develop an Arabic version of the FHI-C and to evaluate its validity, consistency, and reliability in Arabic children with developmental disabilities for assessing how feeding and swallowing problems impair the…
Descriptors: Test Validity, Test Reliability, Media Adaptation, Young Children
Jaime B. Bunga – European Journal of Educational Management, 2025
This study explores the implementation of Differentiated Instruction (DI) in Philippine multigrade classrooms and develops a tool to assess teacher proficiency in DI. Employing an exploratory sequential mixed-method design, the qualitative phase included focus group discussions with eight multigrade teachers, capturing their experiences and…
Descriptors: Foreign Countries, Individualized Instruction, Multigraded Classes, Teacher Competencies
Anirudhan Badrinath; Zachary Pardos – Journal of Educational Data Mining, 2025
Bayesian Knowledge Tracing (BKT) is a well-established model for formative assessment, with optimization typically using expectation maximization, conjugate gradient descent, or brute force search. However, one of the flaws of existing optimization techniques for BKT models is convergence to undesirable local minima that negatively impact…
Descriptors: Bayesian Statistics, Intelligent Tutoring Systems, Problem Solving, Audience Response Systems
Santi Lestari – Research Matters, 2025
The ability to draw visual representations such as diagrams and graphs is considered fundamental to science learning. Science exams therefore often include questions which require students to draw a visual representation, or to augment a partially provided one. The design features of such questions (e.g., layout of diagrams, amount of answer…
Descriptors: Science Education, Secondary Education, Visual Aids, Foreign Countries
Li Wang; Xin Qi; Ziyan Meng; Meiyu Xiang; Zhuoqing Li; Sitong Zhang; Longyun Hu; Hoyee W. Hirai; Carol K. S. To; Patrick C. M. Wong – Journal of Speech, Language, and Hearing Research, 2025
Purpose: Assessing social communication and measuring its changes among young autistic children presents significant challenges, particularly when tracking intervention effects within short timeframes. Existing measures, mostly validated in Western contexts, may not be suitable for culturally diverse populations. Addressing this gap, the Social…
Descriptors: Autism Spectrum Disorders, Preschool Children, Interpersonal Communication, Communication Skills
Running out of Time: Leveraging Process Data to Identify Students Who May Benefit from Extended Time
Burhan Ogut; Ruhan Circi; Huade Huo; Juanita Hicks; Michelle Yin – International Electronic Journal of Elementary Education, 2025
This study explored the effectiveness of extended time (ET) accommodations in the 2017 NAEP Grade 8 Mathematics assessment to enhance educational equity. Analyzing NAEP process data through an XGBoost model, we examined if early interactions with assessment items could predict students' likelihood of requiring ET by identifying those who received…
Descriptors: Identification, Testing Accommodations, National Competency Tests, Equal Education

Peer reviewed
