Publication Date
| In 2026 | 0 |
| Since 2025 | 2142 |
| Since 2022 (last 5 years) | 12652 |
| Since 2017 (last 10 years) | 33777 |
| Since 2007 (last 20 years) | 68268 |
Descriptor
| Foreign Countries | 30502 |
| Test Validity | 21718 |
| Scores | 18245 |
| Academic Achievement | 16904 |
| Test Construction | 16724 |
| Test Reliability | 15006 |
| Achievement Tests | 14836 |
| Standardized Tests | 14707 |
| Comparative Analysis | 14429 |
| Elementary Secondary Education | 13033 |
| Language Tests | 12545 |
Audience
| Practitioners | 5033 |
| Teachers | 3390 |
| Researchers | 2630 |
| Policymakers | 1229 |
| Administrators | 976 |
| Students | 687 |
| Parents | 325 |
| Counselors | 216 |
| Community | 162 |
| Support Staff | 50 |
| Media Staff | 34 |
Location
| Turkey | 2813 |
| Australia | 2425 |
| Canada | 2269 |
| California | 1851 |
| United States | 1725 |
| Texas | 1613 |
| China | 1577 |
| United Kingdom | 1315 |
| Florida | 1312 |
| United Kingdom (England) | 1202 |
| Germany | 1120 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 121 |
| Meets WWC Standards with or without Reservations | 189 |
| Does not meet standards | 174 |
Achmad Rante Suparman; Eli Rohaeti; Sri Wening – Journal on Efficiency and Responsibility in Education and Science, 2024
This study focuses on developing a five-tier chemical diagnostic test based on a computer-based test with 11 assessment categories with an assessment score from 0 to 10. A total of 20 items produced were validated by education experts, material experts, measurement experts, and media experts, and an average index of the Aiken test > 0.70 was…
Descriptors: Chemistry, Diagnostic Tests, Computer Assisted Testing, Credits
Lawrence T. DeCarlo – Educational and Psychological Measurement, 2024
A psychological framework for different types of items commonly used with mixed-format exams is proposed. A choice model based on signal detection theory (SDT) is used for multiple-choice (MC) items, whereas an item response theory (IRT) model is used for open-ended (OE) items. The SDT and IRT models are shown to share a common conceptualization…
Descriptors: Test Format, Multiple Choice Tests, Item Response Theory, Models
Zhiqiang Yang; Chengyuan Yu – Asia Pacific Education Review, 2025
This study investigated the test fairness of the translation section of a large-scale English test in China by examining its Differential Test Functioning (DTF) and Differential Item Functioning (DIF) across gender and major. Regarding DTF, the entire translation section exhibits partial strong measurement invariance across female and male…
Descriptors: Multiple Choice Tests, Test Items, Scoring, Translation
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
Yue Zhou; Yongcan Liu – International Journal of Bilingual Education and Bilingualism, 2025
This paper reports on the process of developing an original framework for conceptualising and measuring language learner well-being within the context of heritage language (HL) learning. Drawing on a quantitative validation study with 545 young Chinese heritage language (CHL) learners, aged 7-18, in the UK, this paper presents an empirically…
Descriptors: Bilingualism, Language Tests, Test Construction, Language Acquisition
Apichat Khamboonruang – Language Testing in Asia, 2025
Chulalongkorn University Language Institute (CULI) test was developed as a local standardised test of English for professional and international communication. To ensure that the CULI test fulfils its intended purposes, this study employed Kane's argument-based validation and Rasch measurement approaches to construct the validity argument for the…
Descriptors: Universities, Second Language Learning, Second Language Instruction, Language Tests
Jeff Allen; Jay Thomas; Stacy Dreyer; Scott Johanningmeier; Dana Murano; Ty Cruce; Xin Li; Edgar Sanchez – ACT Education Corp., 2025
This report describes the process of developing and validating the enhanced ACT. The report describes the changes made to the test content and the processes by which these design decisions were implemented. The authors describe how they shared the overall scope of the enhancements, including the initial blueprints, with external expert panels,…
Descriptors: College Entrance Examinations, Testing, Change, Test Construction
Nese Öztürk Gübes – International Journal of Assessment Tools in Education, 2025
The Trends in International Mathematics and Science Study (TIMSS) was administered via computer, eTIMSS, for the first time in 2019. The purpose of this study was to investigate item block position and item format effect on eighth grade mathematics item easiness in low- and high-achieving countries of eTIMSS 2019. Item responses from Chile, Qatar,…
Descriptors: Foreign Countries, International Assessment, Achievement Tests, Mathematics Achievement
R. Ramadhani; Eni Nuraeni; Widi Purwianingsih – Journal of Biological Education Indonesia (Jurnal Pendidikan Biologi Indonesia), 2025
Numeracy literacy is a crucial skill for understanding, using, and communicating effectively with numbers, facts, and mathematical procedures in various real-world situations. The development of numeracy literacy is crucial because it is one of the essential prerequisites for life skills in the 21st century. This study aims to develop a valid…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Andreas Frey; Christoph König; Aron Fink – Journal of Educational Measurement, 2025
The highly adaptive testing (HAT) design is introduced as an alternative test design for the Programme for International Student Assessment (PISA). The principle of HAT is to be as adaptive as possible when selecting items while accounting for PISA's nonstatistical constraints and addressing issues concerning PISA such as item position effects.…
Descriptors: Adaptive Testing, Test Construction, Alternative Assessment, Achievement Tests
Xiuxiu Tang; Yi Zheng; Tong Wu; Kit-Tai Hau; Hua-Hua Chang – Journal of Educational Measurement, 2025
Multistage adaptive testing (MST) has been recently adopted for international large-scale assessments such as Programme for International Student Assessment (PISA). MST offers improved measurement efficiency over traditional nonadaptive tests and improved practical convenience over single-item-adaptive computerized adaptive testing (CAT). As a…
Descriptors: Reaction Time, Test Items, Achievement Tests, Foreign Countries
Mehmet Emin Ören; Servet Atik – International Journal of Assessment Tools in Education, 2025
In this study, it was aimed to adapt the DigiFuehr 2.0 Scale developed by Claassen et al. (2023) to Turkish and to conduct validity and reliability studies on three groups of participants consisting of teachers. In the study, exploratory and confirmatory factor analyses were performed in line with translation study, linguistic application, and…
Descriptors: Test Reliability, Test Validity, Test Construction, Translation
Joseph F. Mirabelli; Eileen M. Johnson; Sara R. Vohra; Jeanne L. Sanders; Karin J. Jensen – International Journal of STEM Education, 2025
Background: Undergraduate engineering students report increased rates of mental health distress. Evidence suggests that these students experience high stress, which can perpetuate mental health challenges. Further, engineering students may engage in help-seeking and self-care activities more rarely than students in other disciplines. We…
Descriptors: Undergraduate Students, Engineering Education, Mental Health, Stress Variables
Norton Kitanishi; Daniela Bordini; Marcos V. V. Ribeiro; Cristiane Silvestre Paula; Helena Brentani; Joana Portelese; Pamela J. Surkan; Silvia S. Martins; Jair de Jesus Mari; Paola Matiko Martins Okuda; Sheila C. Caetano – Autism: The International Journal of Research and Practice, 2025
Early identification of autism spectrum disorder through cost-effective screening is crucial in low- and middle-income countries. The Child Behavior Checklist 1.5-5, using the Autism Spectrum Problems and Withdrawn Syndrome subscales, has potential as a level 1 autism spectrum disorder screening tool, though its construct validity in low- and…
Descriptors: Autism Spectrum Disorders, Disability Identification, Screening Tests, Test Validity
Jiawei Xiong; George Engelhard; Allan S. Cohen – Measurement: Interdisciplinary Research and Perspectives, 2025
It is common to find mixed-format data results from the use of both multiple-choice (MC) and constructed-response (CR) questions on assessments. Dealing with these mixed response types involves understanding what the assessment is measuring, and the use of suitable measurement models to estimate latent abilities. Past research in educational…
Descriptors: Responses, Test Items, Test Format, Grade 8

