Publication Date
| In 2026 | 0 |
| Since 2025 | 2142 |
| Since 2022 (last 5 years) | 12652 |
| Since 2017 (last 10 years) | 33777 |
| Since 2007 (last 20 years) | 68268 |
Descriptor
| Foreign Countries | 30502 |
| Test Validity | 21718 |
| Scores | 18245 |
| Academic Achievement | 16904 |
| Test Construction | 16724 |
| Test Reliability | 15006 |
| Achievement Tests | 14836 |
| Standardized Tests | 14707 |
| Comparative Analysis | 14429 |
| Elementary Secondary Education | 13033 |
| Language Tests | 12545 |
Audience
| Practitioners | 5033 |
| Teachers | 3390 |
| Researchers | 2630 |
| Policymakers | 1229 |
| Administrators | 976 |
| Students | 687 |
| Parents | 325 |
| Counselors | 216 |
| Community | 162 |
| Support Staff | 50 |
| Media Staff | 34 |
Location
| Turkey | 2813 |
| Australia | 2425 |
| Canada | 2269 |
| California | 1851 |
| United States | 1725 |
| Texas | 1613 |
| China | 1577 |
| United Kingdom | 1315 |
| Florida | 1312 |
| United Kingdom (England) | 1202 |
| Germany | 1120 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 121 |
| Meets WWC Standards with or without Reservations | 189 |
| Does not meet standards | 174 |
Musa Adekunle Ayanwale; Mdutshekelwa Ndlovu – Journal of Pedagogical Research, 2024
The COVID-19 pandemic has had a significant impact on high-stakes testing, including the national benchmark tests in South Africa. Current linear testing formats have been criticized for their limitations, leading to a shift towards Computerized Adaptive Testing [CAT]. Assessments with CAT are more precise and take less time. Evaluation of CAT…
Descriptors: Adaptive Testing, Benchmarking, National Competency Tests, Computer Assisted Testing
Walter Araya Garita; José Alejandro Fallas Godínez – Research in Pedagogy, 2024
The English Diagnostic Test aims to assess reading comprehension skills for first-year students at the University of Costa Rica. In 2022, this test consisted of four instruments with 55 items. Instruments were based on an academic reading, following the criteria B2+ or C1 level according to Common European Framework of Reference for languages.…
Descriptors: Foreign Countries, Language Tests, Diagnostic Tests, Educational Diagnosis
Shear, Benjamin R. – Journal of Educational Measurement, 2023
Large-scale standardized tests are regularly used to measure student achievement overall and for student subgroups. These uses assume tests provide comparable measures of outcomes across student subgroups, but prior research suggests score comparisons across gender groups may be complicated by the type of test items used. This paper presents…
Descriptors: Gender Bias, Item Analysis, Test Items, Achievement Tests
Muh. Fitrah; Anastasia Sofroniou; Ofianto; Loso Judijanto; Widihastuti – Journal of Education and e-Learning Research, 2024
This research uses Rasch model analysis to identify the reliability and separation index of an integrated mathematics test instrument with a cultural architecture structure in measuring students' mathematical thinking abilities. The study involved 357 students from six eighth-grade public junior high schools in Bima. The selection of schools was…
Descriptors: Mathematics Tests, Item Response Theory, Test Reliability, Indexes
Christopher L. Payten; Kelly A. Weir; Catherine J. Madill – International Journal of Language & Communication Disorders, 2024
Background: Published best-practice guidelines and standardized protocols for voice assessment recommend multidisciplinary evaluation utilizing a comprehensive range of clinical measures. Previous studies report variations in assessment practices when compared with these guidelines. Aims: To provide an up-to-date evaluation of current global…
Descriptors: Voice Disorders, Speech Language Pathology, Allied Health Personnel, Auditory Tests
Belzak, William C. M. – Educational Measurement: Issues and Practice, 2023
Test developers and psychometricians have historically examined measurement bias and differential item functioning (DIF) across a single categorical variable (e.g., gender), independently of other variables (e.g., race, age, etc.). This is problematic when more complex forms of measurement bias may adversely affect test responses and, ultimately,…
Descriptors: Test Bias, High Stakes Tests, Artificial Intelligence, Test Items
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2023
Traditional estimators of reliability such as coefficients alpha, theta, omega, and rho (maximal reliability) are prone to give radical underestimates of reliability for the tests common when testing educational achievement. These tests are often structured by widely deviating item difficulties. This is a typical pattern where the traditional…
Descriptors: Test Reliability, Achievement Tests, Computation, Test Items
Sasima Charubusp; Orawan Wangsombat; Napatacha Sriwichai; Chanida Phongnapharuk – PASAA: Journal of Language Teaching and Learning in Thailand, 2025
Washback refers to the impact of a test on instruction and learning, with high-stakes tests exerting both positive and negative effects. This study examined the washback of an English exit exam (EEE) on English language learning at a Thai university where English-medium instruction is used in most academic disciplines. The EEE is an in-house…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Language Tests
André A. Rupp; Laura Pinsonneault – National Center for the Improvement of Educational Assessment, 2025
State education agencies are sitting on rich repositories of quantitative and qualitative assessment data. This document is designed to provide a conceptual framework and implementation guidance that can help agency leadership leverage and interrogate student performance data in systematic ways for reporting, outreach, and planning purposes. The…
Descriptors: Evaluation Methods, Educational Assessment, Achievement Tests, College Entrance Examinations
Mohammad Nayef Ayasrah; Mohamad Ahmad Saleem Khasawneh; Mazen Omar Almulla; Amoura Hassan Aboutaleb – Journal of Computer Assisted Learning, 2025
Background: One area that has been dramatically changed by artificial intelligence (AI) is educational environments. Chatbots, Recommender Systems, Adaptive Learning Systems and Large Language Models have been emerging as practical tools for facilitating learning. However, using such tools appropriately is challenging. In this regard, the…
Descriptors: Test Construction, Test Validity, Test Reliability, Rating Scales
Kartianom Kartianom; Heri Retnawati; Kana Hidayati – Journal of Pedagogical Research, 2024
Conducting a fair test is important for educational research. Unfair assessments can lead to gender disparities in academic achievement, ultimately resulting in disparities in opportunities, wages, and career choice. Differential Item Functioning [DIF] analysis is presented to provide evidence of whether the test is truly fair, where it does not harm…
Descriptors: Foreign Countries, Test Bias, Item Response Theory, Test Theory
Viola Merhof; Caroline M. Böhm; Thorsten Meiser – Educational and Psychological Measurement, 2024
Item response tree (IRTree) models are a flexible framework to control self-reported trait measurements for response styles. To this end, IRTree models decompose the responses to rating items into sub-decisions, which are assumed to be made on the basis of either the trait being measured or a response style, whereby the effects of such person…
Descriptors: Item Response Theory, Test Interpretation, Test Reliability, Test Validity
Do-Hong Kim; Chuang Wang; Thi Nhu Ngoc Truong – Language Teaching Research, 2024
Researchers and practitioners in the field of second language acquisition have come to realize the importance of non-cognitive skills such as self-efficacy and self-regulation in students' learning of a second language. However, there has been limited systematic research on such measures in the second language context and the validity and…
Descriptors: Psychometrics, Test Content, Self Efficacy, English Language Learners
Marilena Z. Leana-Tascilar – Cogent Education, 2024
This study aimed to develop a comprehensive tool to assess underachievement in gifted students, incorporating input from parents, teachers, and students themselves. A total of 285 participants, including 95 gifted students, their parents, and teachers, were involved in the study. The results have revealed a four-factor structure for the Gifted…
Descriptors: Psychometrics, Academic Achievement, Underachievement, Academically Gifted
Sean N. Weeks; Tyler L. Renshaw; Allysia A. Rainey; Aubrey Hiatt – Journal of Emotional and Behavioral Disorders, 2024
Internalizing and externalizing problems are common targets for school mental health screening. Prior research supports the interpretation of scores from the Youth Internalizing Problems Screener (YIPS) and the Youth Externalizing Problems Screener (YEPS), which were developed separately yet intended as companion measures. We extended previous…
Descriptors: Adolescents, Screening Tests, Behavior Problems, Mental Health

