Publication Date
| In 2026 | 0 |
| Since 2025 | 200 |
| Since 2022 (last 5 years) | 1070 |
| Since 2017 (last 10 years) | 2580 |
| Since 2007 (last 20 years) | 4941 |
Audience
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Turkey | 225 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 65 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Hongfei Ye; Jian Xu; Danqing Huang; Meng Xie; Jinming Guo; Junrui Yang; Haiwei Bao; Mingzhi Zhang; Ce Zheng – Discover Education, 2025
This study evaluates the performance of large language models (LLMs) on the Chinese Postgraduate Medical Entrance Examination (CPGMEE), as well as the hallucinations LLMs produce, and investigates their implications for medical education. We curated 10 trials of mock CPGMEE to evaluate the performance of 4 LLMs (GPT-4.0, ChatGPT, QWen 2.1 and Ernie 4.0).…
Descriptors: College Entrance Examinations, Foreign Countries, Computational Linguistics, Graduate Medical Education
Herbert Kalthoff; Fabian Koelsch – British Journal of Sociology of Education, 2025
University examinations categorise students according to their individual achievements determined by teaching staff. This procedure serves the elicitation and certification of student knowledge and thus reproduces academic hierarchies. Drawing on empirical evidence from ethnographic fieldwork in Engineering and History departments, this article…
Descriptors: College Students, Student Evaluation, Testing, History Instruction
Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025
Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet the evaluation of item quality remains a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large language models, specifically Llama 3-8B, for…
Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation
Apichat Khamboonruang – Language Testing in Asia, 2025
The Chulalongkorn University Language Institute (CULI) test was developed as a local standardised test of English for professional and international communication. To ensure that the CULI test fulfils its intended purposes, this study employed Kane's argument-based validation and Rasch measurement approaches to construct the validity argument for the…
Descriptors: Universities, Second Language Learning, Second Language Instruction, Language Tests
Ikmanisa Khairati; L. Lufri; Muhyiatul Fadilah – Journal of Biological Education Indonesia (Jurnal Pendidikan Biologi Indonesia), 2025
Education for Sustainable Development (ESD) serves as a key accelerator for achieving the Sustainable Development Goals (SDGs), emphasizing systems thinking as an essential competency that must be cultivated in the learning process. This study investigates students' systems thinking skills within the ESD framework through assessments on…
Descriptors: Systems Approach, Thinking Skills, Sustainable Development, Biology
Patricia Hadler – Sociological Methods & Research, 2025
Probes are follow-ups to survey questions used to gain insights on respondents' understanding of and responses to these questions. They are usually administered as open-ended questions, primarily in the context of questionnaire pretesting. Due to the decreased cost of data collection for open-ended questions in web surveys, researchers have argued…
Descriptors: Online Surveys, Discovery Processes, Test Items, Data Collection
Jeff Allen; Jay Thomas; Stacy Dreyer; Scott Johanningmeier; Dana Murano; Ty Cruce; Xin Li; Edgar Sanchez – ACT Education Corp., 2025
This report describes the process of developing and validating the enhanced ACT. The report describes the changes made to the test content and the processes by which these design decisions were implemented. The authors describe how they shared the overall scope of the enhancements, including the initial blueprints, with external expert panels,…
Descriptors: College Entrance Examinations, Testing, Change, Test Construction
Albert M. Jimenez; Nicholas Clegorne; Sheryl Croft; David G. Buckman – Educational Planning, 2025
This quantitative study was designed to determine whether the use of graphical aids in standardized mathematics testing is effective in narrowing the achievement gap between middle-grade English Language Learner (ELL) students and their non-ELL counterparts. The data used for this study include data from 2,659 students and come…
Descriptors: Middle School Students, Mathematics Instruction, Mathematics Achievement, English Learners
José Ventura-León; Cristopher Lino-Cruz; Shirley Tocto-Muñoz; Andy Rick Sánchez-Villena – Journal of Psychoeducational Assessment, 2025
Academic and occupational success requires social intelligence, the ability to comprehend and manage interpersonal connections. This research aims to assess and improve the Tromsø Social Intelligence Scale (TSIS) for Peruvian university students, focusing on cultural adaptability, reliability, and validity. Participants included 973 university…
Descriptors: Factor Analysis, Intelligence Tests, Test Items, Test Length
Nese Öztürk Gübes – International Journal of Assessment Tools in Education, 2025
The Trends in International Mathematics and Science Study (TIMSS) was administered via computer, as eTIMSS, for the first time in 2019. The purpose of this study was to investigate the effects of item block position and item format on eighth-grade mathematics item easiness in low- and high-achieving countries of eTIMSS 2019. Item responses from Chile, Qatar,…
Descriptors: Foreign Countries, International Assessment, Achievement Tests, Mathematics Achievement
Andreas Frey; Christoph König; Aron Fink – Journal of Educational Measurement, 2025
The highly adaptive testing (HAT) design is introduced as an alternative test design for the Programme for International Student Assessment (PISA). The principle of HAT is to be as adaptive as possible when selecting items while accounting for PISA's nonstatistical constraints and addressing issues concerning PISA such as item position effects.…
Descriptors: Adaptive Testing, Test Construction, Alternative Assessment, Achievement Tests
Hyo Jeong Shin; Christoph König; Frederic Robin; Andreas Frey; Kentaro Yamamoto – Journal of Educational Measurement, 2025
Many international large-scale assessments (ILSAs) have switched to multistage adaptive testing (MST) designs to improve efficiency in measuring the skills of heterogeneous populations around the world. In this context, previous literature has reported an acceptable level of model parameter recovery under MST designs when the…
Descriptors: Robustness (Statistics), Item Response Theory, Adaptive Testing, Test Construction
Xiuxiu Tang; Yi Zheng; Tong Wu; Kit-Tai Hau; Hua-Hua Chang – Journal of Educational Measurement, 2025
Multistage adaptive testing (MST) has recently been adopted for international large-scale assessments such as the Programme for International Student Assessment (PISA). MST offers improved measurement efficiency over traditional nonadaptive tests and improved practical convenience over single-item-adaptive computerized adaptive testing (CAT). As a…
Descriptors: Reaction Time, Test Items, Achievement Tests, Foreign Countries
Venessa F. Manna; Shuhong Li; Spiros Papageorgiou; Lixiong Gu – ETS Research Report Series, 2025
This technical manual describes the purpose and intended uses of the TOEFL iBT test, its target test-taker population, and relevant language use domains. The test design and scoring procedures are presented first, followed by a research agenda intended to support the interpretation and use of test scores. Given the updates to the test starting…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Test Construction
Militsa G. Ivanova; Hanna Eklöf; Michalis P. Michaelides – Journal of Applied Testing Technology, 2025
Digital administration of assessments allows for the collection of process data indices, such as response time, which can serve as indicators of rapid-guessing and examinee test-taking effort. Setting a time threshold is essential to distinguish effortful from effortless behavior using item response times. Threshold identification methods may…
Descriptors: Test Items, Computer Assisted Testing, Reaction Time, Achievement Tests

