Janika Saretzki; Rosalie Andrae; Boris Forthmann; Mathias Benedek – Journal of Creative Behavior, 2025
Divergent thinking (DT) ability is widely regarded as a central cognitive capacity underlying creativity, but its assessment is challenged by the fact that DT tasks yield a variable number of responses. Various approaches for the scoring of DT tasks have been proposed, which differ in how responses are evaluated and aggregated within a task. The…
Descriptors: Creative Thinking, Creativity Tests, Scoring, Metacognition
Camilo Vieira; Andrea Vásquez; Federico Meza; Roxana Quintero-Manes; Pedro Godoy – ACM Transactions on Computing Education, 2024
Currently, there is little evidence about how non-English-speaking students learn computer programming. For example, there are few validated assessment instruments to measure the development of programming skills, especially for the Spanish-speaking population. Having valid assessment instruments is essential to identify the difficulties of the…
Descriptors: Programming, Spanish Speaking, Translation, Test Validity
Sherwin E. Balbuena – Online Submission, 2024
This study introduces a new chi-square test statistic for testing the equality of response frequencies among distracters in multiple-choice tests. The formula uses the information from the number of correct answers and wrong answers, which becomes the basis of calculating the expected values of response frequencies per distracter. The method was…
Descriptors: Multiple Choice Tests, Statistics, Test Validity, Testing
Ali Orhan; Inan Tekin; Sedat Sen – International Journal of Assessment Tools in Education, 2025
In this study, it was aimed to translate and adapt the Computational Thinking Multidimensional Test (CTMT) developed by Kang et al. (2023) into Turkish and to investigate its psychometric qualities with Turkish university students. Following the translation procedures of the CTMT with 12 multiple-choice questions developed based on real-life…
Descriptors: Cognitive Tests, Thinking Skills, Computation, Test Validity
Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods
Krieglstein, Felix; Beege, Maik; Rey, Günter Daniel; Sanchez-Stockhammer, Christina; Schneider, Sascha – Educational Psychology Review, 2023
According to cognitive load theory, learning can only be successful when instructional materials and procedures are designed in accordance with human cognitive architecture. In this context, one of the biggest challenges is the accurate measurement of the different cognitive load types as these are associated with various activities during…
Descriptors: Test Construction, Test Validity, Questionnaires, Cognitive Processes
Hojung Kim; Changkyung Song; Jiyoung Kim; Hyeyun Jeong; Jisoo Park – Language Testing in Asia, 2024
This study presents a modified version of the Korean Elicited Imitation (EI) test, designed to resemble natural spoken language, and validates its reliability as a measure of proficiency. The study assesses the correlation between average test scores and Test of Proficiency in Korean (TOPIK) levels, examining score distributions among beginner,…
Descriptors: Korean, Test Validity, Test Reliability, Imitation
Jerin Kim; Kent McIntosh – Journal of Positive Behavior Interventions, 2025
We aimed to identify empirically valid cut scores on the positive behavioral interventions and supports (PBIS) Tiered Fidelity Inventory (TFI) through an expert panel process known as bookmarking. The TFI is a measurement tool to evaluate the fidelity of implementation of PBIS. In the bookmark method, experts reviewed all TFI items and item scores…
Descriptors: Positive Behavior Supports, Cutting Scores, Fidelity, Program Evaluation
E. B. Merki; S. I. Hofer; A. Vaterlaus; A. Lichtenberger – Physical Review Physics Education Research, 2025
When describing motion in physics, the selection of a frame of reference is crucial. The graph of a moving object can look quite different based on the frame of reference. In recent years, various tests have been developed to assess the interpretation of kinematic graphs, but none of these tests have specifically addressed differences in reference…
Descriptors: Graphs, Motion, Physics, Secondary School Students
Krieglstein, Felix; Beege, Maik; Rey, Günter Daniel; Ginns, Paul; Krell, Moritz; Schneider, Sascha – Educational Psychology Review, 2022
For more than three decades, cognitive load theory has been addressing learning from a cognitive perspective. Based on this instructional theory, design recommendations and principles have been derived to manage the load on working memory while learning. The increasing attention paid to cognitive load theory in educational science quickly…
Descriptors: Cognitive Processes, Difficulty Level, Learning Theories, Test Reliability
Aditya Shah; Ajay Devmane; Mehul Ranka; Prathamesh Churi – Education and Information Technologies, 2024
Online learning has grown due to the advancement of technology and flexibility. Online examinations measure students' knowledge and skills. Traditional question papers include inconsistent difficulty levels, arbitrary question allocations, and poor grading. The suggested model calibrates question paper difficulty based on student performance to…
Descriptors: Computer Assisted Testing, Difficulty Level, Grading, Test Construction
Ober, Teresa M.; Lu, Yikai; Blacklock, Chessley B.; Liu, Cheng; Cheng, Ying – Journal of Psychoeducational Assessment, 2023
We develop and validate a self-report measure of intrinsic and extrinsic cognitive load suitable for measuring the constructs in a variety of learning contexts. Data were collected from three independent samples of college students in the U.S. (N[subscript total]= 513; M[subscript age]= 21.13 years). Kane's (2013) framework was used to validate…
Descriptors: Test Construction, Test Validity, Cognitive Processes, Difficulty Level
Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2022
The possible dependency of criterion validity on item formulation in a multicomponent measuring instrument is examined. The discussion is concerned with evaluation of the differences in criterion validity between two or more groups (populations/subpopulations) that have been administered instruments with items having differently formulated item…
Descriptors: Test Items, Measures (Individuals), Test Validity, Difficulty Level
Martin Steinbach; Carolin Eitemüller; Marc Rodemer; Maik Walpuski – International Journal of Science Education, 2025
The intricate relationship between representational competence and content knowledge in organic chemistry has been widely debated, and the ways in which representations contribute to task difficulty, particularly in assessment, remain unclear. This paper presents a multiple-choice test instrument for assessing individuals' knowledge of fundamental…
Descriptors: Organic Chemistry, Difficulty Level, Multiple Choice Tests, Fundamental Concepts
Miller, Dan J.; Noble, Prisca; Medlen, Sue; Jones, Karina; Munns, Suzanne L. – Journal of Experimental Education, 2023
The cognitive load imposed by instruction is an important consideration for instructional designers. Theoretical models have traditionally divided total cognitive load into intrinsic, extrinsic, and germane load. The 10-item Cognitive Load Inventory (CLI-10) is designed to measure these three types of cognitive load. It is typically administered…
Descriptors: Psychometrics, Cognitive Processes, Difficulty Level, Factor Analysis

