Publication Date
In 2025 | 2 |
Since 2024 | 9 |
Since 2021 (last 5 years) | 41 |
Since 2016 (last 10 years) | 101 |
Since 2006 (last 20 years) | 213 |
Audience
Researchers | 16 |
Practitioners | 1 |
Location
Turkey | 8 |
United Kingdom | 7 |
Massachusetts | 6 |
Netherlands | 6 |
Australia | 5 |
China | 5 |
Canada | 4 |
Germany | 4 |
Indonesia | 4 |
Malaysia | 4 |
New York | 4 |
Laws, Policies, & Programs
Elementary and Secondary… | 2 |
Beyza Aksu Dunya; Stefanie Wind – International Journal of Testing, 2025
We explored the practicality of relatively small item pools in the context of low-stakes Computer-Adaptive Testing (CAT), such as CAT procedures that might be used for quick diagnostic or screening exams. We used a basic CAT algorithm without content balancing and exposure control restrictions to reflect low stakes testing scenarios. We examined…
Descriptors: Item Banks, Adaptive Testing, Computer Assisted Testing, Achievement
Pan, Yiqin; Livne, Oren; Wollack, James A.; Sinharay, Sandip – Educational Measurement: Issues and Practice, 2023
In computerized adaptive testing, overexposure of items in the bank is a serious problem and might result in item compromise. We develop an item selection algorithm that utilizes the entire bank well and reduces the overexposure of items. The algorithm is based on collaborative filtering and selects an item in two stages. In the first stage, a set…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Algorithms
Yunting Liu; Shreya Bhandari; Zachary A. Pardos – British Journal of Educational Technology, 2025
Effective educational measurement relies heavily on the curation of well-designed item pools. However, item calibration is time consuming and costly, requiring a sufficient number of respondents to estimate the psychometric properties of items. In this study, we explore the potential of six different large language models (LLMs; GPT-3.5, GPT-4,…
Descriptors: Artificial Intelligence, Test Items, Psychometrics, Educational Assessment
Eren Can Aybek; Serkan Arikan; Günes Ertas – International Journal of Assessment Tools in Education, 2024
When it is required to estimate item parameters of a large item bank, Multiple Matrix Sampling (MMS) design provides an efficient way while minimizing the test burden on students. The current study exemplifies how to calibrate a large item pool using MMS design for various purposes, such as developing a CAT administration. The purpose of the…
Descriptors: Elementary School Mathematics, Elementary School Students, Grade 4, Item Banks
Ayfer Sayin; Mark J. Gierl – International Journal of Assessment Tools in Education, 2023
Developments in the field of education have significantly affected test development processes, and computer-based test applications have been started in many institutions. In our country, research on the application of measurement and evaluation tools in the computer environment for use with distance education is gaining momentum. A large pool of…
Descriptors: Turkish, Literature, Test Items, Item Banks
Hwanggyu Lim; Kyung T. Han – Educational Measurement: Issues and Practice, 2024
Computerized adaptive testing (CAT) has gained deserved popularity in the administration of educational and professional assessments, but continues to face test security challenges. To ensure sustained quality assurance and testing integrity, it is imperative to establish and maintain multiple stable item pools that are consistent in terms of…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Item Banks
Gys-Walt Van Egdom; Iris Schrijver; Heidi Verplaetse; Winibert Segers – Interpreter and Translator Trainer, 2024
This article explores the impact of collaboration on target text quality in translator training. By comparing team translations with those by individual peers, and analysing the highest and lowest scoring teams, the authors aimed to understand the impact of collaboration on quality. The comparison indicates that translations in a skills lab…
Descriptors: Foreign Countries, College Students, Translation, Cooperative Learning
Ersen, Rabia Karatoprak; Lee, Won-Chan – Journal of Educational Measurement, 2023
The purpose of this study was to compare calibration and linking methods for placing pretest item parameter estimates on the item pool scale in a 1-3 computerized multistage adaptive testing design in terms of item parameter recovery. Two models were used: embedded-section, in which pretest items were administered within a separate module, and…
Descriptors: Pretesting, Test Items, Computer Assisted Testing, Adaptive Testing
Development of a High-Accuracy and Effective Online Calibration Method in CD-CAT Based on Gini Index
Tan, Qingrong; Cai, Yan; Luo, Fen; Tu, Dongbo – Journal of Educational and Behavioral Statistics, 2023
To improve the calibration accuracy and calibration efficiency of cognitive diagnostic computerized adaptive testing (CD-CAT) for new items and, ultimately, contribute to the widespread application of CD-CAT in practice, the current article proposed a Gini-based online calibration method that can simultaneously calibrate the Q-matrix and item…
Descriptors: Cognitive Tests, Computer Assisted Testing, Adaptive Testing, Accuracy
Bayesian Logistic Regression: A New Method to Calibrate Pretest Items in Multistage Adaptive Testing
TsungHan Ho – Applied Measurement in Education, 2023
An operational multistage adaptive test (MST) requires the development of a large item bank and the effort to continuously replenish the item bank due to concerns about test security and validity over the long term. New items should be pretested and linked to the item bank before being used operationally. The linking item volume fluctuations in…
Descriptors: Bayesian Statistics, Regression (Statistics), Test Items, Pretesting
Anela Hrnjicic; Adis Alihodžic – International Electronic Journal of Mathematics Education, 2024
Understanding the concepts related to real function is essential in learning mathematics. To determine how students understand these concepts, it is necessary to have an appropriate measurement tool. In this paper, we have created a web application using 32 items from conceptual understanding of real functions (CURF) item bank. We conducted a…
Descriptors: Mathematical Concepts, College Freshmen, Foreign Countries, Computer Assisted Testing
Demir, Seda – Journal of Educational Technology and Online Learning, 2022
The purpose of this research was to evaluate the effect of item pool and selection algorithms on computerized classification testing (CCT) performance in terms of some classification evaluation metrics. For this purpose, 1000 examinees' response patterns using the R package were generated and eight item pools with 150, 300, 450, and 600 items…
Descriptors: Test Items, Item Banks, Mathematics, Computer Assisted Testing
Stefanie A. Wind; Beyza Aksu-Dunya – Applied Measurement in Education, 2024
Careless responding is a pervasive concern in research using affective surveys. Although researchers have considered various methods for identifying careless responses, studies are limited that consider the utility of these methods in the context of computer adaptive testing (CAT) for affective scales. Using a simulation study informed by recent…
Descriptors: Response Style (Tests), Computer Assisted Testing, Adaptive Testing, Affective Measures
van Sluis, Klaske E.; Passchier, Ellen; van Son, Rob J. J. H.; van der Molen, Lisette; Stuiver, Martijn; van den Brekel, Michiel W. M.; Van den Steen, Leen; Kalf, Johanna G.; van Nuffelen, Gwen – International Journal of Language & Communication Disorders, 2023
Background: Several conditions and diseases can result in speech problems that can have a negative impact on everyday functioning, referred to as communicative participation. Subjective problems with acquired speech problems are often assessed with the speech handicap index (SHI). To assess generic participation problems, the Utrecht Scale for…
Descriptors: Indo European Languages, Translation, Test Construction, Test Validity
Kreitchmann, Rodrigo S.; Sorrel, Miguel A.; Abad, Francisco J. – Educational and Psychological Measurement, 2023
Multidimensional forced-choice (FC) questionnaires have been consistently found to reduce the effects of socially desirable responding and faking in noncognitive assessments. Although FC has been considered problematic for providing ipsative scores under the classical test theory, item response theory (IRT) models enable the estimation of…
Descriptors: Measurement Techniques, Questionnaires, Social Desirability, Adaptive Testing