Publication Date

| Period | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 4 |
| Since 2022 (last 5 years) | 9 |
| Since 2017 (last 10 years) | 26 |
| Since 2007 (last 20 years) | 73 |
Descriptor

| Descriptor | Count |
| --- | --- |
| Computation | 82 |
| Difficulty Level | 82 |
| Test Items | 82 |
| Item Response Theory | 49 |
| Models | 19 |
| Comparative Analysis | 18 |
| Statistical Analysis | 18 |
| Accuracy | 12 |
| Mathematics Tests | 12 |
| Sample Size | 12 |
| Simulation | 12 |
Author

| Author | Count |
| --- | --- |
| Finch, Holmes | 3 |
| Guo, Hongwen | 3 |
| He, Wei | 3 |
| Ketterlin-Geller, Leanne R. | 3 |
| Liu, Kimy | 3 |
| Tindal, Gerald | 3 |
| Jiao, Hong | 2 |
| Luke G. Eglington | 2 |
| Matlock, Ki Lynn | 2 |
| Michaelides, Michalis P. | 2 |
| Nelson, Gena | 2 |
Publication Type

| Publication Type | Count |
| --- | --- |
| Journal Articles | 63 |
| Reports - Research | 60 |
| Dissertations/Theses -… | 8 |
| Reports - Descriptive | 8 |
| Reports - Evaluative | 6 |
| Numerical/Quantitative Data | 3 |
| Speeches/Meeting Papers | 3 |
| Tests/Questionnaires | 3 |
Education Level

| Education Level | Count |
| --- | --- |
| Elementary Education | 10 |
| Middle Schools | 8 |
| Secondary Education | 8 |
| Higher Education | 7 |
| Postsecondary Education | 7 |
| Grade 3 | 6 |
| Grade 8 | 6 |
| Grade 4 | 5 |
| Grade 5 | 5 |
| Junior High Schools | 5 |
| Elementary Secondary Education | 4 |
Location

| Location | Count |
| --- | --- |
| Turkey | 3 |
| Indonesia | 2 |
| Belgium | 1 |
| Florida | 1 |
| Germany | 1 |
| India | 1 |
| Malaysia | 1 |
| New York | 1 |
| Oregon | 1 |
| Saudi Arabia | 1 |
| United Kingdom | 1 |
Assessments and Surveys

| Assessment/Survey | Count |
| --- | --- |
| Wide Range Achievement Test | 2 |
| Comprehensive Tests of Basic… | 1 |
| General Aptitude Test Battery | 1 |
| Graduate Record Examinations | 1 |
| Measures of Academic Progress | 1 |
| National Assessment of… | 1 |
Leonidas Zotos; Hedderik van Rijn; Malvina Nissim – International Educational Data Mining Society, 2025
In an educational setting, an estimate of the difficulty of Multiple-Choice Questions (MCQs), a commonly used strategy to assess learning progress, constitutes very useful information for both teachers and students. Since human assessment is costly from multiple points of view, automatic approaches to MCQ item difficulty estimation are…
Descriptors: Multiple Choice Tests, Test Items, Difficulty Level, Artificial Intelligence
Sarah Alahmadi; Christine E. DeMars – Journal of Educational Measurement, 2025
Inadequate test-taking effort poses a significant challenge, particularly when low-stakes test results inform high-stakes policy and psychometric decisions. We examined how rapid guessing (RG), a common form of low test-taking effort, biases item parameter estimates, particularly the discrimination and difficulty parameters. Previous research…
Descriptors: Guessing (Tests), Computation, Statistical Bias, Test Items
Aiman Mohammad Freihat; Omar Saleh Bani Yassin – Educational Process: International Journal, 2025
Background/purpose: This study aimed to reveal the accuracy of estimation of multiple-choice test items parameters following the models of the item-response theory in measurement. Materials/methods: The researchers depended on the measurement accuracy indicators, which express the absolute difference between the estimated and actual values of the…
Descriptors: Accuracy, Computation, Multiple Choice Tests, Test Items
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2023
Traditional estimators of reliability such as coefficients alpha, theta, omega, and rho (maximal reliability) are prone to give radical underestimates of reliability for the tests common when testing educational achievement. These tests are often structured by widely deviating item difficulties. This is a typical pattern where the traditional…
Descriptors: Test Reliability, Achievement Tests, Computation, Test Items
Ali Orhan; Inan Tekin; Sedat Sen – International Journal of Assessment Tools in Education, 2025
In this study, it was aimed to translate and adapt the Computational Thinking Multidimensional Test (CTMT) developed by Kang et al. (2023) into Turkish and to investigate its psychometric qualities with Turkish university students. Following the translation procedures of the CTMT with 12 multiple-choice questions developed based on real-life…
Descriptors: Cognitive Tests, Thinking Skills, Computation, Test Validity
Peer reviewed
Andreea Dutulescu; Stefan Ruseti; Mihai Dascalu; Danielle S. McNamara – Grantee Submission, 2024
Assessing the difficulty of reading comprehension questions is crucial to educational methodologies and language understanding technologies. Traditional methods of assessing question difficulty rely frequently on human judgments or shallow metrics, often failing to accurately capture the intricate cognitive demands of answering a question. This…
Descriptors: Difficulty Level, Reading Tests, Test Items, Reading Comprehension
Sample Size and Item Parameter Estimation Precision When Utilizing the Masters' Partial Credit Model
Custer, Michael; Kim, Jongpil – Online Submission, 2023
This study utilizes an analysis of diminishing returns to examine the relationship between sample size and item parameter estimation precision when utilizing the Masters' Partial Credit Model for polytomous items. Item data from the standardization of the Batelle Developmental Inventory, 3rd Edition were used. Each item was scored with a…
Descriptors: Sample Size, Item Response Theory, Test Items, Computation
Tang, Xiaodan; Karabatsos, George; Chen, Haiqin – Applied Measurement in Education, 2020
In applications of item response theory (IRT) models, it is known that empirical violations of the local independence (LI) assumption can significantly bias parameter estimates. To address this issue, we propose a threshold-autoregressive item response theory (TAR-IRT) model that additionally accounts for order dependence among the item responses…
Descriptors: Item Response Theory, Test Items, Models, Computation
DeCarlo, Lawrence T. – Journal of Educational Measurement, 2023
A conceptualization of multiple-choice exams in terms of signal detection theory (SDT) leads to simple measures of item difficulty and item discrimination that are closely related to, but also distinct from, those used in classical item analysis (CIA). The theory defines a "true split," depending on whether or not examinees know an item,…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Test Wiseness
Derek Sauder – ProQuest LLC, 2020
The Rasch model is commonly used to calibrate multiple choice items. However, the sample sizes needed to estimate the Rasch model can be difficult to attain (e.g., consider a small testing company trying to pretest new items). With small sample sizes, auxiliary information besides the item responses may improve estimation of the item parameters.…
Descriptors: Item Response Theory, Sample Size, Computation, Test Length
Lozano, José H.; Revuelta, Javier – Applied Measurement in Education, 2021
The present study proposes a Bayesian approach for estimating and testing the operation-specific learning model, a variant of the linear logistic test model that allows for the measurement of the learning that occurs during a test as a result of the repeated use of the operations involved in the items. The advantages of using a Bayesian framework…
Descriptors: Bayesian Statistics, Computation, Learning, Testing
Bjermo, Jonas; Miller, Frank – Applied Measurement in Education, 2021
In recent years, the interest in measuring growth in student ability in various subjects between different grades in school has increased. Therefore, good precision in the estimated growth is of importance. This paper aims to compare estimation methods and test designs when it comes to precision and bias of the estimated growth of mean ability…
Descriptors: Scaling, Ability, Computation, Test Items
Munawarah; Thalhah, Siti Zuhaerah; Angriani, Andi Dian; Nur, Fitriani; Kusumayanti, Andi – Mathematics Teaching Research Journal, 2021
The increase in the need for critical and analytical thinking among students to boost their confidence in dealing with complex and difficult problems has led to the development of computational skills. Therefore, this study aims to develop an instrument test for computational thinking (CT) skills in the mathematics-based RME (Realistic Mathematics…
Descriptors: Test Construction, Mathematics Tests, Computation, Thinking Skills
Luke G. Eglington; Philip I. Pavlik – Grantee Submission, 2020
Decades of research has shown that spacing practice trials over time can improve later memory, but there are few concrete recommendations concerning how to optimally space practice. We show that existing recommendations are inherently suboptimal due to their insensitivity to time costs and individual- and item-level differences. We introduce an…
Descriptors: Scheduling, Drills (Practice), Memory, Testing
Luke G. Eglington; Philip I. Pavlik Jr. – npj Science of Learning, 2020
Decades of research has shown that spacing practice trials over time can improve later memory, but there are few concrete recommendations concerning how to optimally space practice. We show that existing recommendations are inherently suboptimal due to their insensitivity to time costs and individual- and item-level differences. We introduce an…
Descriptors: Scheduling, Drills (Practice), Memory, Testing

