Publication Date
| Date Range | Results |
| In 2026 | 0 |
| Since 2025 | 2 |
| Since 2022 (last 5 years) | 48 |
| Since 2017 (last 10 years) | 114 |
| Since 2007 (last 20 years) | 235 |
Descriptor
| Descriptor | Results |
| Classification | 332 |
| Test Items | 332 |
| Item Response Theory | 88 |
| Test Construction | 78 |
| Foreign Countries | 69 |
| Models | 62 |
| Item Analysis | 59 |
| Accuracy | 58 |
| Difficulty Level | 52 |
| Diagnostic Tests | 44 |
| Comparative Analysis | 42 |
Author
| Author | Results |
| Dorans, Neil J. | 4 |
| Haladyna, Thomas M. | 4 |
| Meijer, Rob R. | 4 |
| Spray, Judith A. | 4 |
| Chen, Yi-Hsin | 3 |
| Downing, Steven M. | 3 |
| Gelman, Susan A. | 3 |
| Gierl, Mark J. | 3 |
| Hambleton, Ronald K. | 3 |
| Jiao, Hong | 3 |
| Ravand, Hamdollah | 3 |
Audience
| Audience | Results |
| Practitioners | 4 |
| Researchers | 4 |
| Teachers | 4 |
| Policymakers | 1 |
Location
| Location | Results |
| Turkey | 7 |
| China | 5 |
| Taiwan | 4 |
| Indonesia | 3 |
| Israel | 3 |
| Massachusetts | 3 |
| New York | 3 |
| Africa | 2 |
| Bangladesh | 2 |
| Germany | 2 |
| Iran | 2 |
Laws, Policies, & Programs
| Law / Program | Results |
| No Child Left Behind Act 2001 | 3 |
| Individuals with Disabilities… | 1 |
Daniel P. Jurich; Matthew J. Madison – Educational Assessment, 2023
Diagnostic classification models (DCMs) are psychometric models that provide probabilistic classifications of examinees on a set of discrete latent attributes. When analyzing or constructing assessments scored by DCMs, understanding how each item influences attribute classifications can clarify the meaning of the measured constructs, facilitate…
Descriptors: Test Items, Models, Classification, Influences
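The abstract does not name which DCM the authors use, so as a minimal illustration of how a diagnostic classification model turns discrete attribute profiles into item-response probabilities, here is a sketch of the DINA model, one of the simplest DCMs; the Q-matrix row, slip, and guess values below are hypothetical:

```python
# Sketch of the DINA model, a common diagnostic classification model (DCM).
# An examinee responds like a "master" of item j only if they hold every
# attribute the item requires (its Q-matrix row); otherwise they respond
# at the guessing rate. Slip/guess values here are hypothetical.

def dina_prob(alpha, q_row, slip, guess):
    """P(correct) for attribute profile `alpha` on an item requiring `q_row`."""
    eta = all(a >= q for a, q in zip(alpha, q_row))  # holds all required attributes?
    return (1.0 - slip) if eta else guess

# Item requires attributes 1 and 2 (of three); slip = .1, guess = .2.
q_row = (1, 1, 0)
print(dina_prob((1, 1, 0), q_row, 0.1, 0.2))  # master of required skills -> 0.9
print(dina_prob((1, 0, 1), q_row, 0.1, 0.2))  # missing attribute 2 -> 0.2
```

Under this model, an item's slip and guess parameters directly determine how strongly its responses move the attribute classifications, which is the kind of item-level influence the entry discusses.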
Ting Wang; Keith Stelter; Thomas O’Neill; Nathaniel Hendrix; Andrew Bazemore; Kevin Rode; Warren P. Newton – Journal of Applied Testing Technology, 2025
Precise item categorisation is essential in aligning exam questions with content domains outlined in assessment blueprints. Traditional methods, such as manual classification or supervised machine learning, are often time-consuming, error-prone, or limited by the need for large training datasets. This study presents a novel approach using…
Descriptors: Test Items, Automation, Classification, Artificial Intelligence
He, Dan – ProQuest LLC, 2023
This dissertation examines the effectiveness of machine learning algorithms and feature engineering techniques for analyzing process data and predicting test performance. The study compares three classification approaches and identifies item-specific process features that are highly predictive of student performance. The findings suggest that…
Descriptors: Artificial Intelligence, Data Analysis, Algorithms, Classification
Hess, Jessica – ProQuest LLC, 2023
This study was conducted to further research into the impact of student-group item parameter drift (SIPD), referred to as subpopulation item parameter drift in previous research, on ability estimates and proficiency classification accuracy when it occurs in the discrimination parameter of a two-parameter logistic (2PL) item response theory (IRT) model. Using Monte…
Descriptors: Test Items, Groups, Ability, Item Response Theory
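The entry's Monte Carlo design is not given, but the mechanism it studies is easy to sketch: under the 2PL model, drift in an item's discrimination parameter changes the response probability for one subgroup at a fixed ability level. The parameter values below are hypothetical:

```python
import math

# Two-parameter logistic (2PL) item response function:
# P(correct | theta) for an item with discrimination a and difficulty b.
def p_2pl(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical drift: for one student group, the item's discrimination
# drifts from a = 1.2 down to a = 0.8 while difficulty b stays fixed.
theta, b = 1.0, 0.0
print(round(p_2pl(theta, 1.2, b), 3))  # probability before drift
print(round(p_2pl(theta, 0.8, b), 3))  # probability after drift in a
```

The gap between the two probabilities is what biases ability estimates, and hence proficiency classifications, when the drift goes unmodeled.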
Su, Kun; Henson, Robert A. – Journal of Educational and Behavioral Statistics, 2023
This article provides a process for carefully evaluating the suitability of a content domain for diagnostic classification models (DCMs), optimized steps for constructing a test blueprint for applying DCMs, and a real-life example illustrating the process. The content domains were carefully evaluated using a set of…
Descriptors: Classification, Models, Science Tests, Physics
Weese, James D.; Turner, Ronna C.; Ames, Allison; Crawford, Brandon; Liang, Xinya – Educational and Psychological Measurement, 2022
A simulation study was conducted to investigate the heuristics of the SIBTEST procedure and how it compares with ETS classification guidelines used with the Mantel-Haenszel procedure. Prior heuristics have been used for nearly 25 years, but they are based on a simulation study that was restricted due to computer limitations and that modeled item…
Descriptors: Test Bias, Heuristics, Classification, Statistical Analysis
Rios, Joseph – Applied Measurement in Education, 2022
To mitigate the deleterious effects of rapid guessing (RG) on ability estimates, several rescoring procedures have been proposed. Underlying many of these procedures is the assumption that RG is accurately identified. At present, there have been minimal investigations examining the utility of rescoring approaches when RG is misclassified, and…
Descriptors: Accuracy, Guessing (Tests), Scoring, Classification
Jing Ma – ProQuest LLC, 2024
This study investigated the impact of deferring the scoring of polytomous items on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Utilizing the shadow test approach, a simulation study was conducted across various test designs, test lengths, and numbers and locations of polytomous items. Results showed that while…
Descriptors: Scoring, Adaptive Testing, Test Items, Classification
Demir, Seda – Journal of Educational Technology and Online Learning, 2022
The purpose of this research was to evaluate the effect of item pool and selection algorithms on computerized classification testing (CCT) performance in terms of several classification evaluation metrics. For this purpose, response patterns for 1,000 examinees were generated using the R package, and eight item pools with 150, 300, 450, and 600 items…
Descriptors: Test Items, Item Banks, Mathematics, Computer Assisted Testing
Henninger, Mirka; Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2023
To detect differential item functioning (DIF), Rasch trees search for optimal split-points in covariates and identify subgroups of respondents in a data-driven way. To determine whether and in which covariate a split should be performed, Rasch trees use statistical significance tests. Consequently, Rasch trees are more likely to label small DIF…
Descriptors: Item Response Theory, Test Items, Effect Size, Statistical Significance
Britt Hadar; Maayan Katzir; Sephi Pumpian; Tzur Karelitz; Nira Liberman – npj Science of Learning, 2023
Performance on standardized academic aptitude tests (AAT) can determine important life outcomes. However, it is not clear whether and which aspects of the content of test questions affect performance. We examined the effect of psychological distance embedded in test questions. In Study 1 (N = 41,209), we classified the content of existing AAT…
Descriptors: Academic Aptitude, Thinking Skills, Aptitude Tests, Standardized Tests
Condor, Aubrey; Litster, Max; Pardos, Zachary – International Educational Data Mining Society, 2021
We explore how different components of an Automatic Short Answer Grading (ASAG) model affect the model's ability to generalize to questions outside of those used for training. For supervised automatic grading models, human ratings are primarily used as ground truth labels. Producing such ratings can be resource heavy, as subject matter experts…
Descriptors: Automation, Grading, Test Items, Generalization
Sebastian Moncaleano – ProQuest LLC, 2021
The growth of computer-based testing over the last two decades has motivated the creation of innovative item formats. It is often argued that technology-enhanced items (TEIs) provide better measurement of test-takers' knowledge, skills, and abilities by increasing the authenticity of tasks presented to test-takers (Sireci & Zenisky, 2006).…
Descriptors: Computer Assisted Testing, Test Format, Test Items, Classification
Robitzsch, Alexander – Journal of Intelligence, 2020
The last series of Raven's standard progressive matrices (SPM-LS) test was studied with respect to its psychometric properties in a series of recent papers. In this paper, the SPM-LS dataset is analyzed with regularized latent class models (RLCMs). For dichotomous item response data, an alternative estimation approach based on fused regularization…
Descriptors: Statistical Analysis, Classification, Intelligence Tests, Test Items
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2020
This note raises caution that a finding of a marked pseudo-guessing parameter for an item within a three-parameter item response model could be spurious in a population with substantial unobserved heterogeneity. A numerical example is presented wherein, in each of two classes, the two-parameter logistic model is used to generate the data on a…
Descriptors: Guessing (Tests), Item Response Theory, Test Items, Models
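To make the note's object concrete: the pseudo-guessing parameter c in the three-parameter logistic (3PL) model is a nonzero lower asymptote that the 2PL lacks, and it is this asymptote that can appear spuriously when two latent classes are pooled. A minimal numeric sketch with hypothetical parameter values:

```python
import math

def p_2pl(theta, a, b):
    # Two-parameter logistic item response function (lower asymptote 0).
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def p_3pl(theta, a, b, c):
    # Three-parameter logistic: c is the pseudo-guessing lower asymptote.
    return c + (1.0 - c) * p_2pl(theta, a, b)

# At very low ability the 2PL probability approaches 0, while the 3PL
# probability approaches c; a fitted c near 0.2 could reflect guessing,
# or, per the note, unmodeled class heterogeneity in the population.
for theta in (-4.0, 0.0, 4.0):
    print(theta, round(p_2pl(theta, 1.2, 0.0), 3), round(p_3pl(theta, 1.2, 0.0, 0.2), 3))
```

This sketch only defines the two models; the note's mixture argument is that pooled 2PL data from heterogeneous classes can mimic the 3PL's floor.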
