Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025
This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…
Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis
Karl Schweizer; Andreas Gold; Dorothea Krampen; Stefan Troche – Educational and Psychological Measurement, 2024
Conceptualizing two-variable disturbances preventing good model fit in confirmatory factor analysis as item-level method effects instead of correlated residuals avoids violating the principle that residual variation is unique for each item. The possibility of representing such a disturbance by a method factor of a bifactor measurement model was…
Descriptors: Correlation, Factor Analysis, Measurement Techniques, Item Analysis
Kazuhiro Yamaguchi – Journal of Educational and Behavioral Statistics, 2025
This study proposes a Bayesian method for diagnostic classification models (DCMs) for a partially known Q-matrix setting between exploratory and confirmatory DCMs. This Q-matrix setting is practical and useful because test experts have pre-knowledge of the Q-matrix but cannot readily specify it completely. The proposed method employs priors for…
Descriptors: Models, Classification, Bayesian Statistics, Evaluation Methods
Zsuzsa Bakk – Structural Equation Modeling: A Multidisciplinary Journal, 2024
A standard assumption of latent class (LC) analysis is conditional independence, that is, the items of the LC are independent of the covariates given the LCs. Several approaches have been proposed for identifying violations of this assumption. The recently proposed likelihood ratio approach is compared to residual statistics (bivariate residuals…
Descriptors: Goodness of Fit, Error of Measurement, Comparative Analysis, Models
Eray Selçuk; Ergül Demir – International Journal of Assessment Tools in Education, 2024
This research aims to compare ability and item parameter estimates in Item Response Theory under maximum likelihood and Bayesian approaches across different Monte Carlo simulation conditions. For this purpose, depending on changes in the prior distribution type, sample size, test length, and logistic model, the ability and item…
Descriptors: Item Response Theory, Item Analysis, Test Items, Simulation
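To make the contrast between maximum likelihood and Bayesian ability estimation concrete, here is a minimal sketch. It is not the study's procedure: the two-parameter logistic (2PL) model, the standard normal prior, the expected a posteriori (EAP) estimator, the grid search, and all item parameters and responses are assumptions chosen for illustration.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, responses, items):
    """Log-likelihood of a 0/1 response pattern at ability theta."""
    ll = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        ll += math.log(p) if u == 1 else math.log(1.0 - p)
    return ll

# Ability grid from -4 to 4 in steps of 0.01.
GRID = [-4.0 + 0.01 * k for k in range(801)]

def ml_estimate(responses, items):
    """Maximum likelihood estimate: grid point with the highest likelihood."""
    return max(GRID, key=lambda t: log_likelihood(t, responses, items))

def eap_estimate(responses, items):
    """EAP estimate: posterior mean under an assumed N(0, 1) prior."""
    weights = [
        math.exp(log_likelihood(t, responses, items) - 0.5 * t * t)
        for t in GRID
    ]
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)

# Hypothetical (discrimination, difficulty) pairs and one response pattern.
items = [(1.0, -1.0), (1.2, -0.5), (0.8, 0.0), (1.5, 0.5), (1.0, 1.0)]
responses = [1, 1, 1, 1, 0]

theta_ml = ml_estimate(responses, items)
theta_eap = eap_estimate(responses, items)
print(f"ML estimate:  {theta_ml:.2f}")
print(f"EAP estimate: {theta_eap:.2f}")
```

With a short test such as this one, the prior pulls the EAP estimate toward zero relative to the ML estimate, which is one reason the prior distribution type and test length are natural simulation factors.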
Finch, Holmes – Applied Measurement in Education, 2022
Much research has been devoted to identification of differential item functioning (DIF), which occurs when the item responses for individuals from two groups differ after they are conditioned on the latent trait being measured by the scale. There has been less work examining differential step functioning (DSF), which is present for polytomous…
Descriptors: Comparative Analysis, Item Response Theory, Item Analysis, Simulation
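The definition of DIF in this abstract can be illustrated with a small sketch: two examinees with the same latent trait value but different group membership have different response probabilities on the same item. The 2PL model, the group labels, and the parameter values below are hypothetical and are not taken from the study.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: same discrimination in both groups,
# but a higher difficulty for the focal group (uniform DIF).
a = 1.0
b_reference = 0.0
b_focal = 0.5

# Condition on the latent trait: compare groups at identical theta values.
gaps = {}
for theta in (-1.0, 0.0, 1.0):
    p_ref = p_correct(theta, a, b_reference)
    p_foc = p_correct(theta, a, b_focal)
    gaps[theta] = p_ref - p_foc
    print(f"theta={theta:+.1f}  reference={p_ref:.3f}  focal={p_foc:.3f}")
```

A nonzero gap at matched trait levels is what DIF detection methods look for; with no DIF, the two probabilities coincide at every theta.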
Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024
Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…
Descriptors: Semantics, Educational Assessment, Evaluators, Reliability
Sahin Kursad, Merve; Cokluk Bokeoglu, Omay; Cikrikci, Rahime Nukhet – International Journal of Assessment Tools in Education, 2022
Item parameter drift (IPD) is the systematic differentiation of parameter values of items over time due to various reasons. If it occurs in computer adaptive tests (CAT), it causes errors in the estimation of item and ability parameters. Identification of the underlying conditions of this situation in CAT is important for estimating item and…
Descriptors: Item Analysis, Computer Assisted Testing, Test Items, Error of Measurement
Fuchimoto, Kazuma; Ishii, Takatoshi; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2022
Educational assessments often require uniform test forms, for which each test form has equivalent measurement accuracy but with a different set of items. For uniform test assembly, an important issue is the increase of the number of assembled uniform tests. Although many automatic uniform test assembly methods exist, the maximum clique algorithm…
Descriptors: Simulation, Efficiency, Test Items, Educational Assessment
Bayesian Adaptive Lasso for the Detection of Differential Item Functioning in Graded Response Models
Na Shan; Ping-Feng Xu – Journal of Educational and Behavioral Statistics, 2025
The detection of differential item functioning (DIF) is important in psychological and behavioral sciences. Standard DIF detection methods perform an item-by-item test iteratively, often assuming that all items except the one under investigation are DIF-free. This article proposes a Bayesian adaptive Lasso method to detect DIF in graded response…
Descriptors: Bayesian Statistics, Item Response Theory, Adolescents, Longitudinal Studies
Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022
Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…
Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations
Zhang, Zhonghua; Zhao, Mingren – Journal of Educational Measurement, 2019
The present study evaluated the multiple imputation method, a procedure that is similar to the one suggested by Li and Lissitz (2004), and compared the performance of this method with that of the bootstrap method and the delta method in obtaining the standard errors for the estimates of the parameter scale transformation coefficients in item…
Descriptors: Item Response Theory, Error Patterns, Item Analysis, Simulation
Feuerstahler, Leah; Wilson, Mark – Journal of Educational Measurement, 2019
Scores estimated from multidimensional item response theory (IRT) models are not necessarily comparable across dimensions. In this article, the concept of aligned dimensions is formalized in the context of Rasch models, and two methods are described, delta dimensional alignment (DDA) and logistic regression alignment (LRA), to transform estimated…
Descriptors: Item Response Theory, Models, Scores, Comparative Analysis
Yesiltas, Gonca; Paek, Insu – Educational and Psychological Measurement, 2020
A log-linear model (LLM) is a well-known statistical method to examine the relationship among categorical variables. This study investigated the performance of LLM in detecting differential item functioning (DIF) for polytomously scored items via simulations where various sample sizes, ability mean differences (impact), and DIF types were…
Descriptors: Simulation, Sample Size, Item Analysis, Scores
Yurtçu, Meltem; Güzeller, Cem Oktay – International Journal of Assessment Tools in Education, 2018
This study aims to show the effect on equating error of the number of DIF items and of the distribution of DIF items across the forms to be equated. The mean-mean, mean-standard deviation, Haebara, and Stocking-Lord methods were used as equating methods in a common-item design with equal groups. The study included six different simulation…
Descriptors: Error Patterns, Test Items, Item Analysis, Simulation
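As a rough illustration of two of the moment-based methods named in this abstract, the sketch below computes linear scale transformation coefficients A and B from common-item parameter estimates, using the textbook mean-mean and mean-standard deviation (mean-sigma) formulas. The function names and the item parameters are hypothetical, and the study's simulation conditions are not reproduced here.

```python
from statistics import mean, pstdev

def mean_sigma(b_from, b_to):
    """Mean-sigma coefficients from common-item difficulties."""
    A = pstdev(b_to) / pstdev(b_from)
    B = mean(b_to) - A * mean(b_from)
    return A, B

def mean_mean(a_from, b_from, a_to, b_to):
    """Mean-mean coefficients from common-item discriminations and difficulties."""
    A = mean(a_from) / mean(a_to)
    B = mean(b_to) - A * mean(b_from)
    return A, B

# Hypothetical common-item estimates on the "from" scale.
a_x = [0.8, 1.0, 1.2, 1.5]
b_x = [-1.0, -0.2, 0.4, 1.1]

# Build the "to" scale with a known transformation so recovery can be checked:
# b_to = A * b_from + B and a_to = a_from / A.
true_A, true_B = 1.2, 0.5
a_y = [a / true_A for a in a_x]
b_y = [true_A * b + true_B for b in b_x]

A_ms, B_ms = mean_sigma(b_x, b_y)
A_mm, B_mm = mean_mean(a_x, b_x, a_y, b_y)
print(f"mean-sigma: A={A_ms:.3f}, B={B_ms:.3f}")
print(f"mean-mean:  A={A_mm:.3f}, B={B_mm:.3f}")
```

With error-free parameters both methods recover the same coefficients. When some common items show DIF, their parameters no longer follow a single linear transformation, and the methods can diverge, which is the kind of equating error the study examines.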