Showing 1 to 15 of 28 results
Peer reviewed
Kalkan, Ömür Kaya – Measurement: Interdisciplinary Research and Perspectives, 2022
The four-parameter logistic (4PL) item response theory (IRT) model has recently been reconsidered in the literature owing to advances in statistical modeling software and recent developments in the estimation of its parameters. The current simulation study evaluated the performance of expectation-maximization (EM),…
Descriptors: Comparative Analysis, Sample Size, Test Length, Algorithms
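For readers unfamiliar with the 4PL model the abstract refers to, here is a minimal sketch of its item response function; the parameter names follow conventional IRT notation and the example values are illustrative, not taken from the article.

```python
import math

def p_4pl(theta, a, b, c, d):
    """Probability of a correct response under the 4PL IRT model.

    a: discrimination, b: difficulty, c: lower asymptote (guessing),
    d: upper asymptote (slipping); probabilities stay between c and d.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

# Example: an item with some guessing (c = .2) and some slipping (d = .95)
print(p_4pl(theta=0.0, a=1.2, b=-0.5, c=0.2, d=0.95))
```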
Peer reviewed
Su, Shiyang; Wang, Chun; Weiss, David J. – Educational and Psychological Measurement, 2021
S-X² is a popular item fit index that is available in commercial software packages such as flexMIRT. However, no research has systematically examined the performance of S-X² for detecting item misfit within the context of the multidimensional graded response model (MGRM). The primary goal of this study was…
Descriptors: Statistics, Goodness of Fit, Test Items, Models
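As background, S-X² (Orlando and Thissen's statistic) compares observed and model-expected proportions of correct responses within summed-score groups. The sketch below assumes those proportions and group counts have already been obtained for a dichotomous item; it only illustrates the chi-square form, not the recursion used to compute the expected values.

```python
def s_x2(counts, observed, expected):
    """Chi-square form of S-X2 for a dichotomous item.

    counts:   number of examinees in each summed-score group
    observed: observed proportion correct in each group
    expected: model-expected proportion correct in each group
    (Score groups with sparse expected counts are usually collapsed first.)
    """
    stat = 0.0
    for n_k, o_k, e_k in zip(counts, observed, expected):
        stat += n_k * (o_k - e_k) ** 2 / (e_k * (1.0 - e_k))
    return stat

# Degrees of freedom are roughly the number of retained score groups
# minus the number of estimated item parameters.
```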
Peer reviewed
Sauder, Derek; DeMars, Christine – Applied Measurement in Education, 2020
We used simulation techniques to assess the item-level and familywise Type I error control and power of an IRT item-fit statistic, S-X². Previous research indicated that S-X² has good Type I error control and decent power, but no previous research examined familywise Type I error control.…
Descriptors: Item Response Theory, Test Items, Sample Size, Test Length
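As context for the familywise question the abstract raises: when an item-fit statistic is applied to every item on a test, the chance of at least one false flag grows with test length unless the per-item criterion is adjusted. The snippet below illustrates one generic adjustment (a Bonferroni correction); the helper and its inputs are hypothetical and not from the article, which evaluates S-X² itself rather than any particular correction.

```python
def flag_misfit(p_values, alpha=0.05, familywise=True):
    """Flag items as misfitting, optionally with a Bonferroni adjustment.

    With familywise=True the per-item threshold is alpha / (number of items),
    one simple way to keep the familywise Type I error rate at about alpha.
    """
    threshold = alpha / len(p_values) if familywise else alpha
    return [p < threshold for p in p_values]

print(flag_misfit([0.001, 0.20, 0.04, 0.76], alpha=0.05))
```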
Peer reviewed
Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020
The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…
Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level
Peer reviewed
Zhou, Sherry; Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2020
The semi-generalized partial credit model (Semi-GPCM) has been proposed as a unidimensional modeling method for handling not applicable scale responses and neutral scale responses, and it has been suggested that the model may be of use in handling missing data in scale items. The purpose of this study is to evaluate the ability of the…
Descriptors: Models, Statistical Analysis, Response Style (Tests), Test Items
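The Semi-GPCM extends the generalized partial credit model (GPCM). As background only, here is a minimal sketch of ordinary GPCM category probabilities for one item; the parameter names and values are illustrative, not taken from the article.

```python
import math

def gpcm_probs(theta, a, b):
    """Category probabilities for one item under the GPCM.

    theta: latent trait value
    a:     item discrimination
    b:     list of step parameters b_1..b_m for an item scored 0..m
    """
    # Cumulative step sums: z_0 = 0, z_k = sum_{v<=k} a * (theta - b_v)
    z = [0.0]
    for b_v in b:
        z.append(z[-1] + a * (theta - b_v))
    denom = sum(math.exp(z_k) for z_k in z)
    return [math.exp(z_k) / denom for z_k in z]

print(gpcm_probs(theta=0.5, a=1.0, b=[-1.0, 0.0, 1.2]))  # four categories
```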
Peer reviewed
Karadavut, Tugba; Cohen, Allan S.; Kim, Seock-Ho – Measurement: Interdisciplinary Research and Perspectives, 2020
Mixture Rasch (MixRasch) models conventionally assume normal distributions for latent ability. Previous research has shown that the assumption of normality is often unmet in educational and psychological measurement. When normality is assumed, asymmetry in the actual latent ability distribution has been shown to result in extraction of spurious…
Descriptors: Item Response Theory, Ability, Statistical Distributions, Sample Size
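A mixture Rasch model lets item difficulty differ across latent classes. Below is a simplified sketch of the resulting response probability at a fixed ability value, under the simplifying assumption that class membership is independent of ability; the class weights and difficulties are illustrative, and the abstract's focus (non-normal ability distributions within classes) is not modeled here.

```python
import math

def rasch(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def mix_rasch_prob(theta, class_weights, class_difficulties):
    """P(correct) at a fixed theta, mixing class-specific Rasch curves.

    class_weights:      mixing proportions pi_g (sum to 1), assumed here
                        to be unrelated to theta for simplicity
    class_difficulties: item difficulty b_g within each class
    """
    return sum(pi * rasch(theta, b)
               for pi, b in zip(class_weights, class_difficulties))

# Two classes that differ in how hard they find this item
print(mix_rasch_prob(0.0, [0.6, 0.4], [-0.5, 1.0]))
```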
Peer reviewed
PDF on ERIC
Yormaz, Seha; Sünbül, Önder – Educational Sciences: Theory and Practice, 2017
This study aims to determine the Type I error rates and power of the S₁ and S₂ indices and the kappa statistic for detecting copying on multiple-choice tests under various conditions. It also examines how the way copying groups are formed when calculating the kappa statistic affects its Type I error rates and power. In this study,…
Descriptors: Statistical Analysis, Cheating, Multiple Choice Tests, Sample Size
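As background, Cohen's kappa corrects the raw agreement between a suspected copier's and a source's answer strings for the agreement expected by chance. A minimal sketch, with the chance term computed from each examinee's option-use proportions; the example answer strings are made up.

```python
from collections import Counter

def kappa(answers_a, answers_b):
    """Cohen's kappa for agreement between two examinees' answer strings."""
    n = len(answers_a)
    p_obs = sum(a == b for a, b in zip(answers_a, answers_b)) / n
    freq_a = Counter(answers_a)
    freq_b = Counter(answers_b)
    options = set(freq_a) | set(freq_b)
    p_exp = sum((freq_a[k] / n) * (freq_b[k] / n) for k in options)
    return (p_obs - p_exp) / (1.0 - p_exp)

print(kappa("ABCDABCDAB", "ABCDABCDCC"))
```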
Peer reviewed
Qiu, Yuxi; Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2019
This study aimed to assess the accuracy of the empirical item characteristic curve (EICC) preequating method given the presence of test speededness. The simulation design of this study considered the proportion of speededness, speededness point, speededness rate, proportion of missing on speeded items, sample size, and test length. After crossing…
Descriptors: Accuracy, Equated Scores, Test Items, Nonparametric Statistics
Peer reviewed
Lee, Soo; Bulut, Okan; Suh, Youngsuk – Educational and Psychological Measurement, 2017
A number of studies have found multiple indicators multiple causes (MIMIC) models to be an effective tool in detecting uniform differential item functioning (DIF) for individual items and item bundles. A recently developed MIMIC-interaction model is capable of detecting both uniform and nonuniform DIF in the unidimensional item response theory…
Descriptors: Test Bias, Test Items, Models, Item Response Theory
Peer reviewed
Huang, Hung-Yu – Educational and Psychological Measurement, 2017
Mixture item response theory (IRT) models have been suggested as an efficient method of detecting the different response patterns derived from latent classes when developing a test. In testing situations, multiple latent traits measured by a battery of tests can exhibit a higher-order structure, and mixtures of latent classes may occur on…
Descriptors: Item Response Theory, Models, Bayesian Statistics, Computation
Peer reviewed
Svetina, Dubravka; Levy, Roy – Journal of Experimental Education, 2016
This study investigated the effect of complex structure on dimensionality assessment in compensatory multidimensional item response models using DETECT- and NOHARM-based methods. The performance was evaluated via the accuracy of identifying the correct number of dimensions and the ability to accurately recover item groupings using a simple…
Descriptors: Item Response Theory, Accuracy, Correlation, Sample Size
Peer reviewed
Paek, Insu – Educational and Psychological Measurement, 2016
The effect of guessing on the point estimate of coefficient alpha has been studied in the literature, but the impact of guessing and its interactions with other test characteristics on the interval estimators for coefficient alpha has not been fully investigated. This study examined the impact of guessing and its interactions with other test…
Descriptors: Guessing (Tests), Computation, Statistical Analysis, Test Length
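For reference, coefficient alpha is the point estimate the abstract builds on. The sketch below computes alpha from an examinees-by-items score matrix and adds a bootstrap percentile interval; the article evaluates analytic interval estimators, so the bootstrap here is only a generic stand-in to show what an interval estimator for alpha does.

```python
import numpy as np

def coefficient_alpha(scores):
    """Cronbach's alpha for an examinees-by-items score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_vars / total_var)

def alpha_percentile_ci(scores, n_boot=2000, level=0.95, seed=0):
    """Bootstrap percentile confidence interval for alpha (resampling examinees)."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    n = scores.shape[0]
    boots = [coefficient_alpha(scores[rng.integers(0, n, n)]) for _ in range(n_boot)]
    lo, hi = np.percentile(boots, [(1 - level) / 2 * 100, (1 + level) / 2 * 100])
    return lo, hi
```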
Peer reviewed
Lee, HyeSun; Geisinger, Kurt F. – Educational and Psychological Measurement, 2016
The current study investigated the impact of matching criterion purification on the accuracy of differential item functioning (DIF) detection in large-scale assessments. The three matching approaches for DIF analyses (block-level matching, pooled booklet matching, and equated pooled booklet matching) were employed with the Mantel-Haenszel…
Descriptors: Test Bias, Measurement, Accuracy, Statistical Analysis
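As background for the Mantel-Haenszel procedure the abstract uses: it compares reference and focal group odds of success within matched (typically total-score) strata; purifying the matching criterion means removing DIF items from that score and re-matching. A minimal sketch of the common odds ratio and its ETS delta-scale transformation, assuming each stratum is already summarized as correct/incorrect counts for the two groups; the example counts are made up.

```python
import math

def mh_ddif(strata):
    """Mantel-Haenszel D-DIF for one item.

    strata: list of tuples (ref_correct, ref_incorrect, foc_correct, foc_incorrect),
            one tuple per matched score group.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        t = a + b + c + d
        if t == 0:
            continue
        num += a * d / t
        den += b * c / t
    alpha_mh = num / den               # MH common odds ratio
    return -2.35 * math.log(alpha_mh)  # ETS delta metric; |D-DIF| around 1.5+ is conventionally treated as large DIF

# Example with three score strata
print(mh_ddif([(40, 10, 30, 20), (35, 15, 28, 22), (20, 30, 15, 35)]))
```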
Peer reviewed
Cao, Mengyang; Tay, Louis; Liu, Yaowu – Educational and Psychological Measurement, 2017
This study examined the performance of a proposed iterative Wald approach for detecting differential item functioning (DIF) between two groups when preknowledge of anchor items is absent. The iterative approach utilizes the Wald-2 approach to identify anchor items and then iteratively tests for DIF items with the Wald-1 approach. Monte Carlo…
Descriptors: Monte Carlo Methods, Test Items, Test Bias, Error of Measurement
Peer reviewed
Tay, Louis; Huang, Qiming; Vermunt, Jeroen K. – Educational and Psychological Measurement, 2016
In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…
Descriptors: Item Response Theory, Test Bias, Simulation, College Entrance Examinations