Showing 1 to 15 of 28 results
Peer reviewed
Kalkan, Ömür Kaya – Measurement: Interdisciplinary Research and Perspectives, 2022
The four-parameter logistic (4PL) item response theory (IRT) model has recently been reconsidered in the literature owing to advances in statistical modeling software and recent developments in the estimation of its parameters. The current simulation study evaluated the performance of expectation-maximization (EM),…
Descriptors: Comparative Analysis, Sample Size, Test Length, Algorithms
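For readers unfamiliar with the 4PL model the abstract refers to, here is a minimal sketch of its item response function; the parameter names follow conventional IRT notation and the example values are illustrative, not taken from the article.

```python
import math

def p_4pl(theta, a, b, c, d):
    """Probability of a correct response under the 4PL IRT model.

    a: discrimination, b: difficulty, c: lower asymptote (guessing),
    d: upper asymptote (slipping); probabilities stay between c and d.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

# Example: an item with some guessing (c = .2) and some slipping (d = .95)
print(p_4pl(theta=0.0, a=1.2, b=-0.5, c=0.2, d=0.95))
```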
Peer reviewed
Su, Shiyang; Wang, Chun; Weiss, David J. – Educational and Psychological Measurement, 2021
S-X² is a popular item fit index that is available in commercial software packages such as flexMIRT. However, no research has systematically examined the performance of S-X² for detecting item misfit within the context of the multidimensional graded response model (MGRM). The primary goal of this study was…
Descriptors: Statistics, Goodness of Fit, Test Items, Models
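As background, S-X² (Orlando and Thissen's statistic) compares observed and model-expected proportions of correct responses within summed-score groups. The sketch below assumes those proportions and group counts have already been obtained for a dichotomous item; it only illustrates the chi-square form, not the recursion used to compute the expected values.

```python
def s_x2(counts, observed, expected):
    """Chi-square form of S-X2 for a dichotomous item.

    counts:   number of examinees in each summed-score group
    observed: observed proportion correct in each group
    expected: model-expected proportion correct in each group
    (Score groups with sparse expected counts are usually collapsed first.)
    """
    stat = 0.0
    for n_k, o_k, e_k in zip(counts, observed, expected):
        stat += n_k * (o_k - e_k) ** 2 / (e_k * (1.0 - e_k))
    return stat

# Degrees of freedom are roughly the number of retained score groups
# minus the number of estimated item parameters.
```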
Peer reviewed
Sauder, Derek; DeMars, Christine – Applied Measurement in Education, 2020
We used simulation techniques to assess the item-level and familywise Type I error control and power of an IRT item-fit statistic, S-X². Previous research indicated that S-X² has good Type I error control and decent power, but no previous research examined familywise Type I error control.…
Descriptors: Item Response Theory, Test Items, Sample Size, Test Length
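As context for the familywise question the abstract raises: when an item-fit statistic is applied to every item on a test, the chance of at least one false flag grows with test length unless the per-item criterion is adjusted. The snippet below illustrates one generic adjustment (a Bonferroni correction); the helper and its inputs are hypothetical and not from the article, which evaluates S-X² itself rather than any particular correction.

```python
def flag_misfit(p_values, alpha=0.05, familywise=True):
    """Flag items as misfitting, optionally with a Bonferroni adjustment.

    With familywise=True the per-item threshold is alpha / (number of items),
    one simple way to keep the familywise Type I error rate at about alpha.
    """
    threshold = alpha / len(p_values) if familywise else alpha
    return [p < threshold for p in p_values]

print(flag_misfit([0.001, 0.20, 0.04, 0.76], alpha=0.05))
```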
Peer reviewed
Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020
The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…
Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level
Peer reviewed
Zhou, Sherry; Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2020
The semi-generalized partial credit model (Semi-GPCM) has been proposed as a unidimensional modeling method for handling not applicable scale responses and neutral scale responses, and it has been suggested that the model may be of use in handling missing data in scale items. The purpose of this study is to evaluate the ability of the…
Descriptors: Models, Statistical Analysis, Response Style (Tests), Test Items
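The Semi-GPCM extends the generalized partial credit model (GPCM). As background only, here is a minimal sketch of ordinary GPCM category probabilities for one item; the parameter names and values are illustrative, not taken from the article.

```python
import math

def gpcm_probs(theta, a, b):
    """Category probabilities for one item under the GPCM.

    theta: latent trait value
    a:     item discrimination
    b:     list of step parameters b_1..b_m for an item scored 0..m
    """
    # Cumulative step sums: z_0 = 0, z_k = sum_{v<=k} a * (theta - b_v)
    z = [0.0]
    for b_v in b:
        z.append(z[-1] + a * (theta - b_v))
    denom = sum(math.exp(z_k) for z_k in z)
    return [math.exp(z_k) / denom for z_k in z]

print(gpcm_probs(theta=0.5, a=1.0, b=[-1.0, 0.0, 1.2]))  # four categories
```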
Peer reviewed
Karadavut, Tugba; Cohen, Allan S.; Kim, Seock-Ho – Measurement: Interdisciplinary Research and Perspectives, 2020
Mixture Rasch (MixRasch) models conventionally assume normal distributions for latent ability. Previous research has shown that the assumption of normality is often unmet in educational and psychological measurement. When normality is assumed, asymmetry in the actual latent ability distribution has been shown to result in extraction of spurious…
Descriptors: Item Response Theory, Ability, Statistical Distributions, Sample Size
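A mixture Rasch model lets item difficulty differ across latent classes. Below is a simplified sketch of the resulting response probability at a fixed ability value, under the simplifying assumption that class membership is independent of ability; the class weights and difficulties are illustrative, and the abstract's focus (non-normal ability distributions within classes) is not modeled here.

```python
import math

def rasch(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def mix_rasch_prob(theta, class_weights, class_difficulties):
    """P(correct) at a fixed theta, mixing class-specific Rasch curves.

    class_weights:      mixing proportions pi_g (sum to 1), assumed here
                        to be unrelated to theta for simplicity
    class_difficulties: item difficulty b_g within each class
    """
    return sum(pi * rasch(theta, b)
               for pi, b in zip(class_weights, class_difficulties))

# Two classes that differ in how hard they find this item
print(mix_rasch_prob(0.0, [0.6, 0.4], [-0.5, 1.0]))
```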
Peer reviewed
PDF on ERIC
Yormaz, Seha; Sünbül, Önder – Educational Sciences: Theory and Practice, 2017
This study aims to determine the Type I error rates and power of the S₁ and S₂ indices and the kappa statistic for detecting copying on multiple-choice tests under various conditions. It also examines how the way copying groups are formed when calculating the kappa statistic affects its Type I error rates and power. In this study,…
Descriptors: Statistical Analysis, Cheating, Multiple Choice Tests, Sample Size
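As background, Cohen's kappa corrects the raw agreement between a suspected copier's and a source's answer strings for the agreement expected by chance. A minimal sketch, with the chance term computed from each examinee's option-use proportions; the example answer strings are made up.

```python
from collections import Counter

def kappa(answers_a, answers_b):
    """Cohen's kappa for agreement between two examinees' answer strings."""
    n = len(answers_a)
    p_obs = sum(a == b for a, b in zip(answers_a, answers_b)) / n
    freq_a = Counter(answers_a)
    freq_b = Counter(answers_b)
    options = set(freq_a) | set(freq_b)
    p_exp = sum((freq_a[k] / n) * (freq_b[k] / n) for k in options)
    return (p_obs - p_exp) / (1.0 - p_exp)

print(kappa("ABCDABCDAB", "ABCDABCDCC"))
```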
Peer reviewed
Qiu, Yuxi; Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2019
This study aimed to assess the accuracy of the empirical item characteristic curve (EICC) preequating method given the presence of test speededness. The simulation design of this study considered the proportion of speededness, speededness point, speededness rate, proportion of missing on speeded items, sample size, and test length. After crossing…
Descriptors: Accuracy, Equated Scores, Test Items, Nonparametric Statistics
Peer reviewed
Lee, Soo; Bulut, Okan; Suh, Youngsuk – Educational and Psychological Measurement, 2017
A number of studies have found multiple indicators multiple causes (MIMIC) models to be an effective tool in detecting uniform differential item functioning (DIF) for individual items and item bundles. A recently developed MIMIC-interaction model is capable of detecting both uniform and nonuniform DIF in the unidimensional item response theory…
Descriptors: Test Bias, Test Items, Models, Item Response Theory
Peer reviewed
Huang, Hung-Yu – Educational and Psychological Measurement, 2017
Mixture item response theory (IRT) models have been suggested as an efficient method of detecting the different response patterns derived from latent classes when developing a test. In testing situations, multiple latent traits measured by a battery of tests can exhibit a higher-order structure, and mixtures of latent classes may occur on…
Descriptors: Item Response Theory, Models, Bayesian Statistics, Computation
Peer reviewed
Svetina, Dubravka; Levy, Roy – Journal of Experimental Education, 2016
This study investigated the effect of complex structure on dimensionality assessment in compensatory multidimensional item response models using DETECT- and NOHARM-based methods. The performance was evaluated via the accuracy of identifying the correct number of dimensions and the ability to accurately recover item groupings using a simple…
Descriptors: Item Response Theory, Accuracy, Correlation, Sample Size
Peer reviewed
Paek, Insu – Educational and Psychological Measurement, 2016
The effect of guessing on the point estimate of coefficient alpha has been studied in the literature, but the impact of guessing and its interactions with other test characteristics on the interval estimators for coefficient alpha has not been fully investigated. This study examined the impact of guessing and its interactions with other test…
Descriptors: Guessing (Tests), Computation, Statistical Analysis, Test Length
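For reference, coefficient alpha is the point estimate the abstract builds on. The sketch below computes alpha from an examinees-by-items score matrix and adds a bootstrap percentile interval; the article evaluates analytic interval estimators, so the bootstrap here is only a generic stand-in to show what an interval estimator for alpha does.

```python
import numpy as np

def coefficient_alpha(scores):
    """Cronbach's alpha for an examinees-by-items score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_vars / total_var)

def alpha_percentile_ci(scores, n_boot=2000, level=0.95, seed=0):
    """Bootstrap percentile confidence interval for alpha (resampling examinees)."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    n = scores.shape[0]
    boots = [coefficient_alpha(scores[rng.integers(0, n, n)]) for _ in range(n_boot)]
    lo, hi = np.percentile(boots, [(1 - level) / 2 * 100, (1 + level) / 2 * 100])
    return lo, hi
```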
Peer reviewed
Lee, HyeSun; Geisinger, Kurt F. – Educational and Psychological Measurement, 2016
The current study investigated the impact of matching criterion purification on the accuracy of differential item functioning (DIF) detection in large-scale assessments. The three matching approaches for DIF analyses (block-level matching, pooled booklet matching, and equated pooled booklet matching) were employed with the Mantel-Haenszel…
Descriptors: Test Bias, Measurement, Accuracy, Statistical Analysis
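As background for the Mantel-Haenszel procedure the abstract uses: it compares reference and focal group odds of success within matched (typically total-score) strata; purifying the matching criterion means removing DIF items from that score and re-matching. A minimal sketch of the common odds ratio and its ETS delta-scale transformation, assuming each stratum is already summarized as correct/incorrect counts for the two groups; the example counts are made up.

```python
import math

def mh_ddif(strata):
    """Mantel-Haenszel D-DIF for one item.

    strata: list of tuples (ref_correct, ref_incorrect, foc_correct, foc_incorrect),
            one tuple per matched score group.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        t = a + b + c + d
        if t == 0:
            continue
        num += a * d / t
        den += b * c / t
    alpha_mh = num / den               # MH common odds ratio
    return -2.35 * math.log(alpha_mh)  # ETS delta metric; |D-DIF| around 1.5+ is conventionally treated as large DIF

# Example with three score strata
print(mh_ddif([(40, 10, 30, 20), (35, 15, 28, 22), (20, 30, 15, 35)]))
```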
Peer reviewed
Cao, Mengyang; Tay, Louis; Liu, Yaowu – Educational and Psychological Measurement, 2017
This study examined the performance of a proposed iterative Wald approach for detecting differential item functioning (DIF) between two groups when preknowledge of anchor items is absent. The iterative approach utilizes the Wald-2 approach to identify anchor items and then iteratively tests for DIF items with the Wald-1 approach. Monte Carlo…
Descriptors: Monte Carlo Methods, Test Items, Test Bias, Error of Measurement
Peer reviewed
Tay, Louis; Huang, Qiming; Vermunt, Jeroen K. – Educational and Psychological Measurement, 2016
In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…
Descriptors: Item Response Theory, Test Bias, Simulation, College Entrance Examinations