ERIC - Search Results

Showing 1 to 15 of 67 results

Effects of the Quantity and Magnitude of Cross-Loading and Model Specification on MIRT Item Parameter Recovery

Peer reviewed

Mostafa Hosseinzadeh; Ki Lynn Matlock Cole – Educational and Psychological Measurement, 2024

In real-world situations, multidimensional data may appear on large-scale tests or psychological surveys. The purpose of this study was to investigate the effects of the quantity and magnitude of cross-loadings and model specification on item parameter recovery in multidimensional Item Response Theory (MIRT) models, especially when the model was…

Descriptors: Item Response Theory, Models, Maximum Likelihood Statistics, Algorithms

Identifying Problematic Item Characteristics with Small Samples Using Mokken Scale Analysis

Peer reviewed

Wind, Stefanie A. – Educational and Psychological Measurement, 2022

Researchers frequently use Mokken scale analysis (MSA), which is a nonparametric approach to item response theory, when they have relatively small samples of examinees. Researchers have provided some guidance regarding the minimum sample size for applications of MSA under various conditions. However, these studies have not focused on item-level…

Descriptors: Nonparametric Statistics, Item Response Theory, Sample Size, Test Items

Assessing Dimensionality of IRT Models Using Traditional and Revised Parallel Analyses

Peer reviewed

Guo, Wenjing; Choi, Youn-Jeng – Educational and Psychological Measurement, 2023

Determining the number of dimensions is extremely important in applying item response theory (IRT) models to data. Traditional and revised parallel analyses have been proposed within the factor analysis framework, and both have shown some promise in assessing dimensionality. However, their performance in the IRT framework has not been…

Descriptors: Item Response Theory, Evaluation Methods, Factor Analysis, Guidelines

Investigating Heterogeneity in Response Strategies: A Mixture Multidimensional IRTree Approach

Peer reviewed

Ö. Emre C. Alagöz; Thorsten Meiser – Educational and Psychological Measurement, 2024

To improve the validity of self-report measures, researchers should control for response style (RS) effects, which can be achieved with IRTree models. A traditional IRTree model considers a response as a combination of distinct decision-making processes, where the substantive trait affects the decision on response direction, while decisions about…

Descriptors: Item Response Theory, Validity, Self Evaluation (Individuals), Decision Making

A New Stopping Criterion for Rasch Trees Based on the Mantel-Haenszel Effect Size Measure for Differential Item Functioning

Peer reviewed

Henninger, Mirka; Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2023

To detect differential item functioning (DIF), Rasch trees search for optimal split-points in covariates and identify subgroups of respondents in a data-driven way. To determine whether and in which covariate a split should be performed, Rasch trees use statistical significance tests. Consequently, Rasch trees are more likely to label small DIF…

Descriptors: Item Response Theory, Test Items, Effect Size, Statistical Significance

Investigating Confidence Intervals of Item Parameters When Some Item Parameters Take Priors in the 2PL and 3PL Models

Peer reviewed

Paek, Insu; Lin, Zhongtian; Chalmers, Robert Philip – Educational and Psychological Measurement, 2023

To reduce the chance of Heywood cases or nonconvergence in estimating the 2PL or the 3PL model in the marginal maximum likelihood with the expectation-maximization (MML-EM) estimation method, priors for the item slope parameter in the 2PL model or for the pseudo-guessing parameter in the 3PL model can be used and the marginal maximum a posteriori…

Descriptors: Models, Item Response Theory, Test Items, Intervals

Relative Robustness of CDMs and (M)IRT in Measuring Growth in Latent Skills

Peer reviewed

Huang, Qi; Bolt, Daniel M. – Educational and Psychological Measurement, 2023

Previous studies have demonstrated evidence of latent skill continuity even in tests intentionally designed for measurement of binary skills. In addition, the assumption of binary skills when continuity is present has been shown to potentially create a lack of invariance in item and latent ability parameters that may undermine applications. In…

Descriptors: Item Response Theory, Test Items, Skill Development, Robustness (Statistics)

Robustness of Adaptive Measurement of Change to Item Parameter Estimation Error

Peer reviewed

Cooperman, Allison W.; Weiss, David J.; Wang, Chun – Educational and Psychological Measurement, 2022

Adaptive measurement of change (AMC) is a psychometric method for measuring intra-individual change on one or more latent traits across testing occasions. Three hypothesis tests--a Z test, likelihood ratio test, and score ratio index--have demonstrated desirable statistical properties in this context, including low false positive rates and high…

Descriptors: Error of Measurement, Psychometrics, Hypothesis Testing, Simulation

Diagnostic Classification Model for Forced-Choice Items and Noncognitive Tests

Peer reviewed

Huang, Hung-Yu – Educational and Psychological Measurement, 2023

The forced-choice (FC) item formats used for noncognitive tests typically develop a set of response options that measure different traits and instruct respondents to make judgments among these options in terms of their preference to control the response biases that are commonly observed in normative tests. Diagnostic classification models (DCMs)…

Descriptors: Test Items, Classification, Bayesian Statistics, Decision Making

Hybrid Threshold-Based Sequential Procedures for Detecting Compromised Items in a Computerized Adaptive Testing Licensure Exam

Peer reviewed

Lee, Chansoon; Qian, Hong – Educational and Psychological Measurement, 2022

Using classical test theory and item response theory, this study applied sequential procedures to a real operational item pool in a variable-length computerized adaptive testing (CAT) to detect items whose security may be compromised. Moreover, this study proposed a hybrid threshold approach to improve the detection power of the sequential…

Descriptors: Computer Assisted Testing, Adaptive Testing, Licensing Examinations (Professions), Item Response Theory

A Regression Discontinuity Design Framework for Controlling Selection Bias in Evaluations of Differential Item Functioning

Peer reviewed

Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022

Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…

Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations

Developing Multistage Tests Using "D"-Scoring Method

Peer reviewed

Han, Kyung T.; Dimitrov, Dimiter M.; Al-Mashary, Faisal – Educational and Psychological Measurement, 2019

The "D"-scoring method for scoring and equating tests with binary items proposed by Dimitrov offers some of the advantages of item response theory, such as item-level difficulty information and score computation that reflects the item difficulties, while retaining the merits of classical test theory such as the simplicity of number…

Descriptors: Test Construction, Scoring, Test Items, Adaptive Testing

A Log-Linear Modeling Approach for Differential Item Functioning Detection in Polytomously Scored Items

Peer reviewed

Yesiltas, Gonca; Paek, Insu – Educational and Psychological Measurement, 2020

A log-linear model (LLM) is a well-known statistical method to examine the relationship among categorical variables. This study investigated the performance of LLM in detecting differential item functioning (DIF) for polytomously scored items via simulations where various sample sizes, ability mean differences (impact), and DIF types were…

Descriptors: Simulation, Sample Size, Item Analysis, Scores

A Graphical Method for Displaying the Model Fit of Item Response Theory Trace Lines

Peer reviewed

Kalinowski, Steven T. – Educational and Psychological Measurement, 2019

Item response theory (IRT) is a statistical paradigm for developing educational tests and assessing students. IRT, however, currently lacks an established graphical method for examining model fit for the three-parameter logistic model, the most flexible and popular IRT model in educational testing. A method is presented here to do this. The graph,…

Descriptors: Item Response Theory, Educational Assessment, Goodness of Fit, Probability

A Bayesian Random Block Item Response Theory Model for Forced-Choice Formats

Peer reviewed

Lee, HyeSun; Smith, Weldon Z. – Educational and Psychological Measurement, 2020

Based on the framework of testlet models, the current study suggests the Bayesian random block item response theory (BRB IRT) model to fit forced-choice formats where an item block is composed of three or more items. To account for local dependence among items within a block, the BRB IRT model incorporated a random block effect into the response…

Descriptors: Bayesian Statistics, Item Response Theory, Monte Carlo Methods, Test Format
