ERIC - Search Results

Showing 1 to 15 of 17 results

Extending an Identified Four-Parameter IRT Model: The Confirmatory Set-4PNO Model

Peer reviewed

Kern, Justin L. – Journal of Educational and Behavioral Statistics, 2024

Given the frequent presence of slipping and guessing in item responses, models for the inclusion of their effects are highly important. Unfortunately, the most common model for their inclusion, the four-parameter item response theory model, potentially has severe deficiencies related to its possible unidentifiability. With this issue in mind, the…

Descriptors: Item Response Theory, Models, Bayesian Statistics, Generalization

Item Pool Quality Control in Educational Testing: Change Point Model, Compound Risk, and Sequential Detection

Peer reviewed

Chen, Yunxiao; Lee, Yi-Hsuan; Li, Xiaoou – Journal of Educational and Behavioral Statistics, 2022

In standardized educational testing, test items are reused in multiple test administrations. To ensure the validity of test scores, the psychometric properties of items should remain unchanged over time. In this article, we consider the sequential monitoring of test items, in particular, the detection of abrupt changes to their psychometric…

Descriptors: Standardized Tests, Test Items, Test Validity, Scores

Using JAGS for Bayesian Cognitive Diagnosis Modeling: A Tutorial

Peer reviewed

Zhan, Peida; Jiao, Hong; Man, Kaiwen; Wang, Lijun – Journal of Educational and Behavioral Statistics, 2019

In this article, we systematically introduce the just another Gibbs sampler (JAGS) software program to fit common Bayesian cognitive diagnosis models (CDMs) including the deterministic inputs, noisy "and" gate model; the deterministic inputs, noisy "or" gate model; the linear logistic model; the reduced reparameterized unified…

Descriptors: Bayesian Statistics, Computer Software, Models, Test Items

A Bayesian Item Response Model for Examining Item Position Effects in Complex Survey Data

Peer reviewed

Trendtel, Matthias; Robitzsch, Alexander – Journal of Educational and Behavioral Statistics, 2021

A multidimensional Bayesian item response model is proposed for modeling item position effects. The first dimension corresponds to the ability that is to be measured; the second dimension represents a factor that allows for individual differences in item position effects called persistence. This model allows for nonlinear item position effects on…

Descriptors: Bayesian Statistics, Item Response Theory, Test Items, Test Format

Detection of Differential Item Functioning Using the Lasso Approach

Peer reviewed

Magis, David; Tuerlinckx, Francis; De Boeck, Paul – Journal of Educational and Behavioral Statistics, 2015

This article proposes a novel approach to detect differential item functioning (DIF) among dichotomously scored items. Unlike standard DIF methods that perform an item-by-item analysis, we propose the "LR lasso DIF method": a logistic regression (LR) model is formulated for all item responses. The model contains item-specific intercepts,…

Descriptors: Test Bias, Test Items, Regression (Statistics), Scores

A Comparative Study of Online Item Calibration Methods in Multidimensional Computerized Adaptive Testing

Peer reviewed

Chen, Ping – Journal of Educational and Behavioral Statistics, 2017

Calibration of new items online has been an important topic in item replenishment for multidimensional computerized adaptive testing (MCAT). Several online calibration methods have been proposed for MCAT, such as multidimensional "one expectation-maximization (EM) cycle" (M-OEM) and multidimensional "multiple EM cycles"…

Descriptors: Test Items, Item Response Theory, Test Construction, Adaptive Testing

Improving Mantel-Haenszel DIF Estimation through Bayesian Updating

Peer reviewed

Zwick, Rebecca; Ye, Lei; Isham, Steven – Journal of Educational and Behavioral Statistics, 2012

This study demonstrates how the stability of Mantel-Haenszel (MH) DIF (differential item functioning) methods can be improved by integrating information across multiple test administrations using Bayesian updating (BU). The authors conducted a simulation that showed that this approach, which is based on earlier work by Zwick, Thayer, and Lewis,…

Descriptors: Test Bias, Computation, Statistical Analysis, Bayesian Statistics

A Semiparametric Model for Jointly Analyzing Response Times and Accuracy in Computerized Testing

Peer reviewed

Wang, Chun; Fan, Zhewen; Chang, Hua-Hua; Douglas, Jeffrey A. – Journal of Educational and Behavioral Statistics, 2013

The item response times (RTs) collected from computerized testing represent an underutilized type of information about items and examinees. In addition to knowing the examinees' responses to each item, we can investigate the amount of time examinees spend on each item. Current models for RTs mainly focus on parametric models, which have the…

Descriptors: Reaction Time, Computer Assisted Testing, Test Items, Accuracy

A Bayesian Method for Studying DIF: A Cautionary Tale Filled with Surprises and Delights

Peer reviewed

Wang, Xiaohui; Bradlow, Eric T.; Wainer, Howard; Muller, Eric S. – Journal of Educational and Behavioral Statistics, 2008

In the course of screening a form of a medical licensing exam for items that function differentially (DIF) between men and women, the authors used the traditional Mantel-Haenszel (MH) statistic for initial screening and a Bayesian method for deeper analysis. For very easy items, the MH statistic unexpectedly often found DIF where there was none.…

Descriptors: Bayesian Statistics, Licensing Examinations (Professions), Medicine, Test Items

Using Response Times for Item Selection in Adaptive Testing

Peer reviewed

van der Linden, Wim J. – Journal of Educational and Behavioral Statistics, 2008

Response times on items can be used to improve item selection in adaptive testing provided that a probabilistic model for their distribution is available. In this research, the author used a hierarchical modeling framework with separate first-level models for the responses and response times and a second-level model for the distribution of the…

Descriptors: Reaction Time, Law Schools, Adaptive Testing, Item Analysis

Using Loss Functions for DIF Detection: An Empirical Bayes Approach

Peer reviewed

Zwick, Rebecca; Thayer, Dorothy; Lewis, Charles – Journal of Educational and Behavioral Statistics, 2000

Studied a method for flagging differential item functioning (DIF) based on loss functions. Builds on earlier research that led to the development of an empirical Bayes enhancement to the Mantel-Haenszel DIF analysis. Tested the method through simulation and found its performance better than some commonly used DIF classification systems. (SLD)

Descriptors: Bayesian Statistics, Identification, Item Bias, Simulation

Covariates of the Rating Process in Hierarchical Models for Multiple Ratings of Test Items

Peer reviewed

Mariano, Louis T.; Junker, Brian W. – Journal of Educational and Behavioral Statistics, 2007

When constructed response test items are scored by more than one rater, the repeated ratings allow for the consideration of individual rater bias and variability in estimating student proficiency. Several hierarchical models based on item response theory have been introduced to model such effects. In this article, the authors demonstrate how these…

Descriptors: Test Items, Item Response Theory, Rating Scales, Scoring

Calibrating Item Families and Summarizing the Results Using Family Expected Response Functions

Peer reviewed

Sinharay, Sandip; Johnson, Matthew S.; Williamson, David M. – Journal of Educational and Behavioral Statistics, 2003

Item families, which are groups of related items, are becoming increasingly popular in complex educational assessments. For example, in automatic item generation (AIG) systems, a test may consist of multiple items generated from each of a number of item models. Item calibration or scoring for such an assessment requires fitting models that can…

Descriptors: Test Items, Markov Processes, Educational Testing, Probability

Applications and Extensions of MCMC in IRT: Multiple Item Types, Missing Data, and Rated Responses

Peer reviewed

Patz, Richard J.; Junker, Brian W. – Journal of Educational and Behavioral Statistics, 1999

Extends the basic Markov chain Monte Carlo (MCMC) strategy of R. Patz and B. Junker (1999) for Bayesian inference in complex Item Response Theory settings to address issues such as nonresponse, designed missingness, multiple raters, guessing behaviors, and partial credit (polytomous) test items. Applies the MCMC method to data from the National…

Descriptors: Bayesian Statistics, Item Response Theory, Markov Processes, Monte Carlo Methods

Some New Item Selection Criteria for Adaptive Testing

Peer reviewed

Berger, Martijn P. F.; Veerkamp, Wim J. J. – Journal of Educational and Behavioral Statistics, 1997

Some alternative criteria for item selection in adaptive testing are proposed that take into account uncertainty in the ability estimates. A simulation study shows that the likelihood weighted information criterion is a good alternative to the maximum information criterion. Another good alternative uses a Bayesian expected a posteriori estimator.…

Descriptors: Ability, Adaptive Testing, Bayesian Statistics, Computer Assisted Testing
