ERIC - Search Results

Showing 1 to 15 of 23 results

Analyzing Polytomous Test Data: A Comparison between an Information-Based IRT Model and the Generalized Partial Credit Model

Peer reviewed

Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024

Item response theory (IRT) models the relationship between the possible scores on a test item and a test taker's attainment of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…

Descriptors: Item Response Theory, Test Items, Models, Scoring

Using Item Scores and Distractors to Detect Item Compromise and Preknowledge

Peer reviewed

Gorney, Kylie; Wollack, James A.; Sinharay, Sandip; Eckerly, Carol – Journal of Educational and Behavioral Statistics, 2023

Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item…

Descriptors: Scores, Test Validity, Test Items, Prior Learning

Estimating Difference-Score Reliability in Pretest-Posttest Settings

Peer reviewed

Gu, Zhengguo; Emons, Wilco H. M.; Sijtsma, Klaas – Journal of Educational and Behavioral Statistics, 2021

Clinical, medical, and health psychologists use difference scores obtained from pretest-posttest designs employing the same test to assess intraindividual change possibly caused by an intervention addressing, for example, anxiety, depression, eating disorder, or addiction. Reliability of difference scores is important for interpreting observed…

Descriptors: Test Reliability, Scores, Pretests Posttests, Computation

A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Peer reviewed

Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024

Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…

Descriptors: Semantics, Educational Assessment, Evaluators, Reliability

Optimizing the Use of Response Times for Item Selection in Computerized Adaptive Testing

Peer reviewed

Choe, Edison M.; Kern, Justin L.; Chang, Hua-Hua – Journal of Educational and Behavioral Statistics, 2018

Despite common operationalization, measurement efficiency of computerized adaptive testing should not only be assessed in terms of the number of items administered but also the time it takes to complete the test. To this end, a recent study introduced a novel item selection criterion that maximizes Fisher information per unit of expected response…

Descriptors: Computer Assisted Testing, Reaction Time, Item Response Theory, Test Items

Testing Latent Variable Distribution Fit in IRT Using Posterior Residuals

Peer reviewed

Monroe, Scott – Journal of Educational and Behavioral Statistics, 2021

This research proposes a new statistic for testing latent variable distribution fit for unidimensional item response theory (IRT) models. If the typical assumption of normality is violated, then item parameter estimates will be biased, and dependent quantities such as IRT score estimates will be adversely affected. The proposed statistic compares…

Descriptors: Item Response Theory, Simulation, Scores, Comparative Analysis

Item Response Modeling of Multivariate Count Data with Zero Inflation, Maximum Inflation, and Heaping

Peer reviewed

Magnus, Brooke E.; Thissen, David – Journal of Educational and Behavioral Statistics, 2017

Questionnaires that include items eliciting count responses are becoming increasingly common in psychology. This study proposes methodological techniques to overcome some of the challenges associated with analyzing multivariate item response data that exhibit zero inflation, maximum inflation, and heaping at preferred digits. The modeling…

Descriptors: Item Response Theory, Models, Multivariate Analysis, Questionnaires

Item Response Data Analysis Using Stata Item Response Theory Package

Peer reviewed

Yang, Ji Seung; Zheng, Xiaying – Journal of Educational and Behavioral Statistics, 2018

The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from the Programme for International Student Assessment, we review the IRT package from…

Descriptors: Item Response Theory, Item Analysis, Computer Software, Statistical Analysis

A Comparative Study of Online Item Calibration Methods in Multidimensional Computerized Adaptive Testing

Peer reviewed

Chen, Ping – Journal of Educational and Behavioral Statistics, 2017

Calibration of new items online has been an important topic in item replenishment for multidimensional computerized adaptive testing (MCAT). Several online calibration methods have been proposed for MCAT, such as multidimensional "one expectation-maximization (EM) cycle" (M-OEM) and multidimensional "multiple EM cycles"…

Descriptors: Test Items, Item Response Theory, Test Construction, Adaptive Testing

Modeling Information Accumulation in Psychological Tests Using Item Response Times

Peer reviewed

Ranger, Jochen; Kuhn, Jörg-Tobias – Journal of Educational and Behavioral Statistics, 2015

In this article, a latent trait model is proposed for the response times in psychological tests. The latent trait model is based on the linear transformation model and subsumes popular models from survival analysis, like the proportional hazards model and the proportional odds model. Core of the model is the assumption that an unspecified monotone…

Descriptors: Psychological Testing, Reaction Time, Statistical Analysis, Models

Item-Weighted Likelihood Method for Ability Estimation in Tests Composed of Both Dichotomous and Polytomous Items

Peer reviewed

Tao, Jian; Shi, Ning-Zhong; Chang, Hua-Hua – Journal of Educational and Behavioral Statistics, 2012

For mixed-type tests composed of both dichotomous and polytomous items, polytomous items often yield more information than dichotomous ones. To reflect the difference between the two types of items, polytomous items are usually pre-assigned with larger weights. We propose an item-weighted likelihood method to better assess examinees' ability…

Descriptors: Test Items, Weighted Scores, Maximum Likelihood Statistics, Statistical Bias

Screening Test Items for Differential Item Functioning

Peer reviewed

Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014

A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…

Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing

Robust Estimation of Latent Ability in Item Response Models

Peer reviewed

Schuster, Christof; Yuan, Ke-Hai – Journal of Educational and Behavioral Statistics, 2011

Because of response disturbances such as guessing, cheating, or carelessness, item response models often can only approximate the "true" individual response probabilities. As a consequence, maximum-likelihood estimates of ability will be biased. Typically, the nature and extent to which response disturbances are present is unknown, and, therefore,…

Descriptors: Computation, Item Response Theory, Probability, Maximum Likelihood Statistics

A Bayesian Method for Studying DIF: A Cautionary Tale Filled with Surprises and Delights

Peer reviewed

Wang, Xiaohui; Bradlow, Eric T.; Wainer, Howard; Muller, Eric S. – Journal of Educational and Behavioral Statistics, 2008

In the course of screening a form of a medical licensing exam for items that function differentially (DIF) between men and women, the authors used the traditional Mantel-Haenszel (MH) statistic for initial screening and a Bayesian method for deeper analysis. For very easy items, the MH statistic unexpectedly often found DIF where there was none.…

Descriptors: Bayesian Statistics, Licensing Examinations (Professions), Medicine, Test Items

Using Loss Functions for DIF Detection: An Empirical Bayes Approach

Peer reviewed

Zwick, Rebecca; Thayer, Dorothy; Lewis, Charles – Journal of Educational and Behavioral Statistics, 2000

Studied a method for flagging differential item functioning (DIF) based on loss functions. Builds on earlier research that led to the development of an empirical Bayes enhancement to the Mantel-Haenszel DIF analysis. Tested the method through simulation and found its performance better than some commonly used DIF classification systems. (SLD)

Descriptors: Bayesian Statistics, Identification, Item Bias, Simulation
