NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 7 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Gorney, Kylie; Wollack, James A. – Journal of Educational Measurement, 2023
In order to detect a wide range of aberrant behaviors, it can be useful to incorporate information beyond the dichotomous item scores. In this paper, we extend the l[subscript z] and l*[subscript z] person-fit statistics so that unusual behavior in item scores and unusual behavior in item distractors can be used as indicators of aberrance. Through…
Descriptors: Test Items, Scores, Goodness of Fit, Statistics
Peer reviewed Peer reviewed
Direct linkDirect link
Patton, Jeffrey M.; Cheng, Ying; Hong, Maxwell; Diao, Qi – Journal of Educational and Behavioral Statistics, 2019
In psychological and survey research, the prevalence and serious consequences of careless responses from unmotivated participants are well known. In this study, we propose to iteratively detect careless responders and cleanse the data by removing their responses. The careless responders are detected using person-fit statistics. In two simulation…
Descriptors: Test Items, Response Style (Tests), Identification, Computation
Peer reviewed Peer reviewed
Direct linkDirect link
Zhang, Jinming; Li, Jie – Journal of Educational Measurement, 2016
An IRT-based sequential procedure is developed to monitor items for enhancing test security. The procedure uses a series of statistical hypothesis tests to examine whether the statistical characteristics of each item under inspection have changed significantly during CAT administration. This procedure is compared with a previously developed…
Descriptors: Computer Assisted Testing, Test Items, Difficulty Level, Item Response Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Zu, Jiyun; Yuan, Ke-Hai – Journal of Educational Measurement, 2012
In the nonequivalent groups with anchor test (NEAT) design, the standard error of linear observed-score equating is commonly estimated by an estimator derived assuming multivariate normality. However, real data are seldom normally distributed, causing this normal estimator to be inconsistent. A general estimator, which does not rely on the…
Descriptors: Sample Size, Equated Scores, Test Items, Error of Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Cui, Zhongmin; Kolen, Michael J. – Applied Psychological Measurement, 2008
This article considers two methods of estimating standard errors of equipercentile equating: the parametric bootstrap method and the nonparametric bootstrap method. Using a simulation study, these two methods are compared under three sample sizes (300, 1,000, and 3,000), for two test content areas (the Iowa Tests of Basic Skills Maps and Diagrams…
Descriptors: Test Length, Test Content, Simulation, Computation
Peer reviewed Peer reviewed
Direct linkDirect link
Ferrao, Maria – Assessment & Evaluation in Higher Education, 2010
The Bologna Declaration brought reforms into higher education that imply changes in teaching methods, didactic materials and textbooks, infrastructures and laboratories, etc. Statistics and mathematics are disciplines that traditionally have the worst success rates, particularly in non-mathematics core curricula courses. This research project,…
Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Educational Assessment
Peer reviewed Peer reviewed
Direct linkDirect link
Wang, Wen-Chung; Chen, Cheng-Te – Educational and Psychological Measurement, 2005
This study investigates item parameter recovery, standard error estimates, and fit statistics yielded by the WINSTEPS program under the Rasch model and the rating scale model through Monte Carlo simulations. The independent variables were item response model, test length, and sample size. WINSTEPS yielded practically unbiased estimates for the…
Descriptors: Statistics, Test Length, Rating Scales, Item Response Theory