Showing all 14 results
Peer reviewed
Wan, Siyu; Keller, Lisa A. – Practical Assessment, Research & Evaluation, 2023
Statistical process control (SPC) charts have been widely used in the field of educational measurement. The cumulative sum (CUSUM) is an established SPC method to detect aberrant responses for educational assessments. There are many studies that investigated the performance of CUSUM in different test settings. This paper describes the CUSUM…
Descriptors: Visual Aids, Educational Assessment, Evaluation Methods, Item Response Theory
Peer reviewed
Fuchimoto, Kazuma; Ishii, Takatoshi; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2022
Educational assessments often require uniform test forms, for which each test form has equivalent measurement accuracy but with a different set of items. For uniform test assembly, an important issue is the increase of the number of assembled uniform tests. Although many automatic uniform test assembly methods exist, the maximum clique algorithm…
Descriptors: Simulation, Efficiency, Test Items, Educational Assessment
Peer reviewed
Yoo, Hanwook; Hambleton, Ronald K. – Educational Measurement: Issues and Practice, 2019
Item analysis is an integral part of operational test development and is typically conducted within two popular statistical frameworks: classical test theory (CTT) and item response theory (IRT). In this digital ITEMS module, Hanwook Yoo and Ronald K. Hambleton provide an accessible overview of operational item analysis approaches within these…
Descriptors: Item Analysis, Item Response Theory, Guidelines, Test Construction
Peer reviewed
Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020
Although the root-mean squared deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity to detect misfit may depend strongly on the proficiency distribution of the considered…
Descriptors: Test Items, Goodness of Fit, Probability, Accuracy
Peer reviewed
Aybek, Eren Can; Demirtasli, R. Nukhet – International Journal of Research in Education and Science, 2017
This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. Besides that, it aims to introduce the simulation and live CAT software to the related researchers. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…
Descriptors: Computer Assisted Testing, Adaptive Testing, Item Response Theory, Test Items
Peer reviewed
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Goldhaber, Dan; Chaplin, Duncan – Center for Education Data & Research, 2012
In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM…
Descriptors: School Effectiveness, Teacher Effectiveness, Achievement Gains, Statistical Bias
Peer reviewed
Wang, Wen-Chung; Huang, Sheng-Yun – Educational and Psychological Measurement, 2011
The one-parameter logistic model with ability-based guessing (1PL-AG) has been recently developed to account for the effect of ability on guessing behavior in multiple-choice items. In this study, the authors developed algorithms for computerized classification testing under the 1PL-AG and conducted a series of simulations to evaluate their…
Descriptors: Computer Assisted Testing, Classification, Item Analysis, Probability
Peer reviewed
Yurdugul, Halil – Applied Psychological Measurement, 2009
This article describes SIMREL, a software program designed for the simulation of alpha coefficients and the estimation of its confidence intervals. SIMREL runs on two alternatives. In the first one, if SIMREL is run for a single data file, it performs descriptive statistics, principal components analysis, and variance analysis of the item scores…
Descriptors: Intervals, Monte Carlo Methods, Computer Software, Factor Analysis
Peer reviewed
Wang, Xiaohui; Bradlow, Eric T.; Wainer, Howard; Muller, Eric S. – Journal of Educational and Behavioral Statistics, 2008
In the course of screening a form of a medical licensing exam for items that function differentially (DIF) between men and women, the authors used the traditional Mantel-Haenszel (MH) statistic for initial screening and a Bayesian method for deeper analysis. For very easy items, the MH statistic unexpectedly often found DIF where there was none.…
Descriptors: Bayesian Statistics, Licensing Examinations (Professions), Medicine, Test Items
Kobrin, Jennifer L.; Schmidt, Amy Elizabeth – College Board, 2007
This report provides a brief summary of the research projects that have been conducted to support the development of the new SAT.
Descriptors: College Entrance Examinations, Educational Research, Educational Change, Research Projects
Peer reviewed
Marsh, Herbert W.; Wen, Zhonglin; Hau, Kit-Tai – Psychological Methods, 2004
Interactions between (multiple indicator) latent variables are rarely used because of implementation complexity and competing strategies. Based on 4 simulation studies, the traditional constrained approach performed more poorly than did 3 new approaches: unconstrained, generalized appended product indicator, and quasi-maximum-likelihood (QML). The…
Descriptors: Structural Equation Models, Item Analysis, Error Patterns, Computation
Peer reviewed
Tsai, Tien-Lung; Shau, Wen-Yi; Hu, Fu-Chang – Structural Equation Modeling: A Multidisciplinary Journal, 2006
This article generalizes linear path analysis (PA) and simultaneous equations models (SiEM) to deal with mixed responses of different types in a recursive or triangular system. An efficient instrumental variable (IV) method for estimating the structural coefficients of a 2-equation partially recursive generalized path analysis (GPA) model and…
Descriptors: Structural Equation Models, Path Analysis, Simulation, Equations (Mathematics)
Peer reviewed
Segall, Daniel O. – Journal of Educational and Behavioral Statistics, 2004
A new sharing item response theory (SIRT) model is presented that explicitly models the effects of sharing item content between informants and test takers. This model is used to construct adaptive item selection and scoring rules that provide increased precision and reduced score gains in instances where sharing occurs. The adaptive item selection…
Descriptors: Scoring, Item Analysis, Item Response Theory, Adaptive Testing