Showing 1 to 15 of 24 results
Peer reviewed
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
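For readers unfamiliar with characteristic curve linking, the sketch below illustrates the general idea behind a Haebara-type criterion: find scale constants A and B that minimize the squared distance between item characteristic curves computed from the two sets of parameter estimates, optionally weighting that distance by item information. The weighting shown is only an illustrative assumption, not the information-weighted methods proposed by the authors.

```python
# Sketch of a Haebara-type characteristic-curve linking criterion for 2PL items,
# with an optional item-information weight.  Purely illustrative; this is not
# the authors' information-weighted method.
import numpy as np
from scipy.optimize import minimize

def p2pl(theta, a, b):
    """2PL item response surface: persons (rows) by items (columns)."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))

def haebara_loss(AB, a_new, b_new, a_old, b_old, theta, weight_by_info=False):
    A, B = AB
    a_star, b_star = a_new / A, A * b_new + B          # new form -> old scale
    p_star, p_old = p2pl(theta, a_star, b_star), p2pl(theta, a_old, b_old)
    w = a_old**2 * p_old * (1 - p_old) if weight_by_info else 1.0  # assumed weight
    return np.sum(w * (p_star - p_old) ** 2)

# Made-up common-item estimates whose scales differ by A = 1.1, B = -0.2.
rng = np.random.default_rng(0)
a_old, b_old = rng.uniform(0.8, 2.0, 10), rng.normal(0.0, 1.0, 10)
a_new, b_new = 1.1 * a_old, (b_old + 0.2) / 1.1
theta = np.linspace(-4, 4, 41)
res = minimize(haebara_loss, x0=[1.0, 0.0],
               args=(a_new, b_new, a_old, b_old, theta, True))
print(res.x)  # approximately [1.1, -0.2]
```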
Peer reviewed
Kim, Kyung Yong; Lee, Won-Chan – Applied Measurement in Education, 2017
This article provides a detailed description of three factors (specification of the ability distribution, numerical integration, and frame of reference for the item parameter estimates) that might affect the item parameter estimation of the three-parameter logistic model, and compares five item calibration methods, which are combinations of the…
Descriptors: Test Items, Item Response Theory, Comparative Analysis, Methods
Peer reviewed
Finch, Holmes; French, Brian F. – Applied Measurement in Education, 2019
The usefulness of item response theory (IRT) models depends, in large part, on the accuracy of item and person parameter estimates. For the standard 3 parameter logistic model, for example, these parameters include the item parameters of difficulty, discrimination, and pseudo-chance, as well as the person ability parameter. Several factors impact…
Descriptors: Item Response Theory, Accuracy, Test Items, Difficulty Level
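For reference, the three-parameter logistic (3PL) model named in this abstract gives the probability of a correct response as a function of ability and the three item parameters; a minimal version:

```python
# The standard three-parameter logistic (3PL) model: probability of a correct
# response given ability theta and item parameters a (discrimination),
# b (difficulty), and c (pseudo-chance lower asymptote).
import numpy as np

def p_3pl(theta, a, b, c):
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

print(p_3pl(theta=0.0, a=1.2, b=0.5, c=0.2))  # about 0.48
```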
Peer reviewed
Finch, W. Holmes – Applied Measurement in Education, 2016
Differential item functioning (DIF) assessment is a crucial component in test construction, serving as the primary way in which instrument developers ensure that measures perform in the same way for multiple groups within the population. When such is not the case, scores may not accurately reflect the trait of interest for all individuals in the…
Descriptors: Test Bias, Monte Carlo Methods, Comparative Analysis, Population Groups
Peer reviewed
Lee, Wooyeol; Cho, Sun-Joo – Applied Measurement in Education, 2017
Utilizing a longitudinal item response model, this study investigated the effect of item parameter drift (IPD) on item parameters and person scores via a Monte Carlo study. Item parameter recovery was investigated for various IPD patterns in terms of bias and root mean-square error (RMSE), and percentage of time the 95% confidence interval covered…
Descriptors: Item Response Theory, Test Items, Bias, Computation
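The recovery criteria mentioned in the abstract (bias, RMSE, and confidence-interval coverage) are standard simulation summaries; the function below is a generic sketch with illustrative variable names, not the study's code.

```python
# Generic recovery summaries across simulation replications: bias, RMSE, and
# the proportion of replications whose 95% confidence interval covers the
# generating (true) parameter value.  Variable names are illustrative.
import numpy as np

def recovery_summary(estimates, std_errors, true_value):
    estimates = np.asarray(estimates)
    std_errors = np.asarray(std_errors)
    bias = np.mean(estimates - true_value)
    rmse = np.sqrt(np.mean((estimates - true_value) ** 2))
    lower = estimates - 1.96 * std_errors
    upper = estimates + 1.96 * std_errors
    coverage = np.mean((lower <= true_value) & (true_value <= upper))
    return bias, rmse, coverage

# e.g. 100 replications of a difficulty estimate whose true value is 0.4
rng = np.random.default_rng(1)
print(recovery_summary(rng.normal(0.4, 0.1, 100), np.full(100, 0.1), 0.4))
```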
Peer reviewed
Liang, Tie; Wells, Craig S. – Applied Measurement in Education, 2015
Investigating the fit of a parametric model plays a vital role in validating an item response theory (IRT) model. An area that has received little attention is the assessment of multiple IRT models used in a mixed-format test. The present study extends the nonparametric approach, proposed by Douglas and Cohen (2001), to assess model fit of three…
Descriptors: Nonparametric Statistics, Goodness of Fit, Item Response Theory, Test Format
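As general background on nonparametric fit assessment of this kind: the parametric item characteristic curve is compared against a kernel-smoothed regression of item scores on ability estimates, and the discrepancy is summarized by a root integrated squared error. The bandwidth and summary below are simplified assumptions, not the study's implementation.

```python
# Simplified version of the idea: estimate a nonparametric item characteristic
# curve by kernel-smoothing item scores against ability estimates, then compare
# it with the parametric curve via a root integrated squared error.
import numpy as np

def kernel_icc(theta_hat, scores, grid, bandwidth=0.3):
    """Nadaraya-Watson regression of item scores on ability, evaluated on grid."""
    w = np.exp(-0.5 * ((grid[:, None] - theta_hat[None, :]) / bandwidth) ** 2)
    return (w * scores[None, :]).sum(axis=1) / w.sum(axis=1)

def rise(parametric_curve, nonparametric_curve):
    """Root integrated squared error between the two curves on the same grid."""
    return np.sqrt(np.mean((parametric_curve - nonparametric_curve) ** 2))
```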
Peer reviewed
Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018
Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…
Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education
Peer reviewed
Suh, Youngsuk; Talley, Anna E. – Applied Measurement in Education, 2015
This study compared and illustrated four differential distractor functioning (DDF) detection methods for analyzing multiple-choice items. The log-linear approach, two item response theory-model-based approaches with likelihood ratio tests, and the odds ratio approach were compared to examine the congruence among the four DDF detection methods.…
Descriptors: Test Bias, Multiple Choice Tests, Test Items, Methods
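A bare-bones illustration of the odds ratio idea for differential distractor functioning: among examinees who missed the item, compare the odds of choosing a particular distractor in the reference group versus the focal group. Real DDF analyses typically condition on total score or ability, which this sketch omits.

```python
# Odds ratio for one distractor, computed among examinees who answered the item
# incorrectly in the reference and focal groups.  Conditioning on total score
# or ability, as a real DDF analysis would do, is omitted here.
import numpy as np

def distractor_odds_ratio(wrong_ref, wrong_foc, distractor):
    """wrong_ref / wrong_foc: arrays of the wrong options each group selected."""
    a = np.sum(wrong_ref == distractor)   # reference group, target distractor
    b = np.sum(wrong_ref != distractor)   # reference group, other distractors
    c = np.sum(wrong_foc == distractor)   # focal group, target distractor
    d = np.sum(wrong_foc != distractor)   # focal group, other distractors
    return (a * d) / (b * c)

print(distractor_odds_ratio(np.array(list("BBCBD")), np.array(list("CCCBD")), "B"))  # 6.0
```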
Peer reviewed
Koziol, Natalie A. – Applied Measurement in Education, 2016
Testlets, or groups of related items, are commonly included in educational assessments due to their many logistical and conceptual advantages. Despite their advantages, testlets introduce complications into the theory and practice of educational measurement. Responses to items within a testlet tend to be correlated even after controlling for…
Descriptors: Classification, Accuracy, Comparative Analysis, Models
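The local dependence described above can be made concrete with a toy simulation: a person-by-testlet effect is added to ability when generating responses, so items in the same testlet remain positively associated even after the common ability is accounted for. A Rasch-type testlet model is assumed here purely for illustration.

```python
# Toy simulation of testlet-induced local dependence: a person-by-testlet effect
# gamma is added to ability when generating responses, so residuals from a model
# that ignores the testlet effect stay correlated across items in the testlet.
import numpy as np

rng = np.random.default_rng(2)
n_persons, n_items = 2000, 4
theta = rng.normal(0.0, 1.0, n_persons)          # ability
gamma = rng.normal(0.0, 0.8, n_persons)          # person-specific testlet effect
b = rng.normal(0.0, 1.0, n_items)                # difficulties of testlet items

p_true = 1 / (1 + np.exp(-(theta[:, None] + gamma[:, None] - b[None, :])))
x = rng.binomial(1, p_true)                      # observed responses

# Residuals from a model that conditions on ability but ignores the testlet:
p_no_testlet = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
resid = x - p_no_testlet
print(np.corrcoef(resid[:, 0], resid[:, 1])[0, 1])   # clearly positive
```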
Peer reviewed
Ho, Tsung-Han; Dodd, Barbara G. – Applied Measurement in Education, 2012
In this study we compared five item selection procedures using three ability estimation methods in the context of a mixed-format adaptive test based on the generalized partial credit model. The item selection procedures used were maximum posterior weighted information, maximum expected information, maximum posterior weighted Kullback-Leibler…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection
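As a simplified stand-in for the selection criteria compared in the study, the sketch below picks the unused item with the largest Fisher information at the provisional ability estimate under the generalized partial credit model; the posterior-weighted and Kullback-Leibler criteria themselves are not reproduced.

```python
# Simplified information-based item selection under the generalized partial
# credit model (GPCM): choose the unused item with the largest Fisher
# information at the provisional ability estimate.
import numpy as np

def gpcm_probs(theta, a, b_steps):
    """Category probabilities for one GPCM item (b_steps are step parameters)."""
    cum = np.concatenate(([0.0], np.cumsum(a * (theta - b_steps))))
    e = np.exp(cum - cum.max())
    return e / e.sum()

def gpcm_info(theta, a, b_steps):
    p = gpcm_probs(theta, a, b_steps)
    k = np.arange(len(p))
    return a**2 * (np.sum(k**2 * p) - np.sum(k * p) ** 2)

def select_item(theta_hat, item_bank, administered):
    """item_bank: list of (a, b_steps) tuples; returns the most informative unused item."""
    info = [gpcm_info(theta_hat, a, b) if i not in administered else -np.inf
            for i, (a, b) in enumerate(item_bank)]
    return int(np.argmax(info))

bank = [(1.0, np.array([-0.5, 0.5])),
        (1.4, np.array([0.0, 1.0])),
        (0.8, np.array([-1.0, 0.0]))]
print(select_item(theta_hat=0.2, item_bank=bank, administered={1}))
```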
Peer reviewed
Murphy, Daniel L.; Beretvas, S. Natasha – Applied Measurement in Education, 2015
This study examines the use of cross-classified random effects models (CCrem) and cross-classified multiple membership random effects models (CCMMrem) to model rater bias and estimate teacher effectiveness. Effect estimates are compared using classical test theory (CTT) versus item response theory (IRT) scaling methods and three models (i.e., conventional multilevel…
Descriptors: Teacher Effectiveness, Comparative Analysis, Hierarchical Linear Modeling, Test Theory
Peer reviewed
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Do subscores provide additional information beyond what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
Peer reviewed
Finch, Holmes; Stage, Alan Kirk; Monahan, Patrick – Applied Measurement in Education, 2008
A primary assumption underlying several of the common methods for modeling item response data is unidimensionality, that is, that test items tap into only one latent trait. This assumption can be assessed in several ways, including nonlinear factor analysis and DETECT, a method based on the item conditional covariances. When multidimensionality is identified,…
Descriptors: Test Items, Factor Analysis, Item Response Theory, Comparative Analysis
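For context, the item conditional covariances that DETECT builds on can be approximated as below: covariances between an item pair computed within groups of examinees who share the same rest score, averaged over groups. The full DETECT index, which averages signed conditional covariances over a cluster partition, is not implemented here.

```python
# Weighted-average conditional covariance for an item pair, the building block
# of DETECT-style dimensionality analysis.
import numpy as np

def conditional_covariance(x, i, j):
    """x: persons-by-items 0/1 matrix; returns cov(item i, item j | rest score)."""
    rest = np.delete(x, [i, j], axis=1).sum(axis=1)
    covs, weights = [], []
    for s in np.unique(rest):
        grp = rest == s
        if grp.sum() > 1:
            covs.append(np.cov(x[grp, i], x[grp, j])[0, 1])
            weights.append(grp.sum())
    return np.average(covs, weights=weights)
```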
Peer reviewed
Lee, Won-Chan; Ban, Jae-Chun – Applied Measurement in Education, 2010
Various applications of item response theory often require linking to achieve a common scale for item parameter estimates obtained from different groups. This article used a simulation to examine the relative performance of four different item response theory (IRT) linking procedures in a random groups equating design: concurrent calibration with…
Descriptors: Item Response Theory, Simulation, Comparative Analysis, Measurement Techniques
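As general background on separate-calibration linking (not a description of this study's four procedures), one common step is the mean/sigma method: scale constants are derived from the means and standard deviations of common-item difficulty estimates and then applied to the new form's parameters.

```python
# Mean/sigma scale transformation for separate-calibration IRT linking.
# Shown only as background; the study's procedures are not reproduced.
import numpy as np

def mean_sigma(b_old, b_new):
    """Constants that put new-form estimates onto the old-form metric."""
    A = np.std(b_old, ddof=1) / np.std(b_new, ddof=1)
    B = np.mean(b_old) - A * np.mean(b_new)
    return A, B

def rescale(a_new, b_new, A, B):
    return a_new / A, A * b_new + B

b_old = np.array([-1.2, -0.3, 0.4, 1.1])
b_new = np.array([-1.0, -0.2, 0.5, 1.2])
A, B = mean_sigma(b_old, b_new)
print(A, B)
print(rescale(np.array([1.0, 1.3, 0.9, 1.5]), b_new, A, B))
```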
Peer reviewed
Noell, Jay; Ginsburg, Alan – Applied Measurement in Education, 2009
The report, "Evaluation of the National Assessment of Educational Progress", provides a number of recommendations for addressing validity concerns about NAEP. This article identifies actions that could be taken by the Congress, the National Center for Education Statistics, and the National Assessment Governing Board--which share responsibility for…
Descriptors: National Competency Tests, Federal Government, Public Agencies, Test Validity