Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 9 |
Since 2006 (last 20 years) | 24 |
Descriptor
Comparative Analysis | 40 |
Simulation | 30 |
Item Response Theory | 23 |
Test Items | 15 |
Computer Simulation | 10 |
Computer Assisted Testing | 9 |
Models | 9 |
Evaluation Methods | 8 |
Computation | 7 |
Sample Size | 6 |
Scores | 6 |
More ▼ |
Source
Journal of Educational… | 40 |
Author
Kim, Seonghoon | 2 |
Nandakumar, Ratna | 2 |
Suh, Youngsuk | 2 |
Bejar, Isaac I. | 1 |
Burket, George R. | 1 |
Cai, Li | 1 |
Chang, Hua-Hua | 1 |
Chen, Ping | 1 |
Chen, Shu-Ying | 1 |
Cho, Sun-Joo | 1 |
Choi, Seung W. | 1 |
More ▼ |
Publication Type
Journal Articles | 40 |
Reports - Research | 25 |
Reports - Evaluative | 15 |
Speeches/Meeting Papers | 4 |
Education Level
Elementary Secondary Education | 1 |
Secondary Education | 1 |
Audience
Location
Laws, Policies, & Programs
Assessments and Surveys
Indiana Statewide Testing for… | 1 |
National Assessment of… | 1 |
Program for International… | 1 |
Trends in International… | 1 |
What Works Clearinghouse Rating
Zhang, Xue; Tao, Jian; Wang, Chun; Shi, Ning-Zhong – Journal of Educational Measurement, 2019
Model selection is important in any statistical analysis, and the primary goal is to find the preferred (or most parsimonious) model, based on certain criteria, from a set of candidate models given data. Several recent publications have employed the deviance information criterion (DIC) to do model selection among different forms of multilevel item…
Descriptors: Bayesian Statistics, Item Response Theory, Measurement, Models
Zhang, Zhonghua; Zhao, Mingren – Journal of Educational Measurement, 2019
The present study evaluated the multiple imputation method, a procedure that is similar to the one suggested by Li and Lissitz (2004), and compared the performance of this method with that of the bootstrap method and the delta method in obtaining the standard errors for the estimates of the parameter scale transformation coefficients in item…
Descriptors: Item Response Theory, Error Patterns, Item Analysis, Simulation
Luo, Xiao; Kim, Doyoung – Journal of Educational Measurement, 2018
The top-down approach to designing a multistage test is relatively understudied in the literature and underused in research and practice. This study introduced a route-based top-down design approach that directly sets design parameters at the test level and utilizes the advanced automated test assembly algorithm seeking global optimality. The…
Descriptors: Computer Assisted Testing, Test Construction, Decision Making, Simulation
Feuerstahler, Leah; Wilson, Mark – Journal of Educational Measurement, 2019
Scores estimated from multidimensional item response theory (IRT) models are not necessarily comparable across dimensions. In this article, the concept of aligned dimensions is formalized in the context of Rasch models, and two methods are described--delta dimensional alignment (DDA) and logistic regression alignment (LRA)--to transform estimated…
Descriptors: Item Response Theory, Models, Scores, Comparative Analysis
Fitzpatrick, Joseph; Skorupski, William P. – Journal of Educational Measurement, 2016
The equating performance of two internal anchor test structures--miditests and minitests--is studied for four IRT equating methods using simulated data. Originally proposed by Sinharay and Holland, miditests are anchors that have the same mean difficulty as the overall test but less variance in item difficulties. Four popular IRT equating methods…
Descriptors: Difficulty Level, Test Items, Comparative Analysis, Test Construction
Lee, Soo; Suh, Youngsuk – Journal of Educational Measurement, 2018
Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect…
Descriptors: Item Response Theory, Sample Size, Models, Error of Measurement
Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang – Journal of Educational Measurement, 2015
Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…
Descriptors: Classification, Reliability, Accuracy, Cognitive Tests
Häggström, Jenny; Wiberg, Marie – Journal of Educational Measurement, 2014
The selection of bandwidth in kernel equating is important because it has a direct impact on the equated test scores. The aim of this article is to examine the use of double smoothing when selecting bandwidths in kernel equating and to compare double smoothing with the commonly used penalty method. This comparison was made using both an equivalent…
Descriptors: Equated Scores, Data Analysis, Comparative Analysis, Simulation
Moses, Tim – Journal of Educational Measurement, 2013
The purpose of this study was to evaluate the use of adjoined and piecewise linear approximations (APLAs) of raw equipercentile equating functions as a postsmoothing equating method. APLAs are less familiar than other postsmoothing equating methods (i.e., cubic splines), but their use has been described in historical equating practices of…
Descriptors: Equated Scores, Accuracy, Simulation, Comparative Analysis
Falk, Carl F.; Cai, Li – Journal of Educational Measurement, 2016
We present a logistic function of a monotonic polynomial with a lower asymptote, allowing additional flexibility beyond the three-parameter logistic model. We develop a maximum marginal likelihood-based approach to estimate the item parameters. The new item response model is demonstrated on math assessment data from a state, and a computationally…
Descriptors: Item Response Theory, Guessing (Tests), Mathematics Tests, Simulation
Sinharay, Sandip; Wan, Ping; Choi, Seung W.; Kim, Dong-In – Journal of Educational Measurement, 2015
With an increase in the number of online tests, the number of interruptions during testing due to unexpected technical issues seems to be on the rise. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. Researchers such as…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Statistical Analysis
Tendeiro, Jorge N.; Meijer, Rob R. – Journal of Educational Measurement, 2014
In recent guidelines for fair educational testing it is advised to check the validity of individual test scores through the use of person-fit statistics. For practitioners it is unclear on the basis of the existing literature which statistic to use. An overview of relatively simple existing nonparametric approaches to identify atypical response…
Descriptors: Educational Assessment, Test Validity, Scores, Statistical Analysis
Pohl, Steffi – Journal of Educational Measurement, 2013
This article introduces longitudinal multistage testing (lMST), a special form of multistage testing (MST), as a method for adaptive testing in longitudinal large-scale studies. In lMST designs, test forms of different difficulty levels are used, whereas the values on a pretest determine the routing to these test forms. Since lMST allows for…
Descriptors: Adaptive Testing, Longitudinal Studies, Difficulty Level, Comparative Analysis
Zhang, Jinming; Li, Jie – Journal of Educational Measurement, 2016
An IRT-based sequential procedure is developed to monitor items for enhancing test security. The procedure uses a series of statistical hypothesis tests to examine whether the statistical characteristics of each item under inspection have changed significantly during CAT administration. This procedure is compared with a previously developed…
Descriptors: Computer Assisted Testing, Test Items, Difficulty Level, Item Response Theory
Hou, Likun; de la Torre, Jimmy; Nandakumar, Ratna – Journal of Educational Measurement, 2014
Analyzing examinees' responses using cognitive diagnostic models (CDMs) has the advantage of providing diagnostic information. To ensure the validity of the results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this article, the Wald test is proposed to examine DIF in the context of CDMs. This study…
Descriptors: Test Bias, Models, Simulation, Error Patterns