Showing all 11 results
Peer reviewed
Xiao, Leifeng; Hau, Kit-Tai – Applied Measurement in Education, 2023
We compared coefficient alpha with five alternatives (omega total, omega RT, omega h, GLB, and coefficient H) in two simulation studies. Results showed that, for unidimensional scales, (a) all indices except omega h performed similarly well for most conditions; (b) alpha is still good; (c) GLB and coefficient H overestimated reliability with small…
Descriptors: Test Theory, Test Reliability, Factor Analysis, Test Length
Peer reviewed
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
Peer reviewed
Sauder, Derek; DeMars, Christine – Applied Measurement in Education, 2020
We used simulation techniques to assess the item-level and familywise Type I error control and power of an IRT item-fit statistic, the S-X². Previous research indicated that the S-X² has good Type I error control and decent power, but no previous research examined familywise Type I error control.…
Descriptors: Item Response Theory, Test Items, Sample Size, Test Length
Peer reviewed
Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020
The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…
Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level
Peer reviewed
Chon, Kyong Hee; Lee, Won-Chan; Ansley, Timothy N. – Applied Measurement in Education, 2013
Empirical information regarding performance of model-fit procedures has been a persistent need in measurement practice. Statistical procedures for evaluating item fit were applied to real test examples that consist of both dichotomously and polytomously scored items. The item fit statistics used in this study included the PARSCALE's G²,…
Descriptors: Test Format, Test Items, Item Analysis, Goodness of Fit
Peer reviewed
Wells, Craig S.; Bolt, Daniel M. – Applied Measurement in Education, 2008
Tests of model misfit are often performed to validate the use of a particular model in item response theory. Douglas and Cohen (2001) introduced a general nonparametric approach for detecting misfit under the two-parameter logistic model. However, the statistical properties of their approach, and empirical comparisons to other methods, have not…
Descriptors: Test Length, Test Items, Monte Carlo Methods, Nonparametric Statistics
Peer reviewed
De Champlain, Andre; Gessaroli, Marc E. – Applied Measurement in Education, 1998
Type I error rates and rejection rates for three dimensionality-assessment procedures were studied with data sets simulated to reflect short tests and small samples. Results show that the G-squared difference test (D. Bock, R. Gibbons, and E. Muraki, 1988) suffered from a severely inflated Type I error rate in all conditions simulated. (SLD)
Descriptors: Item Response Theory, Matrices, Sample Size, Simulation
Peer reviewed
Wollack, James A. – Applied Measurement in Education, 2006
Many of the currently available statistical indexes to detect answer copying lack sufficient power at small α levels or when the amount of copying is relatively small. Furthermore, there is no one index that is uniformly best. Depending on the type or amount of copying, certain indexes are better than others. The purpose of this article was…
Descriptors: Statistical Analysis, Item Analysis, Test Length, Sample Size
Peer reviewed
Chuah, Siang Chee; Drasgow, Fritz; Luecht, Richard – Applied Measurement in Education, 2006
Adaptive tests offer the advantages of reduced test length and increased accuracy in ability estimation. However, adaptive tests require large pools of precalibrated items. This study looks at the development of an item pool for one type of adaptive administration: the computer-adaptive sequential test. An important issue is the sample size required…
Descriptors: Test Length, Sample Size, Adaptive Testing, Item Response Theory
Peer reviewed
Fitzpatrick, Anne R.; Yen, Wendy M. – Applied Measurement in Education, 2001
Examined the effects of test length and sample size on the alternate forms reliability and equating of simulated mathematics tests composed of constructed response items scaled using the two-parameter partial credit model. Results suggest that, to obtain acceptable reliabilities and accurate equated scores, tests should have at least eight 6-point…
Descriptors: Constructed Response, Equated Scores, Mathematics Tests, Reliability
Peer reviewed
Hambleton, Ronald K.; Jones, Russell W. – Applied Measurement in Education, 1994
The impact of capitalizing on chance in item selection on the accuracy of test information functions was studied through simulation, focusing on examinee sample size in item calibration and the ratio of item bank size to test length. (SLD)
Descriptors: Computer Simulation, Estimation (Mathematics), Item Banks, Item Response Theory