Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 3 |
Since 2006 (last 20 years) | 6 |
Descriptor
Test Items | 9 |
Test Length | 9 |
Item Response Theory | 6 |
Sample Size | 5 |
Error of Measurement | 4 |
Goodness of Fit | 3 |
Test Format | 3 |
Estimation (Mathematics) | 2 |
Item Analysis | 2 |
Item Banks | 2 |
Monte Carlo Methods | 2 |
More ▼ |
Source
Applied Measurement in… | 9 |
Author
Ansley, Timothy N. | 1 |
Bergstrom, Betty A. | 1 |
Bolt, Daniel M. | 1 |
Chon, Kyong Hee | 1 |
DeMars, Christine | 1 |
Hambleton, Ronald K. | 1 |
Jones, Russell W. | 1 |
Kannan, Priya | 1 |
Katz, Irvin R. | 1 |
Lee, HyeSun | 1 |
Lee, Won-Chan | 1 |
More ▼ |
Publication Type
Journal Articles | 9 |
Reports - Research | 8 |
Reports - Evaluative | 1 |
Speeches/Meeting Papers | 1 |
Education Level
Elementary Secondary Education | 1 |
Audience
Location
Iran | 1 |
Laws, Policies, & Programs
Assessments and Surveys
Iowa Tests of Basic Skills | 1 |
Trends in International… | 1 |
What Works Clearinghouse Rating
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
Sauder, Derek; DeMars, Christine – Applied Measurement in Education, 2020
We used simulation techniques to assess the item-level and familywise Type I error control and power of an IRT item-fit statistic, the "S-X"[superscript 2]. Previous research indicated that the "S-X"[superscript 2] has good Type I error control and decent power, but no previous research examined familywise Type I error control.…
Descriptors: Item Response Theory, Test Items, Sample Size, Test Length
Lee, HyeSun – Applied Measurement in Education, 2018
The current simulation study examined the effects of Item Parameter Drift (IPD) occurring in a short scale on parameter estimates in multilevel models where scores from a scale were employed as a time-varying predictor to account for outcome scores. Five factors, including three decisions about IPD, were considered for simulation conditions. It…
Descriptors: Test Items, Hierarchical Linear Modeling, Predictor Variables, Scores
Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015
The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…
Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items
Chon, Kyong Hee; Lee, Won-Chan; Ansley, Timothy N. – Applied Measurement in Education, 2013
Empirical information regarding performance of model-fit procedures has been a persistent need in measurement practice. Statistical procedures for evaluating item fit were applied to real test examples that consist of both dichotomously and polytomously scored items. The item fit statistics used in this study included the PARSCALE's G[squared],…
Descriptors: Test Format, Test Items, Item Analysis, Goodness of Fit
Wells, Craig S.; Bolt, Daniel M. – Applied Measurement in Education, 2008
Tests of model misfit are often performed to validate the use of a particular model in item response theory. Douglas and Cohen (2001) introduced a general nonparametric approach for detecting misfit under the two-parameter logistic model. However, the statistical properties of their approach, and empirical comparisons to other methods, have not…
Descriptors: Test Length, Test Items, Monte Carlo Methods, Nonparametric Statistics

Qualls, Audrey L. – Applied Measurement in Education, 1995
Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)
Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format

Hambleton, Ronald K.; Jones, Russell W. – Applied Measurement in Education, 1994
The impact of capitalizing on chance in item selection on the accuracy of test information functions was studied through simulation, focusing on examinee sample size in item calibration and the ratio of item bank size to test length. (SLD)
Descriptors: Computer Simulation, Estimation (Mathematics), Item Banks, Item Response Theory

Bergstrom, Betty A.; And Others – Applied Measurement in Education, 1992
Effects of altering test difficulty on examinee ability measures and test length in a computer adaptive test were studied for 225 medical technology students in 3 test difficulty conditions. Results suggest that, with an item pool of sufficient depth and breadth, acceptable targeting to test difficulty is possible. (SLD)
Descriptors: Ability, Adaptive Testing, Change, College Students