Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 8 |
Since 2016 (last 10 years) | 15 |
Since 2006 (last 20 years) | 35 |
Descriptor
Sample Size | 44 |
Simulation | 44 |
Test Length | 44 |
Item Response Theory | 33 |
Test Items | 20 |
Comparative Analysis | 10 |
Correlation | 10 |
Models | 10 |
Goodness of Fit | 9 |
Ability | 7 |
Error of Measurement | 7 |
More ▼ |
Source
Author
Publication Type
Journal Articles | 30 |
Reports - Research | 27 |
Reports - Evaluative | 10 |
Dissertations/Theses -… | 7 |
Speeches/Meeting Papers | 7 |
Education Level
Higher Education | 2 |
Postsecondary Education | 2 |
Audience
Location
Turkey | 1 |
Laws, Policies, & Programs
Assessments and Surveys
Law School Admission Test | 3 |
SAT (College Admission Test) | 1 |
Trends in International… | 1 |
What Works Clearinghouse Rating
Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023
A Monte Carlo simulation study was conducted to examine the performance of [alpha], [lambda]2, [lambda][subscript 4], [lambda][subscript 2], [omega][subscript T], GLB[subscript MRFA], and GLB[subscript Algebraic] coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…
Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation
Xiao, Leifeng; Hau, Kit-Tai – Applied Measurement in Education, 2023
We compared coefficient alpha with five alternatives (omega total, omega RT, omega h, GLB, and coefficient H) in two simulation studies. Results showed for unidimensional scales, (a) all indices except omega h performed similarly well for most conditions; (b) alpha is still good; (c) GLB and coefficient H overestimated reliability with small…
Descriptors: Test Theory, Test Reliability, Factor Analysis, Test Length
Chenchen Ma; Jing Ouyang; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Survey instruments and assessments are frequently used in many domains of social science. When the constructs that these assessments try to measure become multifaceted, multidimensional item response theory (MIRT) provides a unified framework and convenient statistical tool for item analysis, calibration, and scoring. However, the computational…
Descriptors: Algorithms, Item Response Theory, Scoring, Accuracy
Paek, Insu; Lin, Zhongtian; Chalmers, Robert Philip – Educational and Psychological Measurement, 2023
To reduce the chance of Heywood cases or nonconvergence in estimating the 2PL or the 3PL model in the marginal maximum likelihood with the expectation-maximization (MML-EM) estimation method, priors for the item slope parameter in the 2PL model or for the pseudo-guessing parameter in the 3PL model can be used and the marginal maximum a posteriori…
Descriptors: Models, Item Response Theory, Test Items, Intervals
Fu, Yanyan; Strachan, Tyler; Ip, Edward H.; Willse, John T.; Chen, Shyh-Huei; Ackerman, Terry – International Journal of Testing, 2020
This research examined correlation estimates between latent abilities when using the two-dimensional and three-dimensional compensatory and noncompensatory item response theory models. Simulation study results showed that the recovery of the latent correlation was best when the test contained 100% of simple structure items for all models and…
Descriptors: Item Response Theory, Models, Test Items, Simulation
Derek Sauder – ProQuest LLC, 2020
The Rasch model is commonly used to calibrate multiple choice items. However, the sample sizes needed to estimate the Rasch model can be difficult to attain (e.g., consider a small testing company trying to pretest new items). With small sample sizes, auxiliary information besides the item responses may improve estimation of the item parameters.…
Descriptors: Item Response Theory, Sample Size, Computation, Test Length
Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023
Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for the adaptive tests with medium difficulty (probability of solution is p = 1/2) to determine the accuracy of the parameters for each item and in…
Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level
Kiliç, Abdullah Faruk; Uysal, Ibrahim – Turkish Journal of Education, 2019
In this study, the purpose is to compare factor retention methods under simulation conditions. For this purpose, simulations conditions with a number of factors (1, 2 [simple]), sample sizes (250, 1.000, and 3.000), number of items (20, 30), average factor loading (0.50, 0.70), and correlation matrix (Pearson Product Moment [PPM] and Tetrachoric)…
Descriptors: Simulation, Factor Structure, Sample Size, Test Length
Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022
Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…
Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations
Ziying Li; A. Corinne Huggins-Manley; Walter L. Leite; M. David Miller; Eric A. Wright – Educational and Psychological Measurement, 2022
The unstructured multiple-attempt (MA) item response data in virtual learning environments (VLEs) are often from student-selected assessment data sets, which include missing data, single-attempt responses, multiple-attempt responses, and unknown growth ability across attempts, leading to a complex and complicated scenario for using this kind of…
Descriptors: Sequential Approach, Item Response Theory, Data, Simulation
Köse, Alper; Dogan, C. Deha – International Journal of Evaluation and Research in Education, 2019
The aim of this study was to examine the precision of item parameter estimation in different sample sizes and test lengths under three parameter logistic model (3PL) item response theory (IRT) model, where the trait measured by a test was not normally distributed or had a skewed distribution. In the study, number of categories (1-0), and item…
Descriptors: Statistical Bias, Item Response Theory, Simulation, Accuracy
Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
Ames, Allison J.; Leventhal, Brian C.; Ezike, Nnamdi C. – Measurement: Interdisciplinary Research and Perspectives, 2020
Data simulation and Monte Carlo simulation studies are important skills for researchers and practitioners of educational and psychological measurement, but there are few resources on the topic specific to item response theory. Even fewer resources exist on the statistical software techniques to implement simulation studies. This article presents…
Descriptors: Monte Carlo Methods, Item Response Theory, Simulation, Computer Software
Andersson, Björn – Journal of Educational Measurement, 2016
In observed-score equipercentile equating, the goal is to make scores on two scales or tests measuring the same construct comparable by matching the percentiles of the respective score distributions. If the tests consist of different items with multiple categories for each item, a suitable model for the responses is a polytomous item response…
Descriptors: Equated Scores, Item Response Theory, Error of Measurement, Tests
Tay, Louis; Huang, Qiming; Vermunt, Jeroen K. – Educational and Psychological Measurement, 2016
In large-scale testing, the use of multigroup approaches is limited for assessing differential item functioning (DIF) across multiple variables as DIF is examined for each variable separately. In contrast, the item response theory with covariate (IRT-C) procedure can be used to examine DIF across multiple variables (covariates) simultaneously. To…
Descriptors: Item Response Theory, Test Bias, Simulation, College Entrance Examinations