Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 8 |
Since 2006 (last 20 years) | 22 |
Descriptor
Item Response Theory | 27 |
Sample Size | 27 |
Sampling | 27 |
Simulation | 10 |
Test Items | 10 |
Error of Measurement | 9 |
Test Length | 7 |
Comparative Analysis | 6 |
Computation | 5 |
Data Analysis | 5 |
Difficulty Level | 5 |
More ▼ |
Source
Author
Abad, Francisco J. | 1 |
Arikan, Çigdem Akin | 1 |
Bergstrom, Betty A. | 1 |
Blaker, Lisa | 1 |
Burns, Laura J. | 1 |
Bush, M. Joan | 1 |
Chen, Hanwei | 1 |
Cohen, Allan S. | 1 |
Cui, Zhongmin | 1 |
Dever, Jill A. | 1 |
Diao, Hongyu | 1 |
More ▼ |
Publication Type
Education Level
Secondary Education | 3 |
Elementary Secondary Education | 2 |
Grade 8 | 2 |
Higher Education | 2 |
Junior High Schools | 2 |
Middle Schools | 2 |
Postsecondary Education | 2 |
Elementary Education | 1 |
High Schools | 1 |
Kindergarten | 1 |
Audience
Location
Laws, Policies, & Programs
Assessments and Surveys
Early Childhood Longitudinal… | 1 |
SAT (College Admission Test) | 1 |
Student Teacher Relationship… | 1 |
Trends in International… | 1 |
What Works Clearinghouse Rating
Paek, Insu; Liang, Xinya; Lin, Zhongtian – Measurement: Interdisciplinary Research and Perspectives, 2021
The property of item parameter invariance in item response theory (IRT) plays a pivotal role in the applications of IRT such as test equating. The scope of parameter invariance when using estimates from finite biased samples in the applications of IRT does not appear to be clearly documented in the IRT literature. This article provides information…
Descriptors: Item Response Theory, Computation, Test Items, Bias
Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
Diao, Hongyu; Keller, Lisa – Applied Measurement in Education, 2020
Examinees who attempt the same test multiple times are often referred to as "repeaters." Previous studies suggested that repeaters should be excluded from the total sample before equating because repeater groups are distinguishable from non-repeater groups. In addition, repeaters might memorize anchor items, causing item drift under a…
Descriptors: Licensing Examinations (Professions), College Entrance Examinations, Repetition, Testing Problems
Kim, Seohyun; Lu, Zhenqiu; Cohen, Allan S. – Measurement: Interdisciplinary Research and Perspectives, 2018
Bayesian algorithms have been used successfully in the social and behavioral sciences to analyze dichotomous data particularly with complex structural equation models. In this study, we investigate the use of the Polya-Gamma data augmentation method with Gibbs sampling to improve estimation of structural equation models with dichotomous variables.…
Descriptors: Bayesian Statistics, Structural Equation Models, Computation, Social Science Research
Soysal, Sümeyra; Arikan, Çigdem Akin; Inal, Hatice – Online Submission, 2016
This study aims to investigate the effect of methods to deal with missing data on item difficulty estimations under different test length conditions and sampling sizes. In this line, a data set including 10, 20 and 40 items with 100 and 5000 sampling size was prepared. Deletion process was applied at the rates of 5%, 10% and 20% under conditions…
Descriptors: Research Problems, Data Analysis, Item Response Theory, Test Items
Oshima, T. C.; Wright, Keith; White, Nick – International Journal of Testing, 2015
Raju, van der Linden, and Fleer (1995) introduced a framework for differential functioning of items and tests (DFIT) for unidimensional dichotomous models. Since then, DFIT has been shown to be a quite versatile framework as it can handle polytomous as well as multidimensional models both at the item and test levels. However, DFIT is still limited…
Descriptors: Test Bias, Item Response Theory, Test Items, Simulation
Wu, Yi-Fang – ProQuest LLC, 2015
Item response theory (IRT) uses a family of statistical models for estimating stable characteristics of items and examinees and defining how these characteristics interact in describing item and test performance. With a focus on the three-parameter logistic IRT (Birnbaum, 1968; Lord, 1980) model, the current study examines the accuracy and…
Descriptors: Item Response Theory, Test Items, Accuracy, Computation
Romero, Sonia J.; Ordoñez, Xavier G.; Ponsoda, Vincente; Revuelta, Javier – Psicologica: International Journal of Methodology and Experimental Psychology, 2014
Cognitive Diagnostic Models (CDMs) aim to provide information about the degree to which individuals have mastered specific attributes that underlie the success of these individuals on test items. The Q-matrix is a key element in the application of CDMs, because contains links item-attributes representing the cognitive structure proposed for solve…
Descriptors: Evaluation Methods, Q Methodology, Matrices, Sampling
Straat, J. Hendrik; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2014
An automated item selection procedure in Mokken scale analysis partitions a set of items into one or more Mokken scales, if the data allow. Two algorithms are available that pursue the same goal of selecting Mokken scales of maximum length: Mokken's original automated item selection procedure (AISP) and a genetic algorithm (GA). Minimum…
Descriptors: Sampling, Test Items, Effect Size, Scaling
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
Topczewski, Anna; Cui, Zhongmin; Woodruff, David; Chen, Hanwei; Fang, Yu – ACT, Inc., 2013
This paper investigates four methods of linear equating under the common item nonequivalent groups design. Three of the methods are well known: Tucker, Angoff-Levine, and Congeneric-Levine. A fourth method is presented as a variant of the Congeneric-Levine method. Using simulation data generated from the three-parameter logistic IRT model we…
Descriptors: Comparative Analysis, Equated Scores, Methods, Simulation
Skaggs, Gary; Wilkins, Jesse L. M.; Hein, Serge F. – International Journal of Testing, 2016
The purpose of this study was to explore the degree of grain size of the attributes and the sample sizes that can support accurate parameter recovery with the General Diagnostic Model (GDM) for a large-scale international assessment. In this resampling study, bootstrap samples were obtained from the 2003 Grade 8 TIMSS in Mathematics at varying…
Descriptors: Achievement Tests, Foreign Countries, Elementary Secondary Education, Science Achievement
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Sunnassee, Devdass – ProQuest LLC, 2011
Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically…
Descriptors: Test Length, Test Format, Sample Size, Simulation
Previous Page | Next Page »
Pages: 1 | 2