Publication Date
In 2025 (1)
Since 2024 (5)
Since 2021, last 5 years (14)
Since 2016, last 10 years (32)
Since 2006, last 20 years (74)
Descriptor
Item Response Theory (122)
Sampling (122)
Test Items (40)
Computation (27)
Sample Size (27)
Simulation (26)
Equated Scores (25)
Error of Measurement (23)
Evaluation Methods (21)
Comparative Analysis (20)
Monte Carlo Methods (20)
Author
Dorans, Neil J. (6)
Chang, Hua-Hua (3)
Meijer, Rob R. (3)
von Davier, Alina A. (3)
Baker, Frank B. (2)
Blais, Jean-Guy (2)
Chun Wang (2)
Donoghue, John R. (2)
Eignor, Daniel R. (2)
Gongjun Xu (2)
Hambleton, Ronald K. (2)
Education Level
Higher Education (9)
Secondary Education (9)
Elementary Secondary Education (7)
Elementary Education (5)
Grade 8 (4)
Junior High Schools (4)
Middle Schools (4)
Postsecondary Education (4)
High Schools (2)
Grade 4 (1)
Grade 6 (1)
Jiaying Xiao – ProQuest LLC, 2024
Multidimensional Item Response Theory (MIRT) has been widely used in educational and psychological assessments. It estimates multiple constructs simultaneously and models the correlations among them. Although MIRT yields more accurate results, unidimensional IRT models still dominate real applications. One major reason is that…
Descriptors: Item Response Theory, Algorithms, Computation, Efficiency
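The compensatory multidimensional 2PL model underlying the MIRT work above can be sketched in a few lines; the parameter values here are illustrative, not drawn from the study.

```python
import math

def mirt_prob(theta, a, d):
    """Compensatory multidimensional 2PL: P(correct) = logistic(a . theta + d),
    where theta is the latent-trait vector, a the discrimination vector,
    and d the item intercept."""
    z = sum(ak * tk for ak, tk in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))

# Two correlated latent traits; a and d are illustrative values.
p = mirt_prob(theta=[0.5, -0.2], a=[1.2, 0.8], d=0.3)
```

Each dimension contributes additively on the logit scale, which is what lets one trait compensate for another in this model family.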
Jiaying Xiao; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Accurate item parameters and standard errors (SEs) are crucial for many multidimensional item response theory (MIRT) applications. A recent study proposed the Gaussian Variational Expectation Maximization (GVEM) algorithm to improve computational efficiency and estimation accuracy (Cho et al., 2021). However, the SE estimation procedure has yet to…
Descriptors: Error of Measurement, Models, Evaluation Methods, Item Analysis
Paganin, Sally; Paciorek, Christopher J.; Wehrhahn, Claudia; Rodríguez, Abel; Rabe-Hesketh, Sophia; de Valpine, Perry – Journal of Educational and Behavioral Statistics, 2023
Item response theory (IRT) models typically rely on a normality assumption for subject-specific latent traits, which is often unrealistic in practice. Semiparametric extensions based on Dirichlet process mixtures (DPMs) offer a more flexible representation of the unknown distribution of the latent trait. However, the use of such models in the IRT…
Descriptors: Bayesian Statistics, Item Response Theory, Guidance, Evaluation Methods
Yunting Liu; Shreya Bhandari; Zachary A. Pardos – British Journal of Educational Technology, 2025
Effective educational measurement relies heavily on the curation of well-designed item pools. However, item calibration is time consuming and costly, requiring a sufficient number of respondents to estimate the psychometric properties of items. In this study, we explore the potential of six different large language models (LLMs; GPT-3.5, GPT-4,…
Descriptors: Artificial Intelligence, Test Items, Psychometrics, Educational Assessment
Combs, Adam – Journal of Educational Measurement, 2023
A common method of checking person-fit in Bayesian item response theory (IRT) is the posterior-predictive (PP) method. In recent years, more powerful approaches have been proposed that are based on resampling methods using the popular l*_z statistic. A new Bayesian model-checking method has also been proposed based on pivotal…
Descriptors: Bayesian Statistics, Goodness of Fit, Evaluation Methods, Monte Carlo Methods
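The classical standardized person-fit statistic that the resampling approaches above build on is l_z (Drasgow, Levine, and Williams, 1985); a minimal sketch for a 2PL model, with illustrative item parameters in the usage line:

```python
import math

def lz(responses, theta, a, b):
    """Standardized log-likelihood person-fit statistic l_z for a 2PL model.
    Large negative values flag aberrant response patterns."""
    l0 = mu = var = 0.0
    for u, ai, bi in zip(responses, a, b):
        p = 1.0 / (1.0 + math.exp(-ai * (theta - bi)))  # P(correct)
        q = 1.0 - p
        l0 += u * math.log(p) + (1 - u) * math.log(q)   # observed log-likelihood
        mu += p * math.log(p) + q * math.log(q)         # its expectation
        var += p * q * math.log(p / q) ** 2             # its variance
    return (l0 - mu) / math.sqrt(var)

# Missing easy items while passing hard ones yields a strongly negative l_z.
aberrant = lz([0, 0, 1, 1, 1], theta=0.0, a=[1.0] * 5, b=[-2, -1, 0, 1, 2])
```

The Bayesian and resampling variants discussed in the article refine the reference distribution of this statistic rather than its basic form.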
Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023
The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…
Descriptors: Item Response Theory, Standard Setting, Testing, Sampling
Weicong Lyu; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Data harmonization is an emerging approach to strategically combining data from multiple independent studies, making it possible to address new research questions that no single contributing study can answer. A fundamental psychometric challenge for data harmonization is to create commensurate measures for the constructs of interest across…
Descriptors: Data Analysis, Test Items, Psychometrics, Item Response Theory
Paek, Insu; Liang, Xinya; Lin, Zhongtian – Measurement: Interdisciplinary Research and Perspectives, 2021
The property of item parameter invariance in item response theory (IRT) plays a pivotal role in applications of IRT such as test equating. The scope of parameter invariance when estimates come from finite, biased samples does not appear to be clearly documented in the IRT literature. This article provides information…
Descriptors: Item Response Theory, Computation, Test Items, Bias
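Parameter invariance is what makes scale linking possible in equating. One standard way to place two calibrations on a common scale is the mean/sigma method; the sketch below is a generic illustration of that method, not the article's procedure, and the difficulty values are made up.

```python
import statistics

def mean_sigma_link(b_old, b_new):
    """Mean/sigma linking: find A, B such that b_old ≈ A * b_new + B,
    using the means and (population) standard deviations of the common
    items' difficulty estimates from the two calibrations."""
    A = statistics.pstdev(b_old) / statistics.pstdev(b_new)
    B = statistics.mean(b_old) - A * statistics.mean(b_new)
    return A, B

# Difficulties of common items from two separate calibrations (illustrative).
A, B = mean_sigma_link(b_old=[-1.0, 0.0, 1.0], b_new=[-0.8, 0.2, 1.2])
# A new-scale ability maps to the old scale as theta_old = A * theta_new + B.
```

When invariance holds only approximately (e.g., under the sampling bias the article studies), A and B absorb the scale shift but residual misfit remains in the linked parameters.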
Donoghue, John R.; McClellan, Catherine A.; Hess, Melinda R. – ETS Research Report Series, 2022
When constructed-response items are administered for a second time, it is necessary to evaluate whether the current Time B administration's raters have drifted from the scoring of the original administration at Time A. To study this, Time A papers are sampled and rescored by Time B scorers. Commonly the scores are compared using the proportion of…
Descriptors: Item Response Theory, Test Construction, Scoring, Testing
Cornesse, Carina; Blom, Annelies G. – Sociological Methods & Research, 2023
Recent years have seen a growing number of studies investigating the accuracy of nonprobability online panels; however, response quality in nonprobability online panels has not yet received much attention. To fill this gap, we investigate response quality in a comprehensive study of seven nonprobability online panels and three probability-based…
Descriptors: Probability, Sampling, Social Science Research, Research Methodology
Joo, Sean; Ali, Usama; Robin, Frederic; Shin, Hyo Jeong – Large-scale Assessments in Education, 2022
We investigated the potential impact of differential item functioning (DIF) on group-level mean and standard deviation estimates using empirical and simulated data in the context of large-scale assessment. For the empirical investigation, PISA 2018 cognitive domains (Reading, Mathematics, and Science) data were analyzed using Jackknife sampling to…
Descriptors: Test Items, Item Response Theory, Scores, Student Evaluation
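Jackknife sampling, as used in the PISA analysis above, estimates a statistic's sampling variability from leave-one-out replicates. This is a generic delete-one sketch, not the study's replicate-weight implementation:

```python
import math

def jackknife_se(values, stat):
    """Delete-one jackknife standard error of statistic `stat` over `values`."""
    n = len(values)
    replicates = [stat(values[:i] + values[i + 1:]) for i in range(n)]
    rbar = sum(replicates) / n
    # Jackknife variance inflates the replicate spread by (n - 1).
    return math.sqrt((n - 1) / n * sum((r - rbar) ** 2 for r in replicates))

# For the mean, the jackknife SE reproduces the classical s / sqrt(n).
se = jackknife_se([2.0, 4.0, 6.0, 8.0], lambda xs: sum(xs) / len(xs))
```

Large-scale assessments typically use grouped jackknife zones rather than single observations, but the variance formula has the same shape.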
Köhler, Carmen; Robitzsch, Alexander; Hartig, Johannes – Journal of Educational and Behavioral Statistics, 2020
Testing whether items fit the assumptions of an item response theory model is an important step in evaluating a test. In the literature, numerous item fit statistics exist, many of which show severe limitations. The current study investigates the root mean squared deviation (RMSD) item fit statistic, which is used for evaluating item fit in…
Descriptors: Test Items, Goodness of Fit, Statistics, Bias
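The RMSD item-fit statistic examined above compares an item's observed and model-implied response curves over a grid of ability values, weighted by the ability density. A minimal sketch, with an illustrative grid and weights:

```python
import math

def rmsd_item_fit(p_obs, p_model, weights):
    """Density-weighted root mean squared deviation between the observed and
    model-implied item characteristic curves evaluated on a theta grid."""
    total = sum(weights)
    return math.sqrt(sum(w * (po - pm) ** 2
                         for po, pm, w in zip(p_obs, p_model, weights)) / total)

# Observed vs. model proportions correct at three quadrature points (illustrative).
fit = rmsd_item_fit(p_obs=[0.20, 0.50, 0.80],
                    p_model=[0.25, 0.50, 0.75],
                    weights=[1.0, 2.0, 1.0])
```

A value of 0 indicates perfect fit; operational programs flag items above a fixed cutoff, and the article's question is how sample size and model bias distort this statistic.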
Kang, Hyeon-Ah; Zheng, Yi; Chang, Hua-Hua – Journal of Educational and Behavioral Statistics, 2020
With the widespread use of computers in modern assessment, online calibration has become increasingly popular as a way of replenishing an item pool. The present study discusses online calibration strategies for a joint model of responses and response times. The study proposes likelihood inference methods for item parameter estimation and evaluates…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Response Theory, Reaction Time
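A common response-time component for joint response/RT models of this kind is van der Linden's lognormal model; the density below is that standard model, shown as an illustration rather than the study's specific formulation, with made-up parameter values.

```python
import math

def lognormal_rt_density(t, alpha, beta, tau):
    """van der Linden's lognormal response-time model:
    log T ~ Normal(beta - tau, alpha**-2), where alpha is the item's time
    discrimination, beta its time intensity, and tau the person's speed."""
    z = alpha * (math.log(t) - (beta - tau))
    return (alpha / (t * math.sqrt(2 * math.pi))) * math.exp(-0.5 * z * z)

# Density of a 2.7-second response for an item with illustrative parameters.
d = lognormal_rt_density(t=2.7, alpha=2.0, beta=1.0, tau=0.0)
```

In joint models, this RT likelihood is multiplied with the IRT response likelihood, so response times contribute extra information during online item calibration.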
Burns, Erin; Zolnik, Edmund; Yanez, Christina; Mann, Rebecca – National Center for Education Statistics, 2022
The National Crime Victimization Survey (NCVS) is the nation's primary source of information on the nature of criminal victimization. The NCVS collects data each year from a nationally representative sample of households on the frequency, characteristics, and consequences of criminal victimization in the United States. Currently, the NCVS includes four supplemental surveys that are…
Descriptors: National Surveys, Bullying, Crime, Victims of Crime
van der Linden, Wim J.; Ren, Hao – Journal of Educational and Behavioral Statistics, 2020
The Bayesian way of accounting for the effects of error in the ability and item parameters in adaptive testing is through the joint posterior distribution of all parameters. An optimized Markov chain Monte Carlo algorithm for adaptive testing is presented, which samples this distribution in real time to score the examinee's ability and optimally…
Descriptors: Bayesian Statistics, Adaptive Testing, Error of Measurement, Markov Processes
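The real-time posterior sampling idea above can be illustrated with a bare-bones random-walk Metropolis sampler for the ability posterior under a 2PL model with a standard-normal prior. This is a generic sketch, not the authors' optimized algorithm, and all parameter values are illustrative.

```python
import math
import random

def log_post(theta, responses, a, b):
    """Log posterior of ability under a 2PL model with a N(0, 1) prior."""
    lp = -0.5 * theta ** 2  # log prior up to a constant
    for u, ai, bi in zip(responses, a, b):
        p = 1.0 / (1.0 + math.exp(-ai * (theta - bi)))
        lp += u * math.log(p) + (1 - u) * math.log(1.0 - p)
    return lp

def metropolis_theta(responses, a, b, n_iter=2000, step=0.5, seed=1):
    """Random-walk Metropolis sampling of the ability posterior."""
    rng = random.Random(seed)
    theta, draws = 0.0, []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)
        # Accept with probability min(1, posterior ratio).
        if math.log(rng.random()) < log_post(prop, responses, a, b) - \
                                    log_post(theta, responses, a, b):
            theta = prop
        draws.append(theta)
    return draws

draws = metropolis_theta([1, 1, 1, 0, 0], a=[1.0] * 5, b=[-2, -1, 0, 1, 2])
est = sum(draws[500:]) / len(draws[500:])  # posterior mean after burn-in
```

A scoring approach like this propagates item-parameter and ability uncertainty jointly, which is the point of the optimized real-time sampler the article presents.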