| Publication Date | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 91 |
| Since 2022 (last 5 years) | 684 |
| Since 2017 (last 10 years) | 1822 |
| Since 2007 (last 20 years) | 4031 |
| Descriptor | Records |
| --- | --- |
| Item Response Theory | 5652 |
| Test Items | 1867 |
| Foreign Countries | 1230 |
| Models | 1168 |
| Psychometrics | 936 |
| Scores | 799 |
| Comparative Analysis | 769 |
| Test Construction | 765 |
| Simulation | 750 |
| Statistical Analysis | 663 |
| Difficulty Level | 584 |
| Author | Records |
| --- | --- |
| Sinharay, Sandip | 48 |
| Wilson, Mark | 45 |
| Cohen, Allan S. | 43 |
| Meijer, Rob R. | 43 |
| Tindal, Gerald | 42 |
| Wang, Wen-Chung | 40 |
| Alonzo, Julie | 37 |
| Ferrando, Pere J. | 36 |
| Cai, Li | 35 |
| van der Linden, Wim J. | 35 |
| Glas, Cees A. W. | 34 |
| Location | Records |
| --- | --- |
| Turkey | 96 |
| Australia | 90 |
| Germany | 80 |
| United States | 76 |
| Netherlands | 69 |
| Indonesia | 62 |
| Taiwan | 60 |
| China | 53 |
| Canada | 51 |
| Japan | 41 |
| Hong Kong | 39 |
| What Works Clearinghouse Rating | Records |
| --- | --- |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
Kelsey Nason; Christine DeMars – Journal of Educational Measurement, 2025
This study examined the widely used threshold of 0.2 for Yen's Q3, an index used to detect violations of local independence. Specifically, a simulation was conducted to investigate whether Q3 values were related to the magnitude of bias in estimates of reliability, item parameters, and examinee ability. Results showed that Q3 values below the typical cut-off…
Descriptors: Item Response Theory, Statistical Bias, Test Reliability, Test Items
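For context, a minimal sketch of how Yen's Q3 is conventionally computed: fit a unidimensional model, take each person-by-item residual, and correlate the residuals across item pairs. The Rasch-based code below is illustrative only (not from the study), with simulated data standing in for fitted estimates.

```python
import numpy as np

def rasch_prob(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def yen_q3(responses, theta_hat, b_hat):
    """Yen's Q3: correlate model residuals between every item pair.

    responses : (n_persons, n_items) 0/1 matrix
    theta_hat : (n_persons,) ability estimates
    b_hat     : (n_items,) difficulty estimates
    """
    expected = rasch_prob(theta_hat[:, None], b_hat[None, :])
    residuals = responses - expected
    return np.corrcoef(residuals, rowvar=False)  # (n_items, n_items) matrix

# Simulated data stand in for fitted estimates in this sketch.
rng = np.random.default_rng(0)
theta, b = rng.normal(size=500), rng.normal(size=10)
data = (rng.random((500, 10)) < rasch_prob(theta[:, None], b[None, :])).astype(float)
q3 = yen_q3(data, theta, b)
# Item pairs exceeding the 0.2 threshold the study examines:
flagged = [(j, k) for j in range(10) for k in range(j + 1, 10) if abs(q3[j, k]) > 0.2]
```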
Hanke Vermeiren; Abe D. Hofman; Maria Bolsinova – International Educational Data Mining Society, 2025
The traditional Elo rating system (ERS), widely used as a student model in adaptive learning systems, assumes unidimensionality (i.e., all items measure a single ability or skill), limiting its ability to handle multidimensional data common in educational contexts. In response, several multidimensional extensions of the Elo rating system have been…
Descriptors: Item Response Theory, Models, Comparative Analysis, Algorithms
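For readers new to the ERS, a minimal sketch of the unidimensional update it performs after each response (constants are illustrative, not from the paper):

```python
import math

def elo_update(theta, b, correct, k=0.4):
    """One unidimensional Elo rating update after a single response.

    theta   : student ability rating
    b       : item difficulty rating
    correct : 1 if the response was correct, else 0
    k       : step size (assumed constant here; many systems decay it)
    """
    p = 1.0 / (1.0 + math.exp(-(theta - b)))  # expected P(correct)
    theta += k * (correct - p)                # ability moves toward the evidence
    b -= k * (correct - p)                    # difficulty moves the opposite way
    return theta, b

theta, b = 0.0, 0.0
for y in [1, 1, 0, 1]:
    theta, b = elo_update(theta, b, y)
```

The assumption the paper targets is visible here: a single `theta` must absorb evidence from every item, regardless of which skill the item measures.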
Joshua B. Gilbert; Zachary Himmelsbach; James Soland; Mridul Joshi; Benjamin W. Domingue – Journal of Policy Analysis and Management, 2025
Analyses of heterogeneous treatment effects (HTE) are common in applied causal inference research. However, when outcomes are latent variables assessed via psychometric instruments such as educational tests, standard methods ignore the potential HTE that may exist among the individual items of the outcome measure. Failing to account for…
Descriptors: Item Response Theory, Test Items, Error of Measurement, Scores
Jörg-Henrik Heine; Moritz Heene – Measurement: Interdisciplinary Research and Perspectives, 2025
This paper critically evaluates the quantification of psychological attributes through metric measurement. Drawing on epistemological considerations by Immanuel Kant, the development of measurement theory in the natural and social sciences is outlined. This includes an examination of Fechner's psychophysical law and the fundamental criticism…
Descriptors: Measurement, Scaling, Psychological Testing, Psychological Characteristics
Jianbin Fu; Xuan Tan; Patrick C. Kyllonen – Applied Measurement in Education, 2024
A process is proposed to create the one-dimensional expected item characteristic curve (ICC) and test characteristic curve (TCC) for each trait in multidimensional forced-choice questionnaires based on the Rank-2PL (two-parameter logistic) item response theory models for forced-choice items with two or three statements. Some examples of ICC and…
Descriptors: Item Response Theory, Questionnaires, Measurement Techniques, Statistics
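The building blocks are standard: a 2PL ICC gives an item's expected score as a function of the trait, and the TCC is the sum of ICCs. The sketch below shows the conventional unidimensional versions for intuition only; the Rank-2PL forced-choice formulation itself is more involved and is not reproduced here.

```python
import numpy as np

def icc_2pl(theta, a, b):
    """2PL item characteristic curve: P(correct | theta)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def tcc(theta, items):
    """Test characteristic curve: expected total score = sum of ICCs."""
    return sum(icc_2pl(theta, a, b) for a, b in items)

theta = np.linspace(-4, 4, 81)
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 1.0)]  # hypothetical (a, b) pairs
expected_total = tcc(theta, items)
```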
Stefanie A. Wind; Benjamin Lugu; Yurou Wang – International Journal of Testing, 2025
Mokken Scale Analysis (MSA) is a nonparametric approach that offers exploratory tools for understanding the nature of item responses while emphasizing invariance requirements. MSA is often discussed as it relates to Rasch measurement theory, which also emphasizes invariance, but uses parametric models. Researchers who have compared and combined…
Descriptors: Item Response Theory, Scaling, Surveys, Evaluation Methods
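MSA's invariance emphasis is operationalized through scalability coefficients. A rough sketch of the scale-level Loevinger H (observed versus expected Guttman errors), simplified relative to what MSA software reports:

```python
import numpy as np

def scalability_H(X):
    """Scale-level Loevinger/Mokken H for 0/1 response data.

    X : (n_persons, n_items) binary matrix.
    H = 1 - (observed Guttman errors) / (errors expected under independence),
    pooled over all item pairs.
    """
    n, m = X.shape
    p = X.mean(axis=0)  # item popularity (proportion correct)
    obs, exp = 0.0, 0.0
    for j in range(m):
        for k in range(j + 1, m):
            easy, hard = (j, k) if p[j] >= p[k] else (k, j)
            # Guttman error: pass the harder item but fail the easier one
            obs += np.sum((X[:, easy] == 0) & (X[:, hard] == 1))
            exp += n * (1 - p[easy]) * p[hard]
    return 1.0 - obs / exp
```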
Qi Huang; Daniel M. Bolt; Xiangyi Liao – Journal of Educational Measurement, 2025
Item response theory (IRT) encompasses a broader class of measurement models than is commonly appreciated by practitioners in educational measurement. For measures of vocabulary and its development, we show how psychological theory might in certain instances support unipolar IRT modeling as a superior alternative to the more traditional bipolar…
Descriptors: Educational Theories, Item Response Theory, Vocabulary Development, Models
Ken A. Fujimoto; Carl F. Falk – Educational and Psychological Measurement, 2024
Item response theory (IRT) models are often compared with respect to predictive performance to determine the dimensionality of rating scale data. However, such model comparisons could be biased toward nested-dimensionality IRT models (e.g., the bifactor model) when comparing those models with non-nested-dimensionality IRT models (e.g., a…
Descriptors: Item Response Theory, Rating Scales, Predictive Measurement, Bayesian Statistics
Junhuan Wei; Qin Wang; Buyun Dai; Yan Cai; Dongbo Tu – Journal of Educational Measurement, 2024
Traditional IRT and IRTree models are not appropriate for analyzing items that combine a multiple-choice (MC) task and a constructed-response (CR) task within a single item. To address this issue, this study proposed an item response tree model (called IRTree-MR) to accommodate items that contain different response types at different…
Descriptors: Item Response Theory, Models, Multiple Choice Tests, Cognitive Processes
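The general IRTree idea can be sketched generically: an observed response is a path through a tree, and its probability is the product of node-level model probabilities. The two-node example below is hypothetical and is not the IRTree-MR specification itself.

```python
import math

def node_prob(theta, a, b):
    """2PL branch probability at one internal node of the tree."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def path_prob(theta, nodes, path):
    """Probability of one response path through a binary IRTree.

    nodes : (a, b) parameters for each internal node along the path
    path  : 0/1 branch taken at each node
    """
    prob = 1.0
    for (a, b), branch in zip(nodes, path):
        p = node_prob(theta, a, b)
        prob *= p if branch == 1 else 1.0 - p
    return prob

# Hypothetical item: node 1 = MC task solved?, node 2 = CR task solved?
nodes = [(1.2, 0.0), (0.9, 0.5)]
p_both_correct = path_prob(theta=0.3, nodes=nodes, path=[1, 1])
```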
Jianbin Fu; TsungHan Ho; Xuan Tan – Practical Assessment, Research & Evaluation, 2025
Item parameter estimation using an item response theory (IRT) model with fixed ability estimates is useful for equating with small samples on anchor items. The current study explores the impact of three ability estimation methods (weighted likelihood estimation [WLE], maximum a posteriori [MAP], and posterior ability distribution estimation [PST])…
Descriptors: Item Response Theory, Test Items, Computation, Equated Scores
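Of the three methods, MAP is the easiest to sketch: maximize the response log-likelihood plus a log-prior over ability. The grid-search code below is an illustration under a 2PL model and a N(0,1) prior, not the study's implementation.

```python
import numpy as np

def map_theta(responses, a, b, grid=np.linspace(-4, 4, 401)):
    """MAP ability estimate under a 2PL model and a standard normal prior.

    responses : (n_items,) 0/1 vector
    a, b      : (n_items,) item discriminations and difficulties
    Returns the grid point maximizing log-likelihood + log-prior.
    """
    p = 1.0 / (1.0 + np.exp(-a[None, :] * (grid[:, None] - b[None, :])))
    loglik = (responses * np.log(p) + (1 - responses) * np.log(1 - p)).sum(axis=1)
    logpost = loglik - 0.5 * grid**2  # N(0,1) log-prior, up to a constant
    return grid[np.argmax(logpost)]

theta_hat = map_theta(np.array([1, 0, 1, 1]),
                      np.array([1.0, 1.2, 0.8, 1.5]),
                      np.array([-0.5, 0.0, 0.5, 1.0]))
```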
Bogdan Yamkovenko; Charlie A. R. Hogg; Maya Miller-Vedam; Phillip Grimaldi; Walt Wells – International Educational Data Mining Society, 2025
Knowledge tracing (KT) models predict how students will perform on future interactions, given a sequence of prior responses. Modern approaches to KT leverage "deep learning" techniques to produce more accurate predictions, potentially making personalized learning paths more efficacious for learners. Many papers on the topic of KT focus…
Descriptors: Algorithms, Artificial Intelligence, Models, Prediction
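As a point of reference for the task KT models solve, the classic (non-deep) Bayesian Knowledge Tracing baseline fits in a few lines; the parameters here are illustrative.

```python
def bkt_step(p_know, correct, p_learn=0.1, p_slip=0.1, p_guess=0.2):
    """One Bayesian Knowledge Tracing update, a classic non-deep KT baseline.

    p_know  : prior probability the student knows the skill
    correct : observed response (1/0)
    Returns (posterior knowledge after a learning opportunity,
             predicted P(correct) before seeing the response).
    """
    p_correct = p_know * (1 - p_slip) + (1 - p_know) * p_guess
    if correct:
        post = p_know * (1 - p_slip) / p_correct
    else:
        post = p_know * p_slip / (1 - p_correct)
    return post + (1 - post) * p_learn, p_correct

p_know = 0.3
for y in [0, 1, 1, 1]:
    p_know, pred = bkt_step(p_know, y)
```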
Wind, Stefanie A. – Educational and Psychological Measurement, 2023
Rating scale analysis techniques provide researchers with practical tools for examining the degree to which ordinal rating scales (e.g., Likert-type scales or performance assessment rating scales) function in psychometrically useful ways. When rating scales function as expected, researchers can interpret ratings in the intended direction (i.e.,…
Descriptors: Rating Scales, Testing Problems, Item Response Theory, Models
Zeyuan Jing – ProQuest LLC, 2023
This dissertation presents a comprehensive review of the evolution of DIF analysis within educational measurement from the 1980s to the present. The review elucidates the concept of DIF, particularly emphasizing the crucial role of grouping for exhibiting DIF. Then, the dissertation introduces an innovative modification to the newly developed…
Descriptors: Item Response Theory, Algorithms, Measurement, Test Bias
Yun-Kyung Kim; Li Cai – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2025
This paper introduces an application of cross-classified item response theory (IRT) modeling to an assessment utilizing the embedded standard setting (ESS) method (Lewis & Cook). The cross-classified IRT model is used to treat both item and person effects as random, where the item effects are regressed on the target performance levels (target…
Descriptors: Standard Setting (Scoring), Item Response Theory, Test Items, Difficulty Level
Ye Ma; Deborah J. Harris – Educational Measurement: Issues and Practice, 2025
Item position effect (IPE) refers to situations where an item performs differently when it is administered in different positions on a test. The majority of previous research studies have focused on investigating IPE under linear testing. There is a lack of IPE research under adaptive testing. In addition, the existence of IPE might violate Item…
Descriptors: Computer Assisted Testing, Adaptive Testing, Item Response Theory, Test Items
