Xiangyi Liao; Daniel M. Bolt; Jee-Seon Kim – Journal of Educational Measurement, 2024
Item difficulty and dimensionality often correlate, implying that unidimensional IRT approximations to multidimensional data (i.e., reference composites) can take a curvilinear form in the multidimensional space. Although this issue has been previously discussed in the context of vertical scaling applications, we illustrate how such a phenomenon…
Descriptors: Difficulty Level, Simulation, Multidimensional Scaling, Graphs
Junhuan Wei; Qin Wang; Buyun Dai; Yan Cai; Dongbo Tu – Journal of Educational Measurement, 2024
Traditional IRT and IRTree models are not appropriate for analyzing an item that combines a multiple-choice (MC) task and a constructed-response (CR) task. To address this issue, this study proposed an item response tree model (called IRTree-MR) to accommodate items that contain different response types at different…
Descriptors: Item Response Theory, Models, Multiple Choice Tests, Cognitive Processes
Becker, Benjamin; Weirich, Sebastian; Goldhammer, Frank; Debeer, Dries – Journal of Educational Measurement, 2023
When designing or modifying a test, an important challenge is controlling its speededness. To achieve this, van der Linden (2011a, 2011b) proposed using a lognormal response time model, more specifically the two-parameter lognormal model, and automated test assembly (ATA) via mixed integer linear programming. However, this approach has a severe…
Descriptors: Test Construction, Automation, Models, Test Items
Tae Yeon Kwon; A. Corinne Huggins-Manley; Jonathan Templin; Mingying Zheng – Journal of Educational Measurement, 2024
In classroom assessments, examinees can often answer test items multiple times, resulting in sequential multiple-attempt data. Sequential diagnostic classification models (DCMs) have been developed for such data. As student learning processes may be aligned with a hierarchy of measured traits, this study aimed to develop a sequential hierarchical…
Descriptors: Classification, Accuracy, Student Evaluation, Sequential Approach
Jihong Zhang; Jonathan Templin; Xinya Liang – Journal of Educational Measurement, 2024
Recently, Bayesian diagnostic classification modeling has become popular in health psychology, education, and sociology. Typically, information criteria are used for model selection when researchers want to choose the best model among alternatives. In Bayesian estimation, posterior predictive checking is a flexible Bayesian model…
Descriptors: Bayesian Statistics, Cognitive Measurement, Models, Classification
Kim, Rae Yeong; Yoo, Yun Joo – Journal of Educational Measurement, 2023
In cognitive diagnostic models (CDMs), a set of fine-grained attributes is required to characterize complex problem solving and provide detailed diagnostic information about an examinee. However, it is challenging to ensure reliable estimation and control computational complexity when the test aims to identify the examinee's attribute profile in a…
Descriptors: Models, Diagnostic Tests, Adaptive Testing, Accuracy
Kasli, Murat; Zopluoglu, Cengiz; Toton, Sarah L. – Journal of Educational Measurement, 2023
Response times (RTs) have recently attracted a significant amount of attention in the literature as they may provide meaningful information about item preknowledge. In this study, a new model, the Deterministic Gated Lognormal Response Time (DG-LNRT) model, is proposed to identify examinees with item preknowledge using RTs. The proposed model was…
Descriptors: Reaction Time, Test Items, Models, Familiarity
Jianbin Fu; Xuan Tan; Patrick C. Kyllonen – Journal of Educational Measurement, 2024
This paper presents the item and test information functions of the Rank two-parameter logistic models (Rank-2PLM) for items with two (pair) and three (triplet) statements in forced-choice questionnaires. The Rank-2PLM model for pairs is the MUPP-2PLM (Multi-Unidimensional Pairwise Preference) and, for triplets, is the Triplet-2PLM. Fisher's…
Descriptors: Questionnaires, Test Items, Item Response Theory, Models
Sooyong Lee; Suhwa Han; Seung W. Choi – Journal of Educational Measurement, 2024
Research has shown that multiple-indicator multiple-cause (MIMIC) models can result in inflated Type I error rates in detecting differential item functioning (DIF) when the assumption of equal latent variance is violated. This study explains how the violation of the equal variance assumption adversely impacts the detection of nonuniform DIF and…
Descriptors: Factor Analysis, Bayesian Statistics, Test Bias, Item Response Theory
Wenchao Ma; Miguel A. Sorrel; Xiaoming Zhai; Yuan Ge – Journal of Educational Measurement, 2024
Most existing diagnostic models are developed to detect whether students have mastered a set of skills of interest, but few have focused on identifying what scientific misconceptions students possess. This article developed a general dual-purpose model for simultaneously estimating students' overall ability and the presence and absence of…
Descriptors: Models, Misconceptions, Diagnostic Tests, Ability
Gorney, Kylie; Wollack, James A. – Journal of Educational Measurement, 2022
Detection methods for item preknowledge are often evaluated in simulation studies where models are used to generate the data. To ensure the reliability of such methods, it is crucial that these models are able to accurately represent situations that are encountered in practice. The purpose of this article is to provide a critical analysis of…
Descriptors: Prior Learning, Simulation, Models, Reaction Time
Gorney, Kylie; Wollack, James A. – Journal of Educational Measurement, 2023
In order to detect a wide range of aberrant behaviors, it can be useful to incorporate information beyond the dichotomous item scores. In this paper, we extend the l_z and l*_z person-fit statistics so that unusual behavior in item scores and unusual behavior in item distractors can be used as indicators of aberrance. Through…
Descriptors: Test Items, Scores, Goodness of Fit, Statistics
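For orientation, the classic l_z person-fit statistic that the entry above builds on can be sketched as follows. This is a minimal illustration of the standardized log-likelihood for dichotomous item scores only, assuming the correct-response probabilities are already known from some fitted IRT model. It does not implement the l*_z correction or the distractor-based extension the abstract describes, and the function and variable names are hypothetical.

```python
import math

def lz_statistic(scores, probs):
    """Standardized log-likelihood person-fit statistic for 0/1 item scores.

    scores: list of 0/1 item scores for one examinee
    probs:  model-implied probabilities of a correct response (each in (0, 1)),
            assumed here to be given rather than estimated
    """
    # Observed log-likelihood of the response pattern
    l0 = sum(u * math.log(p) + (1 - u) * math.log(1 - p)
             for u, p in zip(scores, probs))
    # Expected value of the log-likelihood under the model
    expected = sum(p * math.log(p) + (1 - p) * math.log(1 - p) for p in probs)
    # Variance of the log-likelihood under the model
    variance = sum(p * (1 - p) * math.log(p / (1 - p)) ** 2 for p in probs)
    return (l0 - expected) / math.sqrt(variance)

# Hypothetical probabilities for three items, easiest first
probs = [0.8, 0.6, 0.3]
consistent = lz_statistic([1, 1, 0], probs)   # pattern agrees with item difficulty
aberrant = lz_statistic([0, 0, 1], probs)     # pattern contradicts item difficulty
```

Large negative values indicate response patterns that are unlikely under the model. The variance is zero when every probability equals 0.5, so the sketch assumes at least one item departs from that value.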
Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025
Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies induced among items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…
Descriptors: Models, Test Items, Educational Assessment, Scores
Gregory M. Hurtz; Regi Mucino – Journal of Educational Measurement, 2024
The Lognormal Response Time (LNRT) model measures the speed of test-takers relative to the normative time demands of items on a test. The resulting speed parameters and model residuals are often analyzed for evidence of anomalous test-taking behavior associated with fast and poorly fitting response time patterns. Extending this model, we…
Descriptors: Student Reaction, Reaction Time, Response Style (Tests), Test Items
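As a rough illustration of the baseline LNRT model mentioned in the entry above, the sketch below assumes the commonly used parameterization in which the log response time is normal with mean beta - tau and standard deviation 1/alpha (beta: item time intensity, tau: examinee speed, alpha: item time discrimination). It does not reproduce the extension proposed in the paper, and the function names are hypothetical.

```python
import math

def lnrt_residual(t, alpha, beta, tau):
    """Standardized residual of an observed response time t (in seconds).

    Assumes ln(t) ~ Normal(beta - tau, 1 / alpha**2).
    Negative values mean the response was faster than the model expects.
    """
    return alpha * (math.log(t) - (beta - tau))

def lnrt_log_density(t, alpha, beta, tau):
    """Log of the lognormal density of t under the same assumed model."""
    z = lnrt_residual(t, alpha, beta, tau)
    return (math.log(alpha) - math.log(t)
            - 0.5 * math.log(2 * math.pi) - 0.5 * z * z)

# Hypothetical item (alpha=2.0, beta=4.0) and examinee (tau=0.5)
expected_log_time = 4.0 - 0.5
on_pace = lnrt_residual(math.exp(expected_log_time), 2.0, 4.0, 0.5)
too_fast = lnrt_residual(math.exp(3.0), 2.0, 4.0, 0.5)
```

Here `on_pace` is zero because the observed time matches the model's expected log time, while `too_fast` is negative. Residuals of this kind are what analyses of fast and poorly fitting response time patterns typically inspect.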
Jia Liu; Xiangbin Meng; Gongjun Xu; Wei Gao; Ningzhong Shi – Journal of Educational Measurement, 2024
In this paper, we develop a mixed stochastic approximation expectation-maximization (MSAEM) algorithm coupled with a Gibbs sampler to compute the marginalized maximum a posteriori estimate (MMAPE) of a confirmatory multidimensional four-parameter normal ogive (M4PNO) model. The proposed MSAEM algorithm not only has the computational advantages of…
Descriptors: Algorithms, Achievement Tests, Foreign Countries, International Assessment