Chengyu Cui; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Multidimensional item response theory (MIRT) models have generated increasing interest in the psychometrics literature. Efficient approaches for estimating MIRT models with dichotomous responses have been developed, but constructing an equally efficient and robust algorithm for polytomous models has received limited attention. To address this gap,…
Descriptors: Item Response Theory, Accuracy, Simulation, Psychometrics
Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022
When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…
Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis
Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020
Although the root-mean squared deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity to detect misfit may depend strongly on the proficiency distribution of the considered…
Descriptors: Test Items, Goodness of Fit, Probability, Accuracy
Grund, Simon; Lüdtke, Oliver; Robitzsch, Alexander – Journal of Educational and Behavioral Statistics, 2021
Large-scale assessments (LSAs) use Mislevy's "plausible value" (PV) approach to relate student proficiency to noncognitive variables administered in a background questionnaire. This method requires background variables to be completely observed, a requirement that is seldom fulfilled. In this article, we evaluate and compare the…
Descriptors: Data Analysis, Error of Measurement, Research Problems, Statistical Inference
Rutkowski, Leslie; Zhou, Yan – Journal of Educational Measurement, 2015
Given the importance of large-scale assessments to educational policy conversations, it is critical that subpopulation achievement is estimated reliably and with sufficient precision. Despite this importance, biased subpopulation estimates have been found to occur when variables in the conditioning model side of a latent regression model contain…
Descriptors: Error of Measurement, Error Correction, Regression (Statistics), Computation
Pokropek, Artur – Sociological Methods & Research, 2015
This article combines statistical and applied research perspectives to show problems that might arise when measurement error in multilevel compositional effects analysis is ignored. This article focuses on data where independent variables are constructed measures. Simulation studies are conducted evaluating methods that could overcome the…
Descriptors: Error of Measurement, Hierarchical Linear Modeling, Simulation, Evaluation Methods
Sachse, Karoline A.; Roppelt, Alexander; Haag, Nicole – Journal of Educational Measurement, 2016
Trend estimation in international comparative large-scale assessments relies on measurement invariance between countries. However, cross-national differential item functioning (DIF) has been repeatedly documented. We ran a simulation study using national item parameters, which required trends to be computed separately for each country, to compare…
Descriptors: Comparative Analysis, Measurement, Test Bias, Simulation
Wang, Wei – ProQuest LLC, 2013
Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…
Descriptors: Equated Scores, Test Format, Test Items, Test Length
Svetina, Dubravka; Rutkowski, Leslie – Large-scale Assessments in Education, 2014
Background: When studying student performance across different countries or cultures, an important aspect for comparisons is that of score comparability. In other words, it is imperative that the latent variable (i.e., construct of interest) is understood and measured equivalently across all participating groups or countries, if our inferences…
Descriptors: Test Items, Item Response Theory, Item Analysis, Regression (Statistics)