ERIC - Search Results

Showing all 9 results

Variational Estimation for Multidimensional Generalized Partial Credit Model

Peer reviewed

Chengyu Cui; Chun Wang; Gongjun Xu – Grantee Submission, 2024

Multidimensional item response theory (MIRT) models have generated increasing interest in the psychometrics literature. Efficient approaches for estimating MIRT models with dichotomous responses have been developed, but constructing an equally efficient and robust algorithm for polytomous models has received limited attention. To address this gap,…

Descriptors: Item Response Theory, Accuracy, Simulation, Psychometrics

Comparing the Robustness of Three Nonparametric DIF Procedures to Differential Rapid Guessing

Peer reviewed

Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022

When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…

Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis

Sensitivity of the RMSD for Detecting Item-Level Misfit in Low-Performing Countries

Peer reviewed

Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020

Although the root-mean squared deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity to detect misfit may depend strongly on the proficiency distribution of the considered…

Descriptors: Test Items, Goodness of Fit, Probability, Accuracy

On the Treatment of Missing Data in Background Questionnaires in Educational Large-Scale Assessments: An Evaluation of Different Procedures

Peer reviewed

Grund, Simon; Lüdtke, Oliver; Robitzsch, Alexander – Journal of Educational and Behavioral Statistics, 2021

Large-scale assessments (LSAs) use Mislevy's "plausible value" (PV) approach to relate student proficiency to noncognitive variables administered in a background questionnaire. This method requires background variables to be completely observed, a requirement that is seldom fulfilled. In this article, we evaluate and compare the…

Descriptors: Data Analysis, Error of Measurement, Research Problems, Statistical Inference

Correcting Measurement Error in Latent Regression Covariates via the MC-SIMEX Method

Peer reviewed

Rutkowski, Leslie; Zhou, Yan – Journal of Educational Measurement, 2015

Given the importance of large-scale assessments to educational policy conversations, it is critical that subpopulation achievement is estimated reliably and with sufficient precision. Despite this importance, biased subpopulation estimates have been found to occur when variables in the conditioning model side of a latent regression model contain…

Descriptors: Error of Measurement, Error Correction, Regression (Statistics), Computation

Phantom Effects in Multilevel Compositional Analysis: Problems and Solutions

Peer reviewed

Pokropek, Artur – Sociological Methods & Research, 2015

This article combines statistical and applied research perspective showing problems that might arise when measurement error in multilevel compositional effects analysis is ignored. This article focuses on data where independent variables are constructed measures. Simulation studies are conducted evaluating methods that could overcome the…

Descriptors: Error of Measurement, Hierarchical Linear Modeling, Simulation, Evaluation Methods

A Comparison of Linking Methods for Estimating National Trends in International Comparative Large-Scale Assessments in the Presence of Cross-national DIF

Peer reviewed

Sachse, Karoline A.; Roppelt, Alexander; Haag, Nicole – Journal of Educational Measurement, 2016

Trend estimation in international comparative large-scale assessments relies on measurement invariance between countries. However, cross-national differential item functioning (DIF) has been repeatedly documented. We ran a simulation study using national item parameters, which required trends to be computed separately for each country, to compare…

Descriptors: Comparative Analysis, Measurement, Test Bias, Simulation

Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

Wang, Wei – ProQuest LLC, 2013

Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

Descriptors: Equated Scores, Test Format, Test Items, Test Length

Detecting Differential Item Functioning Using Generalized Logistic Regression in the Context of Large-Scale Assessments

Peer reviewed

Svetina, Dubravka; Rutkowski, Leslie – Large-scale Assessments in Education, 2014

Background: When studying student performance across different countries or cultures, an important aspect for comparisons is that of score comparability. In other words, it is imperative that the latent variable (i.e., construct of interest) is understood and measured equivalently across all participating groups or countries, if our inferences…

Descriptors: Test Items, Item Response Theory, Item Analysis, Regression (Statistics)
