Showing 1 to 15 of 27 results
Peer reviewed
PDF on ERIC: Download full text
Lundgren, Erik – Journal of Educational Data Mining, 2022
Response process data have the potential to provide a rich description of test-takers' thinking processes. However, retrieving insights from these data presents a challenge for educational assessment and educational data mining, as the data are complex and not well annotated. The present study addresses this challenge by developing a computational…
Descriptors: Problem Solving, Classification, Accuracy, Foreign Countries
Peer reviewed
Direct link
Robitzsch, Alexander; Lüdtke, Oliver – Large-scale Assessments in Education, 2023
One major aim of international large-scale assessments (ILSA) like PISA is to monitor changes in student performance over time. To accomplish this task, a set of common items (i.e., link items) is repeatedly administered in each assessment. Linking methods based on item response theory (IRT) models are used to align the results from the different…
Descriptors: Educational Trends, Trend Analysis, International Assessment, Achievement Tests
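Not from the paper, but as background for the linking step the abstract describes: a minimal sketch of mean-mean linking under a Rasch-type model, where a single additive constant aligns two separate calibrations of the common (link) items. All difficulty values below are made-up illustration data.

```python
import numpy as np

# Hypothetical difficulty estimates for the same link items,
# calibrated separately in two assessment cycles (illustrative values).
b_cycle1 = np.array([-0.42, 0.10, 0.85, -1.20, 0.33])
b_cycle2 = np.array([-0.30, 0.25, 0.95, -1.02, 0.47])

# Mean-mean linking: the shift that puts cycle-2 parameters on the
# cycle-1 scale is the difference of the mean link-item difficulties.
shift = b_cycle1.mean() - b_cycle2.mean()
b_cycle2_linked = b_cycle2 + shift

print(f"linking constant: {shift:+.3f}")
# Person estimates from cycle 2 are shifted by the same constant,
# which is what makes performance trends comparable over time.
```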
Peer reviewed
Direct link
Chengyu Cui; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Multidimensional item response theory (MIRT) models have generated increasing interest in the psychometrics literature. Efficient approaches for estimating MIRT models with dichotomous responses have been developed, but the construction of an equally efficient and robust algorithm for polytomous models has received limited attention. To address this gap,…
Descriptors: Item Response Theory, Accuracy, Simulation, Psychometrics
Jing Lu; Chun Wang; Jiwei Zhang; Xue Wang – Grantee Submission, 2023
In statistical inference, changepoints are abrupt variations in a sequence of data. In educational and psychological assessments, it is pivotal to properly differentiate examinees' aberrant behaviors from solution behavior to ensure test reliability and validity. In this paper, we propose a sequential Bayesian changepoint detection algorithm to…
Descriptors: Bayesian Statistics, Behavior Patterns, Computer Assisted Testing, Accuracy
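The authors' algorithm is not spelled out in the snippet; as a generic illustration of Bayesian changepoint detection, here is a textbook single-changepoint model (known means and variance, uniform prior on the changepoint location) applied to simulated log response times. Everything here is an assumed simplification, not the paper's method.

```python
import numpy as np
from scipy.stats import norm

def changepoint_posterior(x, mu0, mu1, sigma):
    """Posterior over the location of a single changepoint tau,
    assuming x[:tau] ~ N(mu0, sigma) and x[tau:] ~ N(mu1, sigma)
    with a uniform prior on tau (a textbook model, not the
    authors' algorithm)."""
    n = len(x)
    log_post = np.empty(n + 1)
    for tau in range(n + 1):
        log_post[tau] = (norm.logpdf(x[:tau], mu0, sigma).sum()
                         + norm.logpdf(x[tau:], mu1, sigma).sum())
    log_post -= log_post.max()          # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Illustrative log response times: solution behavior (slow) followed
# by aberrant rapid responding after item 12 (simulated data).
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(3.0, 0.5, 12), rng.normal(1.0, 0.5, 8)])
post = changepoint_posterior(x, mu0=3.0, mu1=1.0, sigma=0.5)
print("most probable changepoint:", post.argmax())
```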
Peer reviewed
Direct link
Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022
When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…
Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis
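For readers unfamiliar with RG flagging: it is typically done with response-time thresholds. Below is a minimal sketch of one common rule, a per-item threshold at 10% of the item's mean response time; the rule choice and all data are illustrative assumptions, not the study's design.

```python
import numpy as np

def flag_rapid_guesses(rt, frac=0.10):
    """Flag responses as rapid guesses when their response time falls
    below a per-item threshold set at `frac` of that item's mean RT
    (one common normative-threshold rule; other rules exist)."""
    rt = np.asarray(rt, dtype=float)      # persons x items
    thresholds = frac * rt.mean(axis=0)   # one threshold per item
    return rt < thresholds                # boolean flag matrix

# Illustrative response times in seconds (5 persons x 4 items).
rt = np.array([[30.1, 42.0, 18.5, 55.0],
               [ 2.1, 39.5, 20.0, 49.3],   # person 2 rushed item 1
               [28.7, 41.2,  1.4, 52.8],   # person 3 rushed item 3
               [31.9, 40.8, 19.2, 50.1],
               [29.4,  3.0, 21.1, 53.7]])  # person 5 rushed item 2
print(flag_rapid_guesses(rt))
```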
Peer reviewed
Direct link
Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020
Although the root mean square deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity to detect misfit may depend strongly on the proficiency distribution of the considered…
Descriptors: Test Items, Goodness of Fit, Probability, Accuracy
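The sensitivity argument can be seen numerically: RMSD weights the squared differences between the observed and model item characteristic curves by the proficiency density, so the same item misfit yields different RMSD values for differently located proficiency distributions. A minimal sketch with assumed 2PL curves (all parameter values are illustrative):

```python
import numpy as np
from scipy.stats import norm

def p_2pl(theta, a, b):
    """2PL item response function."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-4, 4, 401)
p_model = p_2pl(theta, a=1.2, b=0.0)   # international parameters
p_obs = p_2pl(theta, a=1.2, b=0.4)     # country-specific curve with DIF

def rmsd(p_obs, p_model, density):
    w = density / density.sum()        # proficiency weights
    return np.sqrt(np.sum(w * (p_obs - p_model) ** 2))

# Same item-level misfit, two proficiency distributions: a country
# whose density sits where the curves differ shows a larger RMSD.
for mu in (0.0, -2.0):
    w = norm.pdf(theta, loc=mu, scale=1.0)
    print(f"country mean {mu:+.1f}: RMSD = {rmsd(p_obs, p_model, w):.4f}")
```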
Peer reviewed
Direct link
Monroe, Scott – Journal of Educational and Behavioral Statistics, 2021
This research proposes a new statistic for testing latent variable distribution fit for unidimensional item response theory (IRT) models. If the typical assumption of normality is violated, then item parameter estimates will be biased, and dependent quantities such as IRT score estimates will be adversely affected. The proposed statistic compares…
Descriptors: Item Response Theory, Simulation, Scores, Comparative Analysis
Peer reviewed
Direct link
Rutkowski, David; Rutkowski, Leslie; Liaw, Yuan-Ling – Educational Measurement: Issues and Practice, 2018
Participation in international large-scale assessments has grown over time, with the largest, the Programme for International Student Assessment (PISA), including more than 70 education systems that are economically and educationally diverse. To help accommodate large achievement differences among participants, in 2009 PISA offered…
Descriptors: Educational Assessment, Foreign Countries, Achievement Tests, Secondary School Students
Peer reviewed
Direct link
Yang, Ji Seung; Zheng, Xiaying – Journal of Educational and Behavioral Statistics, 2018
The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package available as of Stata version 14 (2015). Using a simulated data set and a publicly available item response data set extracted from the Programme for International Student Assessment, we review the IRT package from…
Descriptors: Item Response Theory, Item Analysis, Computer Software, Statistical Analysis
Peer reviewed
Direct link
Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2017
This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…
Descriptors: Psychometrics, Test Items, Item Response Theory, Hypothesis Testing
Peer reviewed
Direct link
Rutkowski, Leslie; Rutkowski, David; Zhou, Yan – International Journal of Testing, 2016
Using an empirically based simulation study, we show that typically used methods of choosing an item calibration sample have significant impacts on achievement bias and system rankings. We examine whether recent PISA accommodations, especially for lower performing participants, can mitigate some of this bias. Our findings indicate that standard…
Descriptors: Simulation, International Programs, Adolescents, Student Evaluation
Peer reviewed
Direct link
Debeer, Dries; Janssen, Rianne; De Boeck, Paul – Journal of Educational Measurement, 2017
When dealing with missing responses, two types of omissions can be discerned: items can be skipped or not reached by the test taker. When the occurrence of these omissions is related to the proficiency process, the missingness is nonignorable. The purpose of this article is to present a tree-based IRT framework for modeling responses and omissions…
Descriptors: Item Response Theory, Test Items, Responses, Testing Problems
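In tree-based IRT frameworks of this kind, each observed outcome is typically decomposed into sequential pseudo-items that standard IRT software can then fit jointly. A minimal sketch of that recoding step for a two-node tree (respond vs. omit, then correct vs. incorrect); the node structure is a generic example, not necessarily the authors' exact specification.

```python
import numpy as np

def irtree_recode(resp):
    """Recode a persons x items matrix (1 = correct, 0 = incorrect,
    NaN = omitted) into two pseudo-item matrices:
      node 1: 1 = responded, 0 = omitted
      node 2: the original score, defined only when a response was given.
    """
    resp = np.asarray(resp, dtype=float)
    omitted = np.isnan(resp)
    node1 = (~omitted).astype(float)         # response propensity
    node2 = np.where(omitted, np.nan, resp)  # accuracy, given a response
    return node1, node2

resp = np.array([[1.0, 0.0, np.nan],
                 [np.nan, 1.0, 1.0]])
node1, node2 = irtree_recode(resp)
# Fitting an IRT model jointly to node1 and node2 lets the omission
# process correlate with proficiency, so the missingness is modeled
# rather than ignored.
print(node1)
print(node2)
```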
Peer reviewed
Direct link
Hecht, Martin; Weirich, Sebastian; Siegle, Thilo; Frey, Andreas – Educational and Psychological Measurement, 2015
The selection of an appropriate booklet design is an important element of large-scale assessments of student achievement. Two design properties that are typically optimized are the "balance" with respect to the positions at which items are presented and with respect to the mutual occurrence of pairs of items in the same booklet. The purpose…
Descriptors: Measurement, Computation, Test Format, Test Items
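Both balance properties can be checked for a candidate design by simple counting: how often each item block appears at each booklet position, and how often each pair of blocks shares a booklet. A minimal sketch with a small hypothetical design (illustrative only):

```python
from itertools import combinations
from collections import Counter

# Hypothetical booklet design: each booklet is an ordered list of
# item blocks (a Youden-square-like layout).
booklets = [
    ["A", "B", "C"],
    ["B", "C", "D"],
    ["C", "D", "A"],
    ["D", "A", "B"],
]

# Position balance: how often each block appears at each position.
position_counts = Counter(
    (block, pos) for bk in booklets for pos, block in enumerate(bk)
)

# Pairwise balance: how often each pair of blocks shares a booklet.
pair_counts = Counter(
    pair for bk in booklets for pair in combinations(sorted(bk), 2)
)

# This design is perfectly balanced: every block appears once at each
# position, and every pair of blocks co-occurs in exactly two booklets.
print(position_counts)
print(pair_counts)
```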
Peer reviewed
Direct link
Jin, Ying; Kang, Minsoo – Large-scale Assessments in Education, 2016
Background: The current study compared four differential item functioning (DIF) methods to examine their performance in accounting for dual dependency (i.e., person and item clustering effects) simultaneously in a simulation study; this dual dependency has not been sufficiently studied in the current DIF literature. The four methods compared are logistic…
Descriptors: Comparative Analysis, Test Bias, Simulation, Regression (Statistics)
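One of the four methods named in the snippet, logistic regression DIF, tests whether group membership predicts item success after conditioning on a matching score. The sketch below is the standard single-level version on simulated data; the clustering corrections the study compares are not shown, and all data and effect sizes are made up.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(7)
n = 2000
group = rng.integers(0, 2, n)      # 0 = reference, 1 = focal
score = rng.normal(0, 1, n)        # matching variable (total-score proxy)

# Simulate uniform DIF: the item is 0.5 logits harder for the focal group.
logit = 1.0 * score - 0.5 * group
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X0 = sm.add_constant(np.column_stack([score]))         # score only
X1 = sm.add_constant(np.column_stack([score, group]))  # + group effect

m0 = sm.Logit(y, X0).fit(disp=0)
m1 = sm.Logit(y, X1).fit(disp=0)

# Likelihood-ratio test for uniform DIF (1 df); nonuniform DIF would
# add a score-by-group interaction term in a third model.
lr = 2 * (m1.llf - m0.llf)
print(f"LR = {lr:.2f}, p = {chi2.sf(lr, df=1):.4f}")
```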
Peer reviewed
Direct link
Sachse, Karoline A.; Roppelt, Alexander; Haag, Nicole – Journal of Educational Measurement, 2016
Trend estimation in international comparative large-scale assessments relies on measurement invariance between countries. However, cross-national differential item functioning (DIF) has been repeatedly documented. We ran a simulation study using national item parameters, which required trends to be computed separately for each country, to compare…
Descriptors: Comparative Analysis, Measurement, Test Bias, Simulation