ERIC - Search Results

Publication Date

In 2025	1
Since 2024	3
Since 2021 (last 5 years)	4
Since 2016 (last 10 years)	10
Since 2006 (last 20 years)	16

Source

Journal of Educational…

Publication Type

Journal Articles	18
Reports - Research	12
Reports - Evaluative	4
Reports - Descriptive	2

Education Level

Secondary Education	2
Higher Education	1
Postsecondary Education	1

Audience

Location

Laws, Policies, & Programs

Assessments and Surveys

Program for International…	2
Graduate Record Examinations	1

What Works Clearinghouse Rating

Showing 1 to 15 of 21 results Save | Export

Model Selection Posterior Predictive Model Checking via Limited-Information Indices for Bayesian Diagnostic Classification Modeling

Peer reviewed

Direct link

Jihong Zhang; Jonathan Templin; Xinya Liang – Journal of Educational Measurement, 2024

Recently, Bayesian diagnostic classification modeling has been becoming popular in health psychology, education, and sociology. Typically information criteria are used for model selection when researchers want to choose the best model among alternative models. In Bayesian estimation, posterior predictive checking is a flexible Bayesian model…

Descriptors: Bayesian Statistics, Cognitive Measurement, Models, Classification

IRT Observed-Score Equating for Rater-Mediated Assessments Using a Hierarchical Rater Model

Peer reviewed

Direct link

Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025

While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…

Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity

Detecting Differential Item Functioning Using Posterior Predictive Model Checking: A Comparison of Discrepancy Statistics

Peer reviewed

Direct link

Joo, Seang-Hwane; Lee, Philseok – Journal of Educational Measurement, 2022

Abstract This study proposes a new Bayesian differential item functioning (DIF) detection method using posterior predictive model checking (PPMC). Item fit measures including infit, outfit, observed score distribution (OSD), and Q1 were considered as discrepancy statistics for the PPMC DIF methods. The performance of the PPMC DIF method was…

Descriptors: Test Items, Bayesian Statistics, Monte Carlo Methods, Prediction

DIF Detection for Multiple Groups: Comparing Three-Level GLMMs and Multiple-Group IRT Models

Peer reviewed

Direct link

Carmen Köhler; Lale Khorramdel; Artur Pokropek; Johannes Hartig – Journal of Educational Measurement, 2024

For assessment scales applied to different groups (e.g., students from different states; patients in different countries), multigroup differential item functioning (MG-DIF) needs to be evaluated in order to ensure that respondents with the same trait level but from different groups have equal response probabilities on a particular item. The…

Descriptors: Measures (Individuals), Test Bias, Models, Item Response Theory

Two IRT Fixed Parameter Calibration Methods for the Bifactor Model

Peer reviewed

Direct link

Kim, Kyung Yong – Journal of Educational Measurement, 2020

New items are often evaluated prior to their operational use to obtain item response theory (IRT) item parameter estimates for quality control purposes. Fixed parameter calibration is one linking method that is widely used to estimate parameters for new items and place them on the desired scale. This article provides detailed descriptions of two…

Descriptors: Item Response Theory, Evaluation Methods, Test Items, Simulation

Pedagogical Considerations for Examining Rater Variability in Rater-Mediated Assessments: A Three-Model Framework

Peer reviewed

Direct link

Wesolowski, Brian C.; Wind, Stefanie A. – Journal of Educational Measurement, 2019

Rater-mediated assessments are a common methodology for measuring persons, investigating rater behavior, and/or defining latent constructs. The purpose of this article is to provide a pedagogical framework for examining rater variability in the context of rater-mediated assessments using three distinct models. The first model is the observation…

Descriptors: Interrater Reliability, Models, Observation, Measurement

The Effects of Incomplete Rating Designs in Combination with Rater Effects

Peer reviewed

Direct link

Wind, Stefanie A.; Jones, Eli – Journal of Educational Measurement, 2019

Researchers have explored a variety of topics related to identifying and distinguishing among specific types of rater effects, as well as the implications of different types of incomplete data collection designs for rater-mediated assessments. In this study, we used simulated data to examine the sensitivity of latent trait model indicators of…

Descriptors: Rating Scales, Models, Evaluators, Data Collection

Scale Alignment in Between-Item Multidimensional Rasch Models

Peer reviewed

Direct link

Feuerstahler, Leah; Wilson, Mark – Journal of Educational Measurement, 2019

Scores estimated from multidimensional item response theory (IRT) models are not necessarily comparable across dimensions. In this article, the concept of aligned dimensions is formalized in the context of Rasch models, and two methods are described--delta dimensional alignment (DDA) and logistic regression alignment (LRA)--to transform estimated…

Descriptors: Item Response Theory, Models, Scores, Comparative Analysis

Assessment of Differential Item Functioning under Cognitive Diagnosis Models: The DINA Model Example

Peer reviewed

Direct link

Li, Xiaomin; Wang, Wen-Chung – Journal of Educational Measurement, 2015

The assessment of differential item functioning (DIF) is routinely conducted to ensure test fairness and validity. Although many DIF assessment methods have been developed in the context of classical test theory and item response theory, they are not applicable for cognitive diagnosis models (CDMs), as the underlying latent attributes of CDMs are…

Descriptors: Test Bias, Models, Cognitive Measurement, Evaluation Methods

Mapping an Experiment-Based Assessment of Collaborative Behavior onto Collaborative Problem Solving in PISA 2015: A Cluster Analysis Approach for Collaborator Profiles

Peer reviewed

Direct link

Herborn, Katharina; Mustafic, Maida; Greiff, Samuel – Journal of Educational Measurement, 2017

Collaborative problem solving (CPS) assessment is a new academic research field with a number of educational implications. In 2015, the Programme for International Student Assessment (PISA) assessed CPS with a computer-simulated human-agent (H-A) approach that claimed to measure 12 individual CPS skills for the first time. After reviewing the…

Descriptors: Cooperative Learning, Problem Solving, Computer Simulation, Evaluation Methods

Modeling Data from Collaborative Assessments: Learning in Digital Interactive Social Networks

Peer reviewed

Direct link

Wilson, Mark; Gochyyev, Perman; Scalise, Kathleen – Journal of Educational Measurement, 2017

This article summarizes assessment of cognitive skills through collaborative tasks, using field test results from the Assessment and Teaching of 21st Century Skills (ATC21S) project. This project, sponsored by Cisco, Intel, and Microsoft, aims to help educators around the world enable students with the skills to succeed in future career and…

Descriptors: Cognitive Ability, Thinking Skills, Evaluation Methods, Educational Assessment

Differential Item Functioning Assessment in Cognitive Diagnostic Modeling: Application of the Wald Test to Investigate DIF in the DINA Model

Peer reviewed

Direct link

Hou, Likun; de la Torre, Jimmy; Nandakumar, Ratna – Journal of Educational Measurement, 2014

Analyzing examinees' responses using cognitive diagnostic models (CDMs) has the advantage of providing diagnostic information. To ensure the validity of the results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this article, the Wald test is proposed to examine DIF in the context of CDMs. This study…

Descriptors: Test Bias, Models, Simulation, Error Patterns

Multilevel Modeling of Item Position Effects

Peer reviewed

Direct link

Albano, Anthony D. – Journal of Educational Measurement, 2013

In many testing programs it is assumed that the context or position in which an item is administered does not have a differential effect on examinee responses to the item. Violations of this assumption may bias item response theory estimates of item and person parameters. This study examines the potentially biasing effects of item position. A…

Descriptors: Test Items, Item Response Theory, Test Format, Questioning Techniques

An Empirically Based Method of Q-Matrix Validation for the DINA Model: Development and Applications

Peer reviewed

Direct link

de la Torre, Jimmy – Journal of Educational Measurement, 2008

Most model fit analyses in cognitive diagnosis assume that a Q matrix is correct after it has been constructed, without verifying its appropriateness. Consequently, any model misfit attributable to the Q matrix cannot be addressed and remedied. To address this concern, this paper proposes an empirically based method of validating a Q matrix used…

Descriptors: Matrices, Validity, Models, Evaluation Methods

Model-Free CUSUM Methods for Person Fit

Peer reviewed

Direct link

Armstrong, Ronald D.; Shi, Min – Journal of Educational Measurement, 2009

This article demonstrates the use of a new class of model-free cumulative sum (CUSUM) statistics to detect person fit given the responses to a linear test. The fundamental statistic being accumulated is the likelihood ratio of two probabilities. The detection performance of this CUSUM scheme is compared to other model-free person-fit statistics…

Descriptors: Probability, Simulation, Models, Psychometrics

Previous Page | Next Page »

Pages: 1 | 2

Evaluation Methods	21
Models	21
Item Response Theory	8
Simulation	7
Test Items	6
Test Bias	5
Item Analysis	4
Cognitive Measurement	3
Comparative Analysis	3
Computer Simulation	3
Educational Assessment	3
Error Patterns	3
Error of Measurement	3
Factor Analysis	3
Hierarchical Linear Modeling	3
Student Evaluation	3
Achievement Tests	2
Bayesian Statistics	2
Classification	2
Computation	2
Cooperative Learning	2
Data Analysis	2
Data Collection	2
Educational Testing	2
Evaluators	2
More ▼

Nandakumar, Ratna	2
Wilson, Mark	2
Wind, Stefanie A.	2
de la Torre, Jimmy	2
Albano, Anthony D.	1
Armstrong, Ronald D.	1
Artur Pokropek	1
Biggs, J. B.	1
Braun, P. H.	1
Carl Westine	1
Carmen Köhler	1
Clauser, Brian E.	1
Emrick, John A.	1
Feuerstahler, Leah	1
Gochyyev, Perman	1
Greiff, Samuel	1
Herborn, Katharina	1
Hou, Likun	1
Jihong Zhang	1
Johannes Hartig	1
Jonathan Templin	1
Jones, Eli	1
Joo, Seang-Hwane	1
Kim, Kyung Yong	1
More ▼