ERIC - Search Results

Showing 1 to 15 of 62 results

A Nonparametric Composite Group DIF Index for Focal Groups Stemming from Multicategorical Variables

Peer reviewed

Corinne Huggins-Manley; Anthony W. Raborn; Peggy K. Jones; Ted Myers – Journal of Educational Measurement, 2024

The purpose of this study is to develop a nonparametric DIF method that (a) compares focal groups directly to the composite group that will be used to develop the reported test score scale, and (b) allows practitioners to explore for DIF related to focal groups stemming from multicategorical variables that constitute a small proportion of the…

Descriptors: Nonparametric Statistics, Test Bias, Scores, Statistical Significance

A New Bayesian Person-Fit Analysis Method Using Pivotal Discrepancy Measures

Peer reviewed

Combs, Adam – Journal of Educational Measurement, 2023

A common method of checking person-fit in Bayesian item response theory (IRT) is the posterior-predictive (PP) method. In recent years, more powerful approaches have been proposed that are based on resampling methods using the popular L*[subscript z] statistic. There has also been proposed a new Bayesian model checking method based on pivotal…

Descriptors: Bayesian Statistics, Goodness of Fit, Evaluation Methods, Monte Carlo Methods

An Exponentially Weighted Moving Average Procedure for Detecting Back Random Responding Behavior

Peer reviewed

He, Yinhong – Journal of Educational Measurement, 2023

Back random responding (BRR) behavior is one of the commonly observed careless response behaviors. Accurately detecting BRR behavior can improve test validities. Yu and Cheng (2019) showed that the change point analysis (CPA) procedure based on weighted residual (CPA-WR) performed well in detecting BRR. Compared with the CPA procedure, the…

Descriptors: Test Validity, Item Response Theory, Measurement, Monte Carlo Methods

Detecting Differential Item Functioning Using Posterior Predictive Model Checking: A Comparison of Discrepancy Statistics

Peer reviewed

Joo, Seang-Hwane; Lee, Philseok – Journal of Educational Measurement, 2022

This study proposes a new Bayesian differential item functioning (DIF) detection method using posterior predictive model checking (PPMC). Item fit measures including infit, outfit, observed score distribution (OSD), and Q1 were considered as discrepancy statistics for the PPMC DIF methods. The performance of the PPMC DIF method was…

Descriptors: Test Items, Bayesian Statistics, Monte Carlo Methods, Prediction

Argument-Based Approach to Validity: Developing a Living Document and Incorporating Preregistration

Peer reviewed

Daria Gerasimova – Journal of Educational Measurement, 2024

I propose two practical advances to the argument-based approach to validity: developing a living document and incorporating preregistration. First, I present a potential structure for the living document that includes an up-to-date summary of the validity argument. As the validation process may span across multiple studies, the living document…

Descriptors: Validity, Documentation, Methods, Research Reports

Model Selection Posterior Predictive Model Checking via Limited-Information Indices for Bayesian Diagnostic Classification Modeling

Peer reviewed

Jihong Zhang; Jonathan Templin; Xinya Liang – Journal of Educational Measurement, 2024

Recently, Bayesian diagnostic classification modeling has been becoming popular in health psychology, education, and sociology. Typically information criteria are used for model selection when researchers want to choose the best model among alternative models. In Bayesian estimation, posterior predictive checking is a flexible Bayesian model…

Descriptors: Bayesian Statistics, Cognitive Measurement, Models, Classification

Measuring the Uncertainty of Imputed Scores

Peer reviewed

Sinharay, Sandip – Journal of Educational Measurement, 2023

Technical difficulties and other unforeseen events occasionally lead to incomplete data on educational tests, which necessitates the reporting of imputed scores to some examinees. While there exist several approaches for reporting imputed scores, there is a lack of any guidance on the reporting of the uncertainty of imputed scores. In this paper,…

Descriptors: Evaluation Methods, Scores, Standardized Tests, Simulation

Using Simulated Retests to Estimate the Reliability of Diagnostic Assessment Systems

Peer reviewed

Thompson, W. Jake; Nash, Brooke; Clark, Amy K.; Hoover, Jeffrey C. – Journal of Educational Measurement, 2023

As diagnostic classification models become more widely used in large-scale operational assessments, we must give consideration to the methods for estimating and reporting reliability. Researchers must explore alternatives to traditional reliability methods that are consistent with the design, scoring, and reporting levels of diagnostic assessment…

Descriptors: Diagnostic Tests, Simulation, Test Reliability, Accuracy

Fully Gibbs Sampling Algorithms for Bayesian Variable Selection in Latent Regression Models

Peer reviewed

Yamaguchi, Kazuhiro; Zhang, Jihong – Journal of Educational Measurement, 2023

This study proposed Gibbs sampling algorithms for variable selection in a latent regression model under a unidimensional two-parameter logistic item response theory model. Three types of shrinkage priors were employed to obtain shrinkage estimates: double-exponential (i.e., Laplace), horseshoe, and horseshoe+ priors. These shrinkage priors were…

Descriptors: Algorithms, Simulation, Mathematics Achievement, Bayesian Statistics

Using Multiple Maximum Exposure Rates in Computerized Adaptive Testing

Peer reviewed

Kylie Gorney; Mark D. Reckase – Journal of Educational Measurement, 2025

In computerized adaptive testing, item exposure control methods are often used to provide a more balanced usage of the item pool. Many of the most popular methods, including the restricted method (Revuelta and Ponsoda), use a single maximum exposure rate to limit the proportion of times that each item is administered. However, Barrada et al.…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Item Banks

A Unified Comparison of IRT-Based Effect Sizes for DIF Investigations

Peer reviewed

Chalmers, R. Philip – Journal of Educational Measurement, 2023

Several marginal effect size (ES) statistics suitable for quantifying the magnitude of differential item functioning (DIF) have been proposed in the area of item response theory; for instance, the Differential Functioning of Items and Tests (DFIT) statistics, signed and unsigned item difference in the sample statistics (SIDS, UIDS, NSIDS, and…

Descriptors: Test Bias, Item Response Theory, Definitions, Monte Carlo Methods

Estimating Classification Accuracy and Consistency Indices for Multiple Measures with the Simple Structure MIRT Model

Peer reviewed

Park, Seohee; Kim, Kyung Yong; Lee, Won-Chan – Journal of Educational Measurement, 2023

Multiple measures, such as multiple content domains or multiple types of performance, are used in various testing programs to classify examinees for screening or selection. Despite the popular usages of multiple measures, there is little research on classification consistency and accuracy of multiple measures. Accordingly, this study introduces an…

Descriptors: Testing, Computation, Classification, Accuracy

Several Variations of Simple-Structure MIRT Equating

Peer reviewed

Kim, Stella Y.; Lee, Won-Chan – Journal of Educational Measurement, 2023

The current study proposed several variants of simple-structure multidimensional item response theory equating procedures. Four distinct sets of data were used to demonstrate feasibility of proposed equating methods for two different equating designs: a random groups design and a common-item nonequivalent groups design. Findings indicated some…

Descriptors: Item Response Theory, Equated Scores, Monte Carlo Methods, Research Methodology

Validating Performance Standards via Latent Class Analysis

Peer reviewed

Binici, Salih; Cuhadar, Ismail – Journal of Educational Measurement, 2022

Validity of performance standards is a key element for the defensibility of standard setting results, and validating performance standards requires collecting multiple pieces of evidence at every step during the standard setting process. This study employs a statistical procedure, latent class analysis, to set performance standards and compares…

Descriptors: Validity, Performance, Standards, Multivariate Analysis

IRT Observed-Score Equating for Rater-Mediated Assessments Using a Hierarchical Rater Model

Peer reviewed

Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025

While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…

Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity
