ERIC - Search Results

Publication Date

In 2025	4
Since 2024	8

Source

Journal of Educational…

Author

Amery D. Wu	1
Carl Westine	1
Carolin Hahnel	1
Daria Gerasimova	1
Frank Goldhammer	1
Hamid Mohammadi	1
Jake Stone	1
Johannes Naumann	1
Joni M. Lakin	1
Kaiwen Man	1
Kylie Gorney	1
Mark J. Gierl	1
Michelle Boyer	1
Paul De Boeck	1
Sandip Sinharay	1
Seung W. Choi	1
Shun-Fu Hu	1
Sooyong Lee	1
Stella Y. Kim	1
Suhwa Han	1
Tahereh Firoozi	1
Tong Wu	1
Ulf Kroehne	1

Publication Type

Journal Articles	8
Reports - Research	7
Reports - Descriptive	1

Education Level

Higher Education	2
Postsecondary Education	2
Elementary Education	1
Junior High Schools	1
Middle Schools	1
Secondary Education	1

Showing all 8 results

Argument-Based Approach to Validity: Developing a Living Document and Incorporating Preregistration

Peer reviewed

Daria Gerasimova – Journal of Educational Measurement, 2024

I propose two practical advances to the argument-based approach to validity: developing a living document and incorporating preregistration. First, I present a potential structure for the living document that includes an up-to-date summary of the validity argument. As the validation process may span across multiple studies, the living document…

Descriptors: Validity, Documentation, Methods, Research Reports

A Bayesian Moderated Nonlinear Factor Analysis Approach for DIF Detection under Violation of the Equal Variance Assumption

Peer reviewed

Sooyong Lee; Suhwa Han; Seung W. Choi – Journal of Educational Measurement, 2024

Research has shown that multiple-indicator multiple-cause (MIMIC) models can result in inflated Type I error rates in detecting differential item functioning (DIF) when the assumption of equal latent variance is violated. This study explains how the violation of the equal variance assumption adversely impacts the detection of nonuniform DIF and…

Descriptors: Factor Analysis, Bayesian Statistics, Test Bias, Item Response Theory

IRT Observed-Score Equating for Rater-Mediated Assessments Using a Hierarchical Rater Model

Peer reviewed

Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025

While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…

Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity

Using Automated Procedures to Score Educational Essays Written in Three Languages

Peer reviewed

Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025

The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…

Descriptors: College Students, Slavic Languages, German, Italian

Does Timed Testing Affect the Interpretation of Efficiency Scores?--A GLMM Analysis of Reading Components

Peer reviewed

Frank Goldhammer; Ulf Kroehne; Carolin Hahnel; Johannes Naumann; Paul De Boeck – Journal of Educational Measurement, 2024

The efficiency of cognitive component skills is typically assessed with speeded performance tests. Interpreting only effective ability or effective speed as efficiency may be challenging because of the within-person dependency between both variables (speed-ability tradeoff, SAT). The present study measures efficiency as effective ability…

Descriptors: Timed Tests, Efficiency, Scores, Test Interpretation

A Note on the Use of Categorical Subscores

Peer reviewed

Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025

Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…

Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment

An Exploratory Study Using Innovative Graphical Network Analysis to Model Eye Movements in Spatial Reasoning Problem Solving

Peer reviewed

Kaiwen Man; Joni M. Lakin – Journal of Educational Measurement, 2024

Eye-tracking procedures generate copious process data that could be valuable in establishing the response processes component of modern validity theory. However, there is a lack of tools for assessing and visualizing response processes using process data such as eye-tracking fixation sequences, especially those suitable for young children. This…

Descriptors: Problem Solving, Spatial Ability, Task Analysis, Network Analysis

Using Multilabel Neural Network to Score High-Dimensional Assessments for Different Use Foci: An Example with College Major Preference Assessment

Peer reviewed

Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025

Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…

Descriptors: Tests, Testing, Scores, Test Construction

Descriptor

Evaluation Methods	4
Test Validity	4
Scores	3
Test Reliability	3
Validity	3
Accuracy	2
Bias	2
Item Response Theory	2
Models	2
Test Interpretation	2
Tests	2
Alternative Assessment	1
Assessment Literacy	1
Bayesian Statistics	1
Cognitive Processes	1
College Students	1
Comparative Analysis	1
Comparative Testing	1
Computer Assisted Testing	1
Construct Validity	1
Data Collection	1
Difficulty Level	1
Documentation	1
Efficiency	1
Elementary School Students	1