ERIC - Search Results

Showing all 4 results

Using Automated Procedures to Score Educational Essays Written in Three Languages

Peer reviewed

Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025

The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…

Descriptors: College Students, Slavic Languages, German, Italian

Evaluating the Consistency and Reliability of Attribution Methods in Automated Short Answer Grading (ASAG) Systems: Toward an Explainable Scoring System

Peer reviewed

Wallace N. Pinto Jr.; Jinnie Shin – Journal of Educational Measurement, 2025

In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates…

Descriptors: Automation, Grading, Computer Assisted Testing, Scoring

Development of Information Functions and Indices for the GGUM-RANK Multidimensional Forced Choice IRT Model

Peer reviewed

Joo, Seang-Hwane; Lee, Philseok; Stark, Stephen – Journal of Educational Measurement, 2018

This research derived information functions and proposed new scalar information indices to examine the quality of multidimensional forced choice (MFC) items based on the RANK model. We also explored how GGUM-RANK information, latent trait recovery, and reliability varied across three MFC formats: pairs (two response alternatives), triplets (three…

Descriptors: Item Response Theory, Models, Item Analysis, Reliability

Hybrid Computerized Adaptive Testing: From Group Sequential Design to Fully Sequential Design

Peer reviewed

Wang, Shiyu; Lin, Haiyan; Chang, Hua-Hua; Douglas, Jeff – Journal of Educational Measurement, 2016

Computerized adaptive testing (CAT) and multistage testing (MST) have become two of the most popular modes in large-scale computer-based sequential testing. Though most designs of CAT and MST exhibit strength and weakness in recent large-scale implementations, there is no simple answer to the question of which design is better because different…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Format, Sequential Approach