ERIC - Search Results

Showing all 4 results

Establishing a Crosswalk between the Common European Framework for Languages (CEFR) and Writing Domains Scored by Automated Essay Scoring

Peer reviewed

Shermis, Mark D. – Applied Measurement in Education, 2018

This article employs the Common European Framework Reference for Language Acquisition (CEFR) as a basis for evaluating writing in the context of machine scoring. The CEFR was designed as a framework for evaluating proficiency levels of speaking for the 49 languages comprising the European Union. The intent was to impact language instruction so…

Descriptors: Scoring, Automation, Essays, Language Proficiency

Statistically Comparing the Performance of Multiple Automated Raters across Multiple Items

Peer reviewed

Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017

Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…

Descriptors: Automation, Scoring, Comparative Analysis, Test Items

Validating Human and Automated Scoring of Essays against "True" Scores

Peer reviewed

Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018

In the current study, two pools of 250 essays, all written as a response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to the essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…

Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing

Designing, Evaluating, and Deploying Automated Scoring Systems with Validity in Mind: Methodological Design Decisions

Peer reviewed

Rupp, André A. – Applied Measurement in Education, 2018

This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…

Descriptors: Design, Automation, Scoring, Test Scoring Machines
