Wang, Jue; Engelhard, George; Combs, Trenton – Journal of Experimental Education, 2023
Unfolding models are frequently used to develop scales for measuring attitudes. Recently, unfolding models have been applied to examine rater severity and accuracy within the context of rater-mediated assessments. One of the problems in applying unfolding models to rater-mediated assessments is that the substantive interpretations of the latent…
Descriptors: Writing Evaluation, Scoring, Accuracy, Computational Linguistics
Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023
Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In existing research on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…
Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy
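The human–automated agreement benchmark mentioned above is commonly quantified with quadratic weighted kappa (QWK), which rewards agreement beyond chance and penalizes large score disagreements more heavily than small ones. A minimal sketch, using invented toy scores rather than data from the study:

```python
from collections import Counter

def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Agreement between two raters on an ordinal score scale,
    penalizing disagreements quadratically (1.0 = perfect, 0.0 = chance)."""
    n = max_score - min_score + 1
    total = len(human)
    # Observed joint counts of (human, machine) score pairs.
    observed = [[0] * n for _ in range(n)]
    for h, m in zip(human, machine):
        observed[h - min_score][m - min_score] += 1
    # Marginal counts, used to build the chance-agreement (expected) matrix.
    h_marg = Counter(h - min_score for h in human)
    m_marg = Counter(m - min_score for m in machine)
    num = den = 0.0
    for i in range(n):
        for j in range(n):
            w = (i - j) ** 2 / (n - 1) ** 2  # quadratic disagreement penalty
            num += w * observed[i][j]
            den += w * h_marg[i] * m_marg[j] / total
    return 1.0 - num / den

# Invented toy scores on a 1-4 scale: two small disagreements of one point.
human_scores = [1, 2, 3, 4, 4, 2, 3, 1]
machine_scores = [1, 2, 3, 4, 3, 2, 3, 2]
print(round(quadratic_weighted_kappa(human_scores, machine_scores, 1, 4), 3))  # → 0.875
```

Because the penalty grows with the square of the score gap, a machine score one point off costs far less than one three points off, which matches how scoring rubrics treat near-misses.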
Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021
Automated essay scoring (AES) has emerged as a secondary or sole marker for many high-stakes educational assessments, in both native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep neural networks. The purpose of this study is to compare the effectiveness…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Bejar, Isaac I.; Li, Chen; McCaffrey, Daniel – Applied Measurement in Education, 2020
We evaluate the feasibility of developing predictive models of rater behavior, that is, "rater-specific" models for predicting the scores produced by a rater under operational conditions. In the present study, the dependent variable is the score assigned to essays by a rater, and the predictors are linguistic attributes of the essays…
Descriptors: Scoring, Essays, Behavior, Predictive Measurement
Wu, Xuefeng – English Language Teaching, 2022
Rating scales for writing assessment are critical in that they directly determine the quality and fairness of such performance tests. However, in many EFL contexts, rating scales are constructed, to a certain extent, from teachers' intuition, and teachers need a feasible, principled route to guide their construction of rating scales. This study…
Descriptors: Writing Evaluation, Rating Scales, Second Language Learning, Second Language Instruction
Rupp, André A.; Casabianca, Jodi M.; Krüger, Maleika; Keller, Stefan; Köller, Olaf – ETS Research Report Series, 2019
In this research report, we describe the design and empirical findings for a large-scale study of essay writing ability with approximately 2,500 high school students in Germany and Switzerland on the basis of 2 tasks with 2 associated prompts, each from a standardized writing assessment whose scoring involved both human and automated components.…
Descriptors: Automation, Foreign Countries, English (Second Language), Language Tests
Grainger, Peter R.; Christie, Michael; Carey, Michael – Journal of University Teaching and Learning Practice, 2019
Written communication skills are one of the most assessed criteria in higher education contexts, especially in humanities disciplines, including teacher education. There is a need to research and develop an assessment grading tool (i.e. criteria sheet or rubric) that would assist students in pre-service teacher education programs to better…
Descriptors: Writing Skills, Communication Skills, Models, Preservice Teachers
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests