Yubin Xu; Lin Liu; Jianwen Xiong; Guangtian Zhu – Journal of Baltic Science Education, 2025
As the development and application of large language models (LLMs) in physics education progress, the well-known AI chatbot ChatGPT-4 has presented numerous opportunities for educational assessment. Investigating the potential of AI tools for practical educational assessment is therefore of considerable importance. This study explored the comparative…
Descriptors: Physics, Artificial Intelligence, Computer Software, Accuracy
Jiyeo Yun – English Teaching, 2023
Studies of automatic scoring systems in writing assessment have evaluated the relationship between human and machine scores to establish the reliability of automated essay scoring systems. This study investigated the magnitudes of indices of inter-rater agreement and discrepancy, particularly between human and machine scoring, in writing assessment…
Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring
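The agreement and discrepancy indices such studies compare can be illustrated with a short sketch. The snippet below computes exact agreement, adjacent agreement, Pearson's r, and quadratically weighted kappa for a pair of hypothetical human and machine score vectors; the data and the 1-6 scale are illustrative assumptions, not figures from the study.

```python
# Common human-machine agreement indices on hypothetical 1-6 essay scores:
# exact/adjacent agreement, Pearson's r, and quadratically weighted kappa.
import numpy as np
from sklearn.metrics import cohen_kappa_score

human = np.array([4, 3, 5, 2, 4, 6, 3, 5, 4, 2])    # hypothetical rater scores
machine = np.array([4, 3, 4, 2, 5, 6, 3, 5, 3, 2])  # hypothetical engine scores

exact = np.mean(human == machine)                   # identical scores
adjacent = np.mean(np.abs(human - machine) <= 1)    # within one score point
r = np.corrcoef(human, machine)[0, 1]               # linear association
qwk = cohen_kappa_score(human, machine, weights="quadratic")  # chance-corrected

print(f"exact={exact:.2f} adjacent={adjacent:.2f} r={r:.2f} QWK={qwk:.2f}")
```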
Derham, Cathrine; Balloo, Kieran; Winstone, Naomi – Assessment & Evaluation in Higher Education, 2022
In-text comments, in the form of annotations on students' work, are a form of feedback information that should guide students to take action. Both the focus of in-text comments and the ways in which they are linguistically communicated can affect how students perceive them. This study reports on an…
Descriptors: Feedback (Response), Content Analysis, Essays, Summative Evaluation
LaVoie, Noelle; Parker, James; Legree, Peter J.; Ardison, Sharon; Kilcullen, Robert N. – Educational and Psychological Measurement, 2020
Automated scoring based on Latent Semantic Analysis (LSA) has been successfully used to score essays and constrained short-answer responses. Scoring tests that capture open-ended, short-answer responses poses some challenges for machine learning approaches. We used LSA techniques to score short-answer responses to the Consequences Test, a measure…
Descriptors: Semantics, Evaluators, Essays, Scoring
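A minimal sketch of LSA-style scoring, not the authors' operational pipeline: responses are embedded in a latent semantic space via TF-IDF and truncated SVD, and a new answer is scored by its cosine similarity to reference high-scoring responses. All texts, and the scoring-by-similarity rule, are assumptions for illustration.

```python
# LSA-style scoring sketch: embed texts in a latent semantic space
# (TF-IDF + truncated SVD), then score a new response by its cosine
# similarity to known high-scoring reference answers.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

references = [  # hypothetical high-scoring short answers
    "fewer cars would reduce pollution and traffic in cities",
    "public transport use would rise and air quality would improve",
]
response = ["pollution would drop because people drive less"]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(references + response)

svd = TruncatedSVD(n_components=2, random_state=0)  # latent semantic space
Z = svd.fit_transform(X)

# Score = similarity of the new response to the nearest reference answer.
score = cosine_similarity(Z[-1:], Z[:-1]).max()
print(f"LSA similarity score: {score:.2f}")
```

Operational systems train the semantic space on far larger corpora; with only a handful of documents the space is degenerate, so this serves only to show the moving parts.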
Setyowati, Lestari; Sukmawan, Sony; El-Sulukiyyah, Ana Ahsana – International Journal of Language Education, 2020
Assessing writing is a demanding task: without a reliable scoring rubric, a lecturer may never learn students' real performance. One well-known English as a second language (ESL) writing rubric is the Jacobs ESL Composition Profile, developed by Jacobs, Zingraf, Wormuth, Hartfiel, & Hughey in…
Descriptors: Second Language Learning, Second Language Instruction, English (Second Language), Writing Evaluation
Yun, Jiyeo – ProQuest LLC, 2017
Since researchers began investigating automatic scoring systems in writing assessment, they have examined the relationship between human and machine scoring and proposed evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…
Descriptors: Interrater Reliability, Essays, Scoring, Evaluators
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
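One common way to aggregate such pairwise decisions into scores is a Bradley-Terry model, sketched below with the classic minorization-maximization update. The judgment data are hypothetical, and the abstract does not confirm that this is the authors' aggregation method.

```python
# Aggregate comparative-judgment decisions into essay "strengths" with a
# Bradley-Terry model, fitted by the MM update:
#   strength_i <- wins_i / sum_j pairs_ij / (strength_i + strength_j)
from collections import defaultdict

# Hypothetical (winner, loser) judgments over three essays.
judgments = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "B"), ("A", "B"), ("B", "C")]

essays = sorted({e for pair in judgments for e in pair})
wins = defaultdict(int)
pairs = defaultdict(int)  # comparison counts per unordered pair
for winner, loser in judgments:
    wins[winner] += 1
    pairs[frozenset((winner, loser))] += 1

strength = {e: 1.0 for e in essays}
for _ in range(100):  # MM iterations
    updated = {}
    for i in essays:
        denom = sum(pairs[frozenset((i, j))] / (strength[i] + strength[j])
                    for j in essays if j != i)
        updated[i] = wins[i] / denom if denom else strength[i]
    total = sum(updated.values())
    strength = {e: v / total for e, v in updated.items()}  # normalize

print({e: round(v, 3) for e, v in strength.items()})
```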
Skalicky, Stephen; Berger, Cynthia M.; Crossley, Scott A.; McNamara, Danielle S. – Advances in Language and Literary Studies, 2016
A corpus of 313 freshman college essays was analyzed in order to better understand the forms and functions of humor in academic writing. Human ratings of humor and wordplay were statistically aggregated using factor analysis to provide an overall "Humor" component score for each essay in the corpus. In addition, the essays were…
Descriptors: Discourse Analysis, Academic Discourse, Humor, Writing (Composition)
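A sketch of that aggregation step, assuming per-dimension human ratings are collapsed into one component score per essay with factor analysis; the rating matrix is invented for illustration.

```python
# Collapse multiple human rating dimensions into a single component
# score per essay via factor analysis.
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Rows = essays, columns = rated dimensions (e.g., humor, wordplay);
# values are hypothetical mean ratings.
ratings = np.array([[4.0, 3.5], [1.0, 1.5], [5.0, 4.5], [2.0, 2.5], [3.0, 3.0]])

fa = FactorAnalysis(n_components=1, random_state=0)
humor_component = fa.fit_transform(ratings).ravel()  # one score per essay
print(np.round(humor_component, 2))
```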
Powers, Donald E.; Escoffery, David S.; Duchnowski, Matthew P. – Applied Measurement in Education, 2015
By far, the most frequently used method of validating (the interpretation and use of) automated essay scores has been to compare them with scores awarded by human raters. Although this practice is questionable, human-machine agreement is still often regarded as the "gold standard." Our objective was to refine this model and apply it to…
Descriptors: Essays, Test Scoring Machines, Program Validation, Criterion Referenced Tests
Liu, Sha; Kunnan, Antony John – CALICO Journal, 2016
This study investigated the application of "WriteToLearn" to Chinese undergraduate English majors' essays, in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was…
Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning
Jarvis, Scott – Language Testing, 2017
The present study discusses the relevance of measures of lexical diversity (LD) to the assessment of learner corpora. It also argues that existing measures of LD, many of which have become specialized for use with language corpora, are fundamentally measures of lexical repetition, are based on an etic perspective of language, and lack construct…
Descriptors: Computational Linguistics, English (Second Language), Second Language Learning, Native Speakers
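Two of the standard lexical diversity measures this line of work critiques are easy to state in code: the raw type-token ratio (TTR) and a moving-average variant (MATTR) that dampens TTR's well-known sensitivity to text length. The example text and window size are invented.

```python
# Two standard lexical-diversity measures over a tokenized text.
def ttr(tokens):
    """Types divided by tokens: higher means less lexical repetition."""
    return len(set(tokens)) / len(tokens)

def mattr(tokens, window=5):
    """Mean TTR over a sliding window, reducing length sensitivity."""
    if len(tokens) <= window:
        return ttr(tokens)
    windows = [tokens[i:i + window] for i in range(len(tokens) - window + 1)]
    return sum(ttr(w) for w in windows) / len(windows)

text = "the cat sat on the mat and the dog sat on the rug".split()
print(f"TTR={ttr(text):.2f}  MATTR={mattr(text):.2f}")
```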
Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015
The "e-rater"® automated essay scoring system is used operationally in the scoring of "TOEFL iBT"® independent and integrated tasks. In this study we explored the psychometric added value of reporting four trait scores for each of these two tasks, beyond the total e-rater score.The four trait scores are word choice, grammatical…
Descriptors: Writing Tests, Scores, Language Tests, English (Second Language)
Müller, Amanda – Higher Education Research and Development, 2015
This paper attempts to demonstrate the differences in writing between International English Language Testing System (IELTS) bands 6.0, 6.5 and 7.0. An analysis of exemplars provided by the IELTS test makers reveals that IELTS 6.0, 6.5 and 7.0 writers can make a minimum of 206, 96 and 35 errors per 1,000 words, respectively. The following section…
Descriptors: English (Second Language), Second Language Learning, Language Tests, Scores
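The per-1,000-word normalization behind these figures is simple arithmetic, sketched below with invented counts; the result happens to land near the band-7.0 exemplar rate reported above.

```python
# Normalize a raw error count to errors per 1,000 words.
def errors_per_1000(errors, words):
    return errors / words * 1000

# A hypothetical 250-word response containing 9 errors:
print(f"{errors_per_1000(9, 250):.0f} errors per 1,000 words")  # 36
```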
Roscoe, Rod D.; Crossley, Scott A.; Snow, Erica L.; Varner, Laura K.; McNamara, Danielle S. – Grantee Submission, 2014
Automated essay scoring tools are often criticized on the basis of construct validity. Specifically, it has been argued that computational scoring algorithms may be misaligned with higher-level indicators of quality writing, such as writers' demonstrated knowledge and understanding of the essay topics. In this paper, we consider how and whether the…
Descriptors: Correlation, Essays, Scoring, Writing Evaluation
Hoang, Giang Thi Linh; Kunnan, Antony John – Language Assessment Quarterly, 2016
Computer technology made its way into writing instruction and assessment with spelling and grammar checkers decades ago, and more recently with automated essay evaluation (AEE) and diagnostic feedback. Although many programs and tools have been developed in the last decade, not enough research has been conducted to support or…
Descriptors: Case Studies, Essays, Writing Evaluation, English (Second Language)