Yubin Xu; Lin Liu; Jianwen Xiong; Guangtian Zhu – Journal of Baltic Science Education, 2025
As the development and application of large language models (LLMs) in physics education progress, the well-known AI chatbot ChatGPT-4 has presented numerous opportunities for educational assessment. Investigating the potential of AI tools for practical educational assessment is therefore of considerable importance. This study explored the comparative…
Descriptors: Physics, Artificial Intelligence, Computer Software, Accuracy
Jiyeo Yun – English Teaching, 2023
Studies of automatic scoring systems in writing assessment have evaluated the relationship between human and machine scores to establish the reliability of automated essay scoring systems. This study investigated the magnitudes of indices of inter-rater agreement and discrepancy, particularly between human and machine scoring, in writing assessment…
Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring
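The agreement and discrepancy indices such studies compare can be illustrated with a short sketch. The snippet below computes exact agreement, adjacent agreement, Pearson's r, and quadratically weighted kappa for a pair of hypothetical human and machine score vectors; the data and the 1-6 scale are illustrative assumptions, not figures from the study.

```python
# Common human-machine agreement indices on hypothetical 1-6 essay scores:
# exact/adjacent agreement, Pearson's r, and quadratically weighted kappa.
import numpy as np
from sklearn.metrics import cohen_kappa_score

human = np.array([4, 3, 5, 2, 4, 6, 3, 5, 4, 2])    # hypothetical rater scores
machine = np.array([4, 3, 4, 2, 5, 6, 3, 5, 3, 2])  # hypothetical engine scores

exact = np.mean(human == machine)                   # identical scores
adjacent = np.mean(np.abs(human - machine) <= 1)    # within one score point
r = np.corrcoef(human, machine)[0, 1]               # linear association
qwk = cohen_kappa_score(human, machine, weights="quadratic")  # chance-corrected

print(f"exact={exact:.2f} adjacent={adjacent:.2f} r={r:.2f} QWK={qwk:.2f}")
```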
Derham, Cathrine; Balloo, Kieran; Winstone, Naomi – Assessment & Evaluation in Higher Education, 2022
In-text comments, in the form of annotations on students' work, are a form of feedback information that should guide students to take action. Both the focus of in-text comments and the ways in which they are linguistically communicated can affect how students perceive them. This study reports on an…
Descriptors: Feedback (Response), Content Analysis, Essays, Summative Evaluation
LaVoie, Noelle; Parker, James; Legree, Peter J.; Ardison, Sharon; Kilcullen, Robert N. – Educational and Psychological Measurement, 2020
Automated scoring based on Latent Semantic Analysis (LSA) has been successfully used to score essays and constrained short-answer responses. Scoring tests that capture open-ended, short-answer responses poses some challenges for machine learning approaches. We used LSA techniques to score short-answer responses to the Consequences Test, a measure…
Descriptors: Semantics, Evaluators, Essays, Scoring
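A minimal sketch of LSA-style scoring, not the authors' operational pipeline: responses are embedded in a latent semantic space via TF-IDF and truncated SVD, and a new answer is scored by its cosine similarity to reference high-scoring responses. All texts, and the scoring-by-similarity rule, are assumptions for illustration.

```python
# LSA-style scoring sketch: embed texts in a latent semantic space
# (TF-IDF + truncated SVD), then score a new response by its cosine
# similarity to known high-scoring reference answers.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

references = [  # hypothetical high-scoring short answers
    "fewer cars would reduce pollution and traffic in cities",
    "public transport use would rise and air quality would improve",
]
response = ["pollution would drop because people drive less"]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(references + response)

svd = TruncatedSVD(n_components=2, random_state=0)  # latent semantic space
Z = svd.fit_transform(X)

# Score = similarity of the new response to the nearest reference answer.
score = cosine_similarity(Z[-1:], Z[:-1]).max()
print(f"LSA similarity score: {score:.2f}")
```

Operational systems train the semantic space on far larger corpora; with only a handful of documents the space is degenerate, so this serves only to show the moving parts.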
Setyowati, Lestari; Sukmawan, Sony; El-Sulukiyyah, Ana Ahsana – International Journal of Language Education, 2020
Assessing writing is a demanding task: without a reliable scoring rubric, a lecturer may never learn students' real performance. One well-known English as a second language (ESL) writing rubric is the Jacobs ESL Composition Profile, developed by Jacobs, Zingraf, Wormuth, Hartfiel, & Hughey in…
Descriptors: Second Language Learning, Second Language Instruction, English (Second Language), Writing Evaluation
Yun, Jiyeo – ProQuest LLC, 2017
Since researchers began investigating automatic scoring systems in writing assessment, they have examined the relationship between human and machine scoring and proposed evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…
Descriptors: Interrater Reliability, Essays, Scoring, Evaluators
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
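One common way to aggregate such pairwise decisions into scores is a Bradley-Terry model, sketched below with the classic minorization-maximization update. The judgment data are hypothetical, and the abstract does not confirm that this is the authors' aggregation method.

```python
# Aggregate comparative-judgment decisions into essay "strengths" with a
# Bradley-Terry model, fitted by the MM update:
#   strength_i <- wins_i / sum_j pairs_ij / (strength_i + strength_j)
from collections import defaultdict

# Hypothetical (winner, loser) judgments over three essays.
judgments = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "B"), ("A", "B"), ("B", "C")]

essays = sorted({e for pair in judgments for e in pair})
wins = defaultdict(int)
pairs = defaultdict(int)  # comparison counts per unordered pair
for winner, loser in judgments:
    wins[winner] += 1
    pairs[frozenset((winner, loser))] += 1

strength = {e: 1.0 for e in essays}
for _ in range(100):  # MM iterations
    updated = {}
    for i in essays:
        denom = sum(pairs[frozenset((i, j))] / (strength[i] + strength[j])
                    for j in essays if j != i)
        updated[i] = wins[i] / denom if denom else strength[i]
    total = sum(updated.values())
    strength = {e: v / total for e, v in updated.items()}  # normalize

print({e: round(v, 3) for e, v in strength.items()})
```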
Skalicky, Stephen; Berger, Cynthia M.; Crossley, Scott A.; McNamara, Danielle S. – Advances in Language and Literary Studies, 2016
A corpus of 313 freshman college essays was analyzed in order to better understand the forms and functions of humor in academic writing. Human ratings of humor and wordplay were statistically aggregated using factor analysis to provide an overall "Humor" component score for each essay in the corpus. In addition, the essays were…
Descriptors: Discourse Analysis, Academic Discourse, Humor, Writing (Composition)
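A sketch of that aggregation step, assuming per-dimension human ratings are collapsed into one component score per essay with factor analysis; the rating matrix is invented for illustration.

```python
# Collapse multiple human rating dimensions into a single component
# score per essay via factor analysis.
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Rows = essays, columns = rated dimensions (e.g., humor, wordplay);
# values are hypothetical mean ratings.
ratings = np.array([[4.0, 3.5], [1.0, 1.5], [5.0, 4.5], [2.0, 2.5], [3.0, 3.0]])

fa = FactorAnalysis(n_components=1, random_state=0)
humor_component = fa.fit_transform(ratings).ravel()  # one score per essay
print(np.round(humor_component, 2))
```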
Powers, Donald E.; Escoffery, David S.; Duchnowski, Matthew P. – Applied Measurement in Education, 2015
By far, the most frequently used method of validating (the interpretation and use of) automated essay scores has been to compare them with scores awarded by human raters. Although this practice is questionable, human-machine agreement is still often regarded as the "gold standard." Our objective was to refine this model and apply it to…
Descriptors: Essays, Test Scoring Machines, Program Validation, Criterion Referenced Tests
Liu, Sha; Kunnan, Antony John – CALICO Journal, 2016
This study investigated the application of "WriteToLearn" to Chinese undergraduate English majors' essays, in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was…
Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning
Jarvis, Scott – Language Testing, 2017
The present study discusses the relevance of measures of lexical diversity (LD) to the assessment of learner corpora. It also argues that existing measures of LD, many of which have become specialized for use with language corpora, are fundamentally measures of lexical repetition, are based on an etic perspective of language, and lack construct…
Descriptors: Computational Linguistics, English (Second Language), Second Language Learning, Native Speakers
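Two of the standard lexical diversity measures this line of work critiques are easy to state in code: the raw type-token ratio (TTR) and a moving-average variant (MATTR) that dampens TTR's well-known sensitivity to text length. The example text and window size are invented.

```python
# Two standard lexical-diversity measures over a tokenized text.
def ttr(tokens):
    """Types divided by tokens: higher means less lexical repetition."""
    return len(set(tokens)) / len(tokens)

def mattr(tokens, window=5):
    """Mean TTR over a sliding window, reducing length sensitivity."""
    if len(tokens) <= window:
        return ttr(tokens)
    windows = [tokens[i:i + window] for i in range(len(tokens) - window + 1)]
    return sum(ttr(w) for w in windows) / len(windows)

text = "the cat sat on the mat and the dog sat on the rug".split()
print(f"TTR={ttr(text):.2f}  MATTR={mattr(text):.2f}")
```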
Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015
The "e-rater"® automated essay scoring system is used operationally in the scoring of "TOEFL iBT"® independent and integrated tasks. In this study we explored the psychometric added value of reporting four trait scores for each of these two tasks, beyond the total e-rater score.The four trait scores are word choice, grammatical…
Descriptors: Writing Tests, Scores, Language Tests, English (Second Language)
Müller, Amanda – Higher Education Research and Development, 2015
This paper attempts to demonstrate the differences in writing between International English Language Testing System (IELTS) bands 6.0, 6.5 and 7.0. An analysis of exemplars provided by the IELTS test makers reveals that IELTS 6.0, 6.5 and 7.0 writers can make a minimum of 206, 96 and 35 errors per 1,000 words, respectively. The following section…
Descriptors: English (Second Language), Second Language Learning, Language Tests, Scores
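The per-1,000-word normalization behind these figures is simple arithmetic, sketched below with invented counts; the result happens to land near the band-7.0 exemplar rate reported above.

```python
# Normalize a raw error count to errors per 1,000 words.
def errors_per_1000(errors, words):
    return errors / words * 1000

# A hypothetical 250-word response containing 9 errors:
print(f"{errors_per_1000(9, 250):.0f} errors per 1,000 words")  # 36
```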
Roscoe, Rod D.; Crossley, Scott A.; Snow, Erica L.; Varner, Laura K.; McNamara, Danielle S. – Grantee Submission, 2014
Automated essay scoring tools are often criticized on the basis of construct validity. Specifically, it has been argued that computational scoring algorithms may be misaligned with higher-level indicators of quality writing, such as writers' demonstrated knowledge and understanding of the essay topics. In this paper, we consider how and whether the…
Descriptors: Correlation, Essays, Scoring, Writing Evaluation
Hoang, Giang Thi Linh; Kunnan, Antony John – Language Assessment Quarterly, 2016
Computer technology made its way into writing instruction and assessment with spelling and grammar checkers decades ago, and more recently with automated essay evaluation (AEE) and diagnostic feedback. Although many programs and tools have been developed in the last decade, not enough research has been conducted to support or…
Descriptors: Case Studies, Essays, Writing Evaluation, English (Second Language)