Publication Date
| In 2026 | 0 |
| Since 2025 | 2 |
| Since 2022 (last 5 years) | 11 |
| Since 2017 (last 10 years) | 27 |
| Since 2007 (last 20 years) | 54 |
Descriptor
| Comparative Analysis | 73 |
| Reliability | 73 |
| Scoring | 46 |
| Validity | 30 |
| Foreign Countries | 25 |
| Scoring Rubrics | 25 |
| Evaluation Methods | 17 |
| Writing Evaluation | 16 |
| Correlation | 15 |
| Evaluators | 12 |
| Computer Software | 10 |
Education Level
| Higher Education | 19 |
| Postsecondary Education | 17 |
| Secondary Education | 13 |
| Elementary Education | 8 |
| High Schools | 5 |
| Junior High Schools | 4 |
| Middle Schools | 4 |
| Elementary Secondary Education | 3 |
| Grade 7 | 3 |
| Grade 2 | 2 |
| Grade 3 | 2 |
Audience
| Researchers | 1 |
Location
| Australia | 7 |
| United Kingdom (England) | 4 |
| New York | 3 |
| Connecticut | 2 |
| New Hampshire | 2 |
| Rhode Island | 2 |
| Singapore | 2 |
| Vermont | 2 |
| Austria | 1 |
| Canada | 1 |
| China | 1 |
Laws, Policies, & Programs
| Every Student Succeeds Act… | 2 |
Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024
The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…
Descriptors: Accuracy, Reliability, Computational Linguistics, Standards
Dadi Ramesh; Suresh Kumar Sanampudi – European Journal of Education, 2024
Automatic essay scoring (AES) is an essential educational application in natural language processing. This automated process will alleviate the burden by increasing the reliability and consistency of the assessment. With the advances in text embedding libraries and neural network models, AES systems achieved good results in terms of accuracy.…
Descriptors: Scoring, Essays, Writing Evaluation, Memory
Pinot de Moira, Anne; Wheadon, Christopher; Christodoulou, Daisy – Research in Education, 2022
Writing is generally assessed internationally using rubric-based approaches, but there is a growing body of evidence to suggest that the reliability of such approaches is poor. In contrast, comparative judgement studies suggest that it is possible to assess open ended tasks such as writing with greater reliability. Many previous studies, however,…
Descriptors: Writing Evaluation, Classification, Accuracy, Scoring Rubrics
Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024
Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…
Descriptors: Semantics, Educational Assessment, Evaluators, Reliability
Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025
This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics
Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025
Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…
Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction
Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021
Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Paquot, Magali; Rubin, Rachel; Vandeweerd, Nathan – Language Learning, 2022
The main objective of this Methods Showcase Article is to show how the technique of adaptive comparative judgment, coupled with a crowdsourcing approach, can offer practical solutions to reliability issues as well as to address the time and cost difficulties associated with a text-based approach to proficiency assessment in L2 research. We…
Descriptors: Comparative Analysis, Decision Making, Language Proficiency, Reliability
Sims, Maureen E.; Cox, Troy L.; Eckstein, Grant T.; Hartshorn, K. James; Wilcox, Matthew P.; Hart, Judson M. – Educational Measurement: Issues and Practice, 2020
The purpose of this study is to explore the reliability of a potentially more practical approach to direct writing assessment in the context of ESL writing. Traditional rubric rating (RR) is a common yet resource-intensive evaluation practice when performed reliably. This study compared the traditional rubric model of ESL writing assessment and…
Descriptors: Scoring Rubrics, Item Response Theory, Second Language Learning, English (Second Language)
Re-Imagining Narrative Writing and Assessment: A Post-NAPLAN Craft-Based Rubric for Creative Writing
Michael D. Carey; Shelley Davidow; Paul Williams – Australian Journal of Language and Literacy, 2022
According to creative writing pedagogies academic Susanne Gannon (English in Australia, 54(2), 43-56, 2019), and the Federal government-commissioned NAPLAN review (McGaw et al., 2020), NAPLAN has restricted how writing is taught in secondary schools. A NAPLAN-influenced structural approach to teaching writing has subsumed the…
Descriptors: Scoring Rubrics, Creative Writing, Writing Evaluation, National Competency Tests
Greatorex, Jackie; Sutch, Tom; Werno, Magda; Bowyer, Jess; Dunn, Karen – International Journal of Assessment Tools in Education, 2019
Standardisation is a procedure used by Awarding Organisations to maximise marking reliability, by teaching examiners to consistently judge scripts using a mark scheme. However, research shows that people are better at comparing two objects than judging each object individually. Consequently, Oxford, Cambridge and RSA (OCR, a UK awarding…
Descriptors: Reliability, Achievement Rating, Standards, Scoring
El-Freihat, Sara; Al-Shbeil, Abeer – International Journal of Instruction, 2021
The study aimed to investigate the effect of a children's literature-based integrative instructional program on promoting 7th graders' writing skills in Irbid governorate in Jordan. The study sample totaled 87 male and female students selected purposefully. These were randomly assigned to four groups: two experimental groups, the first was…
Descriptors: Teaching Methods, Childrens Literature, Writing Skills, Comparative Analysis
Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022
Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…
Descriptors: Validity, Discourse Analysis, Databases, Scoring
Hau, Flora F.-W.; Wong, Anita M.-Y.; Ng, Megan W.-Y. – Child Language Teaching and Therapy, 2021
Enhanced Conversational Recast (ECR) is an input-based grammatical intervention approach developed from research on statistical learning. Recent research reported evidence demonstrating the efficacy of ECR on the learning of grammatically obligatory morphemes in English-speaking preschool children with developmental language disorder (DLD). This…
Descriptors: Preschool Children, Sino Tibetan Languages, Outcomes of Treatment, Morphemes
Romeo, Marina; Yepes-Baldó, Montserrat; González, Vicenta; Burset, Silvia; Martín, Carolina; Bosch, Emma – International Journal of Instruction, 2022
The assessment process in higher education considers four aspects: assessment agents, procedure, content, and scoring. In this study, we delve into the who. We analyze the role of transversal competence assessment agents in the framework of professional internships in university master's degree programs, comparing the suitability of their…
Descriptors: Internship Programs, Higher Education, Evaluators, Masters Programs

