ERIC - Search Results

Showing 1 to 15 of 73 results

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

Coherence-Based Automatic Short Answer Scoring Using Sentence Embedding

Peer reviewed

Dadi Ramesh; Suresh Kumar Sanampudi – European Journal of Education, 2024

Automatic essay scoring (AES) is an essential educational application in natural language processing. This automated process will alleviate the burden by increasing the reliability and consistency of the assessment. With the advances in text embedding libraries and neural network models, AES systems achieved good results in terms of accuracy.…

Descriptors: Scoring, Essays, Writing Evaluation, Memory

The Classification Accuracy and Consistency of Comparative Judgement of Writing Compared to Rubric-Based Teacher Assessment

Peer reviewed

Pinot de Moira, Anne; Wheadon, Christopher; Christodoulou, Daisy – Research in Education, 2022

Writing is generally assessed internationally using rubric-based approaches, but there is a growing body of evidence to suggest that the reliability of such approaches is poor. In contrast, comparative judgement studies suggest that it is possible to assess open ended tasks such as writing with greater reliability. Many previous studies, however,…

Descriptors: Writing Evaluation, Classification, Accuracy, Scoring Rubrics

A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Peer reviewed

Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024

Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…

Descriptors: Semantics, Educational Assessment, Evaluators, Reliability

Utilizing Large Language Models for EFL Essay Grading: An Examination of Reliability and Validity in Rubric-Based Assessments

Peer reviewed

Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025

This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics

Comparative Judgement for Evaluating Young Learners' EFL Writing Performances: Reliability and Teacher Perceptions of Holistic and Dimension-Based Judgements

Peer reviewed

Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025

Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

More Efficient Processes for Creating Automated Essay Scoring Frameworks: A Demonstration of Two Algorithms

Peer reviewed

Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021

Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…

Descriptors: Scoring, Essays, Writing Evaluation, Computer Software

Crowdsourced Adaptive Comparative Judgment: A Community-Based Solution for Proficiency Rating

Peer reviewed

Paquot, Magali; Rubin, Rachel; Vandeweerd, Nathan – Language Learning, 2022

The main objective of this Methods Showcase Article is to show how the technique of adaptive comparative judgment, coupled with a crowdsourcing approach, can offer practical solutions to reliability issues as well as to address the time and cost difficulties associated with a text-based approach to proficiency assessment in L2 research. We…

Descriptors: Comparative Analysis, Decision Making, Language Proficiency, Reliability

Rubric Rating with MFRM versus Randomly Distributed Comparative Judgment: A Comparison of Two Approaches to Second-Language Writing Assessment

Peer reviewed

Sims, Maureen E.; Cox, Troy L.; Eckstein, Grant T.; Hartshorn, K. James; Wilcox, Matthew P.; Hart, Judson M. – Educational Measurement: Issues and Practice, 2020

The purpose of this study is to explore the reliability of a potentially more practical approach to direct writing assessment in the context of ESL writing. Traditional rubric rating (RR) is a common yet resource-intensive evaluation practice when performed reliably. This study compared the traditional rubric model of ESL writing assessment and…

Descriptors: Scoring Rubrics, Item Response Theory, Second Language Learning, English (Second Language)

Re-Imagining Narrative Writing and Assessment: A Post-NAPLAN Craft-Based Rubric for Creative Writing

Peer reviewed

Michael D. Carey; Shelley Davidow; Paul Williams – Australian Journal of Language and Literacy, 2022

According to creative writing pedagogies academic Susanne Gannon ("English in Australia", 54(2), 43-56, 2019), and the Federal government-commissioned NAPLAN review (McGaw et al., 2020), NAPLAN has restricted how writing is taught in secondary schools. A NAPLAN-influenced structural approach to teaching writing has subsumed the…

Descriptors: Scoring Rubrics, Creative Writing, Writing Evaluation, National Competency Tests

Investigating a New Method for Standardising Essay Marking Using Levels-Based Mark Schemes

Peer reviewed

Greatorex, Jackie; Sutch, Tom; Werno, Magda; Bowyer, Jess; Dunn, Karen – International Journal of Assessment Tools in Education, 2019

Standardisation is a procedure used by Awarding Organisations to maximise marking reliability, by teaching examiners to consistently judge scripts using a mark scheme. However, research shows that people are better at comparing two objects than judging each object individually. Consequently, Oxford, Cambridge and RSA (OCR, a UK awarding…

Descriptors: Reliability, Achievement Rating, Standards, Scoring

Effect of Child Literature Based Integrative Instructional Program on Promoting 7th Graders Writing Skills: An Empirical Study

Peer reviewed

El-Freihat, Sara; Al-Shbeil, Abeer – International Journal of Instruction, 2021

The study aimed to investigate the effect of child literature based integrative instructional program on promoting 7th graders writing skills at Irbid governorate in Jordan. The sample of the study totaled (87) male and female students selected purposefully. These were randomly assigned into four groups, two experimental groups, the first was…

Descriptors: Teaching Methods, Childrens Literature, Writing Skills, Comparative Analysis

Validation of an Automated Procedure for Calculating Core Lexicon from Transcripts

Peer reviewed

Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022

Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…

Descriptors: Validity, Discourse Analysis, Databases, Scoring

Does Enhanced Conversational Recast Promote the Learning of Grammatical Morphemes in Cantonese-Speaking Preschool Children? Answers from a Single-Case Experimental Study

Peer reviewed

Hau, Flora F.-W.; Wong, Anita M.-Y.; Ng, Megan W.-Y. – Child Language Teaching and Therapy, 2021

Enhanced Conversational Recast (ECR) is an input-based grammatical intervention approach developed from research on statistical learning. Recent research reported evidence demonstrating the efficacy of ECR on the learning of grammatically obligatory morphemes in English-speaking preschool children with developmental language disorder (DLD). This…

Descriptors: Preschool Children, Sino Tibetan Languages, Outcomes of Treatment, Morphemes

Assessing Transversal Competences in Professional Internships: The Role of Assessment Agents

Peer reviewed

Romeo, Marina; Yepes-Baldó, Montserrat; González, Vicenta; Burset, Silvia; Martín, Carolina; Bosch, Emma – International Journal of Instruction, 2022

The assessment process in higher education considers four aspects: assessment agents, procedure, content, and scoring. In this study, we delve into the who. We analyze the role of transversal competence assessment agents in the framework of professional internships in university master's degree programs, comparing the suitability of their…

Descriptors: Internship Programs, Higher Education, Evaluators, Masters Programs
