ERIC - Search Results

Showing 1 to 15 of 20 results

Evaluating Quadratic Weighted Kappa as the Standard Performance Metric for Automated Essay Scoring

Peer reviewed

Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023

Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In the existing research work on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…

Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

Comparison of Automatic and Expert Teachers' Rating of Computerized English Listening-Speaking Test

Peer reviewed

Linlin, Cao – English Language Teaching, 2020

Through Many-Facet Rasch analysis, this study explores the rating differences between 1 automatic computer rater and 5 expert teacher raters in scoring 119 students on a computerized English listening-speaking test. Results indicate that both the automatic rater and the teacher raters demonstrate good inter-rater reliability, though the automatic rater…

Descriptors: Language Tests, Computer Assisted Testing, English (Second Language), Second Language Learning

WordBytes: Exploring an Intermediate Constraint Format for Rapid Classification of Student Answers on Constructed Response Assessments

Peer reviewed

Kim, Kerry J.; Meir, Eli; Pope, Denise S.; Wendel, Daniel – Journal of Educational Data Mining, 2017

Computerized classification of student answers offers the possibility of instant feedback and improved learning. Open response (OR) questions provide greater insight into student thinking and understanding than more constrained multiple choice (MC) questions, but development of automated classifiers is more difficult, often requiring training a…

Descriptors: Classification, Computer Assisted Testing, Multiple Choice Tests, Test Format

Development of a Rubric to Assess Academic Writing Incorporating Plagiarism Detectors

Peer reviewed

Razi, Salim – SAGE Open, 2015

Similarity reports of plagiarism detectors should be approached with caution as they may not be sufficient to support allegations of plagiarism. This study developed a 50-item rubric to simplify and standardize evaluation of academic papers. In the spring semester of the 2011-2012 academic year, 161 freshmen's papers at the English Language Teaching…

Descriptors: Foreign Countries, Scoring Rubrics, Writing Evaluation, Writing (Composition)

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

Can Machine Scoring Deal with Broad and Open Writing Tests as Well as Human Readers?

Peer reviewed

McCurry, Doug – Assessing Writing, 2010

This article considers the claim that machine scoring of writing test responses agrees with human readers as much as humans agree with other humans. These claims about the reliability of machine scoring of writing are usually based on specific and constrained writing tasks, and there is reason for asking whether machine scoring of writing requires…

Descriptors: Writing Tests, Scoring, Interrater Reliability, Computer Assisted Testing

Marking Student Programs Using Graph Similarity

Peer reviewed

Naude, Kevin A.; Greyling, Jean H.; Vogts, Dieter – Computers & Education, 2010

We present a novel approach to the automated marking of student programming assignments. Our technique quantifies the structural similarity between unmarked student submissions and marked solutions, and is the basis by which we assign marks. This is accomplished through an efficient novel graph similarity measure ("AssignSim"). Our experiments…

Descriptors: Grading, Assignments, Correlation, Interrater Reliability

Automated Formative Assessment as a Tool to Scaffold Student Documentary Writing

Peer reviewed

Ferster, Bill; Hammond, Thomas C.; Alexander, R. Curby; Lyman, Hunt – Journal of Interactive Learning Research, 2012

The hurried pace of the modern classroom does not permit formative feedback on writing assignments at the frequency or quality recommended by the research literature. One solution for increasing individual feedback to students is to incorporate some form of computer-generated assessment. This study explores the use of automated assessment of…

Descriptors: Feedback (Response), Scripts, Formative Evaluation, Essays

Factors that Influence Fast Mapping in Children Exposed to Spanish and English

Peer reviewed

Alt, Mary; Meyers, Christina; Figueroa, Cecilia – Journal of Speech, Language, and Hearing Research, 2013

Purpose: The purpose of this study was to determine whether children exposed to 2 languages would benefit from the phonotactic probability cues of a single language in the same way as monolingual peers and to determine whether crosslinguistic influence would be present in a fast-mapping task. Method: Two groups of typically developing children…

Descriptors: Regression (Statistics), Spanish, Cues, Task Analysis

Typing Compared with Handwriting for Essay Examinations at University: Letting the Students Choose

Peer reviewed

Mogey, Nora; Paterson, Jessie; Burk, John; Purcell, Michael – ALT-J: Research in Learning Technology, 2010

Students at the University of Edinburgh do almost all their work on computers, but at the end of the semester they are examined by handwritten essays. Intuitively it would be appealing to allow students the choice of handwriting or typing, but this raises a concern that perhaps this might not be "fair"--that the choice a student makes,…

Descriptors: Handwriting, Essay Tests, Interrater Reliability, Grading

Experimenting with a Computer Essay-Scoring Program Based on ESL Student Writing Scripts

Peer reviewed

Coniam, David – ReCALL, 2009

This paper describes a study of the computer essay-scoring program BETSY. While the use of computers in rating written scripts has been criticised in some quarters for lacking transparency or lack of fit with how human raters rate written scripts, a number of essay rating programs are available commercially, many of which claim to offer comparable…

Descriptors: Writing Tests, Scoring, Foreign Countries, Interrater Reliability

Speech Recognition Software for Language Learning: Toward an Evaluation of Validity and Student Perceptions

Cordier, Deborah – ProQuest LLC, 2009

A renewed focus on foreign language (FL) learning and speech for communication has resulted in computer-assisted language learning (CALL) software developed with Automatic Speech Recognition (ASR). ASR features for FL pronunciation (Lafford, 2004) are functional components of CALL designs used for FL teaching and learning. The ASR features…

Descriptors: Feedback (Response), Computer Assisted Instruction, Validity, Computer Software

Peering into Large Lectures: Examining Peer and Expert Mark Agreement Using peerScholar, an Online Peer Assessment Tool

Peer reviewed

Pare, D. E.; Joordens, S. – Journal of Computer Assisted Learning, 2008

As class sizes increase, methods of assessments shift from costly traditional approaches (e.g. expert-graded writing assignments) to more economic and logistically feasible methods (e.g. multiple-choice testing, computer-automated scoring, or peer assessment). While each method of assessment has its merits, it is peer assessment in particular,…

Descriptors: Writing Assignments, Undergraduate Students, Teaching Assistants, Peer Evaluation

Computer Grading of Student Prose, Using Modern Concepts and Software.

Peer reviewed

Page, Ellis Batten – Journal of Experimental Education, 1994

National Assessment of Educational Progress writing sample essays from 1988 and 1990 (495 and 599 essays) were subjected to computerized grading and human ratings. Cross-validation suggests that computer scoring is superior to a two-judge panel, a finding encouraging for large programs of essay evaluation. (SLD)

Descriptors: Computer Assisted Testing, Computer Software, Essays, Evaluation Methods
