ERIC - Search Results

Showing all 15 results

Meta-Analysis of Inter-Rater Agreement and Discrepancy Between Human and Automated English Essay Scoring

Peer reviewed

Jiyeo Yun – English Teaching, 2023

Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…

Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

Comparison of Automatic and Expert Teachers' Rating of Computerized English Listening-Speaking Test

Peer reviewed

Linlin, Cao – English Language Teaching, 2020

Through Many-Facet Rasch analysis, this study explores the rating differences between 1 computer automatic rater and 5 expert teacher raters on scoring 119 students in a computerized English listening-speaking test. Results indicate that both automatic and the teacher raters demonstrate good inter-rater reliability, though the automatic rater…

Descriptors: Language Tests, Computer Assisted Testing, English (Second Language), Second Language Learning

The Effects of Proficiency and Study-Abroad on Chinese EFL Learners' Refusals

Peer reviewed

Wang, Yuqi; Ren, Wei – Language Learning Journal, 2022

Research in L2 pragmatics has explored the effects of different factors on different aspects of learners' pragmatic performance, but often not simultaneously. In addition, syntactic complexity is rarely examined in L2 pragmatics. This cross-sectional study aimed to conduct a multidimensional analysis to explore the effects of proficiency and study-abroad…

Descriptors: Pragmatics, Second Language Learning, Second Language Instruction, English (Second Language)

The Effect of Training and Rater Differences on Oral Proficiency Assessment

Peer reviewed

Kang, Okim; Rubin, Don; Kermad, Alyssa – Language Testing, 2019

As a result of the fact that judgments of non-native speech are closely tied to social biases, oral proficiency ratings are susceptible to error because of rater background and social attitudes. In the present study we seek first to estimate the variance attributable to rater background and attitudinal variables on novice raters' assessments of L2…

Descriptors: Evaluators, Second Language Learning, Language Tests, English (Second Language)

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

Marking Essays on Screen: An Investigation into the Reliability of Marking Extended Subjective Texts

Peer reviewed

Johnson, Martin; Nadas, Rita; Bell, John F. – British Journal of Educational Technology, 2010

There is a growing body of research literature that considers how the mode of assessment, either computer-based or paper-based, might affect candidates' performances. Despite this, there is a fairly narrow literature that shifts the focus of attention to those making assessment judgements and which considers issues of assessor consistency when…

Descriptors: English Literature, Examiners, Evaluation Research, Evaluators

Factors that Influence Fast Mapping in Children Exposed to Spanish and English

Peer reviewed

Alt, Mary; Meyers, Christina; Figueroa, Cecilia – Journal of Speech, Language, and Hearing Research, 2013

Purpose: The purpose of this study was to determine whether children exposed to 2 languages would benefit from the phonotactic probability cues of a single language in the same way as monolingual peers and to determine whether crosslinguistic influence would be present in a fast-mapping task. Method: Two groups of typically developing children…

Descriptors: Regression (Statistics), Spanish, Cues, Task Analysis

Typing Compared with Handwriting for Essay Examinations at University: Letting the Students Choose

Peer reviewed

Mogey, Nora; Paterson, Jessie; Burk, John; Purcell, Michael – ALT-J: Research in Learning Technology, 2010

Students at the University of Edinburgh do almost all their work on computers, but at the end of the semester they are examined by handwritten essays. Intuitively it would be appealing to allow students the choice of handwriting or typing, but this raises a concern that perhaps this might not be "fair"--that the choice a student makes,…

Descriptors: Handwriting, Essay Tests, Interrater Reliability, Grading

A Comparison of Onscreen and Paper-Based Marking in the Hong Kong Public Examination System

Peer reviewed

Coniam, David – Educational Research and Evaluation, 2009

This paper describes a study comparing paper-based marking (PBM) and onscreen marking (OSM) in Hong Kong utilising English language essay scripts drawn from the live 2007 Hong Kong Certificate of Education Examination (HKCEE) Year 11 English Language Writing Paper. In the study, 30 raters from the 2007 HKCEE Writing Paper marked on paper 100…

Descriptors: Student Attitudes, Foreign Countries, Essays, Comparative Analysis

Experimenting with a Computer Essay-Scoring Program Based on ESL Student Writing Scripts

Peer reviewed

Coniam, David – ReCALL, 2009

This paper describes a study of the computer essay-scoring program BETSY. While the use of computers in rating written scripts has been criticised in some quarters for lacking transparency or lack of fit with how human raters rate written scripts, a number of essay rating programs are available commercially, many of which claim to offer comparable…

Descriptors: Writing Tests, Scoring, Foreign Countries, Interrater Reliability

Psychometric Properties of Student Ratings of Instruction in Online and On-Campus Courses

Peer reviewed

McGhee, Debbie E.; Lowell, Nana – New Directions for Teaching and Learning, 2003

This study compares mean ratings, inter-rater reliabilities, and the factor structure of items for online and paper student-rating forms from the University of Washington's Instructional Assessment System. (Contains 3 figures and 2 tables.)

Descriptors: Psychometrics, Factor Structure, Student Evaluation of Teacher Performance, Test Items

A Comparative Study of ESL Writers' Performance in a Paper-Based and a Computer-Delivered Writing Test

Peer reviewed

Lee, H. K. – Assessing Writing, 2004

This study aimed to comprehensively investigate the impact of a word-processor on an ESL writing assessment, covering comparison of inter-rater reliability, the quality of written products, the writing process across different testing occasions using different writing media, and students' perception of a computer-delivered test. Writing samples of…

Descriptors: Writing Evaluation, Student Attitudes, Writing Tests, Testing

Evaluating Computer Automated Scoring: Issues, Methods, and an Empirical Illustration

Peer reviewed

Yang, Yongwei; Buckendahl, Chad W.; Juszkiewicz, Piotr J.; Bhola, Dennison S. – Journal of Applied Testing Technology, 2005

With the continual progress of computer technologies, computer automated scoring (CAS) has become a popular tool for evaluating writing assessments. Research of applications of these methodologies to new types of performance assessments is still emerging. While research has generally shown a high agreement of CAS system generated scores with those…

Descriptors: Scoring, Validity, Interrater Reliability, Comparative Analysis

Proceedings of the International Association for Development of the Information Society (IADIS) International Conference on Cognition and Exploratory Learning in Digital Age (CELDA) (Madrid, Spain, October 19-21, 2012)

International Association for Development of the Information Society, 2012

The IADIS CELDA 2012 Conference aimed to address the main issues concerned with evolving learning processes and supporting pedagogies and applications in the digital age. Advances in both cognitive psychology and computing have affected the educational arena. The convergence of these two disciplines is increasing at a…

Descriptors: Academic Achievement, Academic Persistence, Academic Support Services, Access to Computers
