ERIC - Search Results

Publication Date

In 2026	0
Since 2025	3
Since 2022 (last 5 years)	5
Since 2017 (last 10 years)	13
Since 2007 (last 20 years)	44

Descriptor

Interrater Reliability	76
Writing Tests	76
Scoring	33
Writing Evaluation	30
English (Second Language)	21
Second Language Learning	18
Foreign Countries	16
Essay Tests	15
Essays	13
Language Tests	13
Scores	13
Scoring Rubrics	13
Test Reliability	13
Computer Assisted Testing	12
Evaluation Methods	12
Comparative Analysis	11
Evaluators	11
College Students	10
Rating Scales	10
Elementary School Students	9
Item Response Theory	9
Performance Based Assessment	9
Correlation	8
Statistical Analysis	8
Test Validity	8

Publication Type

Journal Articles	54
Reports - Research	48
Reports - Evaluative	21
Speeches/Meeting Papers	12
Tests/Questionnaires	7
Reports - Descriptive	4
Dissertations/Theses -…	2
Numerical/Quantitative Data	2
Opinion Papers	1

Education Level

Higher Education	16
Postsecondary Education	11
Secondary Education	7
Elementary Education	6
Elementary Secondary Education	6
Grade 4	4
Adult Education	3
Grade 8	3
High Schools	3
Intermediate Grades	3
Grade 7	2
Junior High Schools	2
Middle Schools	2
Grade 1	1
Grade 10	1
Grade 5	1
Grade 6	1
Grade 9	1

Location

Turkey	4
Georgia	3
Australia	2
Canada	1
Europe	1
Hong Kong	1
Illinois (Urbana)	1
Iran	1
Iran (Tehran)	1
Japan	1
Michigan	1
New Jersey	1
New York	1
Norway	1
Tunisia	1
West Virginia	1

Assessments and Surveys

Test of English as a Foreign…	4
SAT (College Admission Test)	3
International English…	2
National Assessment of…	2
ACT Assessment	1
Graduate Management Admission…	1
Graduate Record Examinations	1
Metropolitan Readiness Tests	1
Praxis Series	1
Work Keys (ACT)	1

Showing 1 to 15 of 76 results

Exploring the Potential of ChatGPT for Evaluating English Essays in a Criterion-Based Assessment

Peer reviewed

Andrea Gjorevski; Mimi Li; Troy L. Cox – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2025

Open access to novel AI tools offers unprecedented opportunities for human-AI collaboration in writing instruction and assessment. While research on using generative AI tools like ChatGPT in these contexts is emerging, more is needed to understand their effectiveness as Automated Writing Evaluation (AWE) tools. This study explores the potential of…

Descriptors: Artificial Intelligence, Criterion Referenced Tests, Essay Tests, Automation

Artificial Intelligence in International English Language Testing System Writing Assessments: A Comparative Study of Human Ratings and DeepAI

Peer reviewed
PDF on ERIC

Somayeh Fathali; Fatemeh Mohajeri – Technology in Language Teaching & Learning, 2025

The International English Language Testing System (IELTS) is a high-stakes exam where Writing Task 2 significantly influences the overall scores, requiring reliable evaluation. While trained human raters perform this task, concerns about subjectivity and inconsistency have led to growing interest in artificial intelligence (AI)-based assessment…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Artificial Intelligence

Do Source Use Features Impact Raters' Judgment of Argumentation? An Experimental Study

Peer reviewed

Ping-Lin Chuang – Language Testing, 2025

This experimental study explores how source use features impact raters' judgment of argumentation in a second language (L2) integrated writing test. One hundred four experienced and novice raters were recruited to complete a rating task that simulated the scoring assignment of a local English Placement Test (EPT). Sixty written responses were…

Descriptors: Interrater Reliability, Evaluators, Information Sources, Primary Sources

Reliability of the Analytic Rubric and Checklist for the Assessment of Story Writing Skills: G and Decision Study in Generalizability Theory

Peer reviewed
PDF on ERIC

Uzun, N. Bilge; Alici, Devrim; Aktas, Mehtap – European Journal of Educational Research, 2019

The purpose of this study is to examine the reliability of analytical rubrics and checklists developed for the assessment of story writing skills by means of generalizability theory. The study group consisted of 52 students attending the 5th grade at primary school and 20 raters in Mersin University. The G study was carried out with the fully crossed…

Descriptors: Foreign Countries, Scoring Rubrics, Check Lists, Writing Tests

"How Do Raters Learn to Rate?" Many-Facet Rasch Modeling of Rater Performance over the Course of a Rater Certification Program

Peer reviewed

Yan, Xun; Chuang, Ping-Lin – Language Testing, 2023

This study employed a mixed-methods approach to examine how rater performance develops during a semester-long rater certification program for an English as a Second Language (ESL) writing placement test at a large US university. From 2016 to 2018, we tracked three groups of novice raters (n = 30) across four rounds in the certification program.…

Descriptors: Evaluators, Interrater Reliability, Item Response Theory, Certification

A Nonparametric Procedure for Exploring Differences in Rating Quality across Test-Taker Subgroups in Rater-Mediated Writing Assessments

Peer reviewed

Wind, Stefanie A. – Language Testing, 2019

Differences in rater judgments that are systematically related to construct-irrelevant characteristics threaten the fairness of rater-mediated writing assessments. Accordingly, it is essential that researchers and practitioners examine the degree to which the psychometric quality of rater judgments is comparable across test-taker subgroups.…

Descriptors: Nonparametric Statistics, Interrater Reliability, Differences, Writing Tests

Exploring Incomplete Rating Designs with Mokken Scale Analysis

Peer reviewed

Wind, Stefanie A.; Patil, Yogendra J. – Educational and Psychological Measurement, 2018

Recent research has explored the use of models adapted from Mokken scale analysis as a nonparametric approach to evaluating rating quality in educational performance assessments. A potential limiting factor to the widespread use of these techniques is the requirement for complete data, as practical constraints in operational assessment systems…

Descriptors: Scaling, Data, Interrater Reliability, Writing Tests

The Longitudinal Stability of Rating Characteristics in an EFL Examination: Methodological and Substantive Considerations

Peer reviewed

Lamprianou, Iasonas; Tsagari, Dina; Kyriakou, Nansia – Language Testing, 2021

This longitudinal study (2002-2014) investigates the stability of rating characteristics of a large group of raters over time in the context of the writing paper of a national high-stakes examination. The study uses one measure of rater severity and two measures of rater consistency. The results suggest that the rating characteristics of…

Descriptors: Longitudinal Studies, Evaluators, High Stakes Tests, Writing Evaluation

Comparative Judgement: Assess Student Production without Absolute Judgements

Peer reviewed
PDF on ERIC

Sumner, Josh – Research-publishing.net, 2021

Comparative Judgement (CJ) has emerged as a technique that typically makes use of holistic judgement to assess difficult-to-specify constructs such as production (speaking and writing) in Modern Foreign Languages (MFL). In traditional approaches, markers assess candidates' work one-by-one in an absolute manner, assigning scores to different…

Descriptors: Holistic Approach, Student Evaluation, Comparative Analysis, Decision Making

Development and Validation of the Written Communication Assessment of the "HEIghten"® Outcomes Assessment Suite. Research Report. ETS RR-17-53

Peer reviewed
PDF on ERIC

Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017

Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…

Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment

The Influences of Teacher Knowledge on Qualitative Writing Assessment

Peer reviewed
PDF on ERIC

Cato, Heather; Walker, Katie – Journal of Language and Literacy Education, 2022

Standardized testing and accountability are currently unavoidable components of Texas Public Education. Through years of push-back, parents and educators have demanded that Texas consider alternative testing options that would reduce the high-stakes testing burden on students and schools. In 2015, the State of Texas passed legislation requiring…

Descriptors: Writing Evaluation, Writing Instruction, Pedagogical Content Knowledge, State Legislation

Applying a Thurstonian, Two-Stage Method in the Standardized Assessment of Writing

Peer reviewed

McGrane, Joshua Aaron; Humphry, Stephen Mark; Heldsinger, Sandra – Applied Measurement in Education, 2018

National standardized assessment programs have increasingly included extended written performances, amplifying the need for reliable, valid, and efficient methods of assessment. This article examines a two-stage method using comparative judgments and calibrated exemplars as a complement and alternative to existing methods of assessing writing.…

Descriptors: Standardized Tests, Foreign Countries, Writing Tests, Writing Evaluation

Evaluation of "e-rater"® for the "Praxis I"® Writing Test. Research Report. ETS RR-15-03

Peer reviewed
PDF on ERIC

Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M. – ETS Research Report Series, 2015

Automated scoring models were trained and evaluated for the essay task in the "Praxis I"® writing test. Prompt-specific and generic "e-rater"® scoring models were built, and evaluation statistics, such as quadratic weighted kappa, Pearson correlation, and standardized differences in mean scores, were examined to evaluate the…

Descriptors: Writing Tests, Licensing Examinations (Professions), Teacher Competency Testing, Scoring

Rater Strategies for Reaching Agreement on Pupil Text Quality

Peer reviewed

Jølle, Lennart – Assessment in Education: Principles, Policy & Practice, 2015

Novice members of a Norwegian national rater panel tasked with assessing Year 8 pupils' written texts were studied during three successive preparation sessions (2011-2012). The purpose was to investigate how the raters successfully make use of different decision-making strategies in an assessment situation where pre-set criteria and standards give…

Descriptors: Interrater Reliability, Writing Evaluation, Decision Making, Novices

A Comparison of Newly-Trained and Experienced Raters on a Standardized Writing Assessment

Peer reviewed

Attali, Yigal – Language Testing, 2016

A short training program for evaluating responses to an essay writing task consisted of scoring 20 training essays with immediate feedback about the correct score. The same scoring session also served as a certification test for trainees. Participants with little or no previous rating experience completed this session and 14 trainees who passed an…

Descriptors: Writing Evaluation, Writing Tests, Standardized Tests, Evaluators

Source

Assessing Writing	7
Language Testing	7
ETS Research Report Series	5
Educational and Psychological…	3
Journal of Technology,…	3
Applied Measurement in…	2
Assessment in Education:…	2
Journal of Educational…	2
Language Assessment Quarterly	2
Online Submission	2
ProQuest LLC	2
Assessment for Effective…	1
Chronicle of Higher Education	1
Educational Assessment	1
Educational Testing Service	1
English Language Teaching	1
Eurasian Journal of…	1
European Journal of…	1
Hacettepe University Journal…	1
JALT CALL Journal	1
Journal of Deaf Studies and…	1
Journal of Experimental…	1
Journal of Language and…	1
Language and Literacy Spectrum	1
Michigan Reading Journal	1

Author

Knoch, Ute	4
von Randow, Janet	3
Attali, Yigal	2
Aydin, Selami	2
Barkaoui, Khaled	2
Barkhuizen, Gary	2
Congdon, Peter J.	2
Du, Yi	2
Elder, Catherine	2
Engelhard, George, Jr.	2
Gordon, Belita	2
Johnson, Robert L.	2
McQueen, Joy	2
Wind, Stefanie A.	2
Zhang, Mo	2
Aktas, Mehtap	1
Albertini, John	1
Alici, Devrim	1
Allen, Nancy	1
Almond, Patricia	1
Anderson, Stephen A.	1
Andrea Gjorevski	1
Baker, Beverly A.	1
Baker, Eva L.	1