ERIC - Search Results

Showing 1 to 15 of 16 results

Triangulating Natural Language Processing (NLP)-Based Analysis of Rater Comments and Many-Facet Rasch Measurement (MFRM): An Innovative Approach to Investigating Raters' Application of Rating Scales in Writing Assessment

Peer reviewed

Huiying Cai; Xun Yan – Language Testing, 2024

Rater comments tend to be qualitatively analyzed to indicate raters' application of rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful, behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The…

Descriptors: Natural Language Processing, Item Response Theory, Rating Scales, Writing Evaluation

The Whole Is More than the Sum of Its Parts -- Assessing Writing Using the Consensual Assessment Technique

Peer reviewed

Zahn, Daniela; Canton, Ursula; Boyd, Victoria; Hamilton, Laura; Mamo, Josianne; McKay, Jane; Proudfoot, Linda; Telfer, Dickson; Williams, Kim; Wilson, Colin – Studies in Higher Education, 2021

Evaluating the impact of Academic Literacies teaching (Lea and Street [1998. "Student Writing in Higher Education: An Academic Literacies Approach." "Studies in Higher Education" 23 (2): 157-72. doi:10.1080/03075079812331380364]) is difficult, as it involves gauging whether writers: (1) gain better understanding of what…

Descriptors: Writing Evaluation, Evaluation Methods, Undergraduate Students, Foreign Countries

Rater Cognitive Processes in Integrated Writing Tasks: From the Perspective of Problem-Solving

Peer reviewed

Jia, Wenfeng; Zhang, Peixin – Language Testing in Asia, 2023

It is widely believed that raters' cognition is an important aspect of writing assessment, as it has both logical and temporal priority over scores. Based on a critical review of previous research in this area, it is found that raters' cognition can be boiled down to two fundamental issues: building text images and strategies for articulating scores.…

Descriptors: Problem Solving, Cognitive Processes, Writing Evaluation, Evaluators

Combining Human and Automated Scoring Methods in Experimental Assessments of Writing: A Case Study Tutorial

Peer reviewed

Reagan Mozer; Luke Miratrix; Jackie Eunjung Relyea; James S. Kim – Journal of Educational and Behavioral Statistics, 2024

In a randomized trial that collects text as an outcome, traditional approaches for assessing treatment impact require that each document first be manually coded for constructs of interest by human raters. An impact analysis can then be conducted to compare treatment and control groups, using the hand-coded scores as a measured outcome. This…

Descriptors: Scoring, Evaluation Methods, Writing Evaluation, Comparative Analysis

Judges' Views on Pairwise Comparative Judgement and Rank Ordering as Alternatives to Analytical Essay Marking

Walland, Emma – Research Matters, 2022

In this article, I report on examiners' views and experiences of using Pairwise Comparative Judgement (PCJ) and Rank Ordering (RO) as alternatives to traditional analytical marking for GCSE English Language essays. Fifteen GCSE English Language examiners took part in the study. After each had judged 100 pairs of essays using PCJ and eight packs of…

Descriptors: Essays, Grading, Writing Evaluation, Evaluators

A Model-Data-Fit-Informed Approach to Score Resolution in Performance Assessments

Peer reviewed

Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021

Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…

Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making

Towards More Valid Scoring Criteria for Integrated Reading-Writing and Listening-Writing Summary Tasks

Peer reviewed

Chan, Sathena; May, Lyn – Language Testing, 2023

Despite the increased use of integrated tasks in high-stakes academic writing assessment, research on rating criteria which reflect the unique construct of integrated summary writing skills is comparatively rare. Using a mixed-method approach of expert judgement, text analysis, and statistical analysis, this study examines writing features that…

Descriptors: Scoring, Writing Evaluation, Reading Tests, Listening Skills

Building an Initial Validity Argument for Binary and Analytic Rating Scales for an EFL Classroom Writing Assessment: Evidence from Many-Facets Rasch Measurement

Peer reviewed

Khamboonruang, Apichat – rEFLections, 2022

Although much research has compared the functioning between analytic and holistic rating scales, little research has compared the functioning of binary rating scales with other types of rating scales. This quantitative study set out to preliminarily and comparatively validate binary and analytic rating scales intended for use in formative…

Descriptors: Writing Evaluation, Evaluation Methods, Second Language Learning, Second Language Instruction

Functional Adequacy in L2 Writing: Towards a New Rating Scale

Peer reviewed

Kuiken, Folkert; Vedder, Ineke – Language Testing, 2017

The importance of functional adequacy as an essential component of L2 proficiency has been observed by several authors (Pallotti, 2009; De Jong, Steinel, Florijn, Schoonen, & Hulstijn, 2012a, b). The rationale underlying the present study is that the assessment of writing proficiency in L2 is not fully possible without taking into account the…

Descriptors: Second Language Learning, Rating Scales, Computational Linguistics, Persuasive Discourse

A Comparison of EFL Raters' Essay-Rating Processes across Two Types of Rating Scales

Peer reviewed

Li, Hang; He, Lianzhen – Language Assessment Quarterly, 2015

This study used think-aloud protocols to compare essay-rating processes across holistic and analytic rating scales in the context of China's College English Test Band 6 (CET-6). A group of 9 experienced CET-6 raters scored the same batch of 10 CET-6 essays produced in an operational CET-6 administration twice, using both the CET-6 holistic…

Descriptors: Protocol Analysis, English (Second Language), Second Language Learning, Classification

Evaluation of the "e-rater"® Scoring Engine for the "TOEFL"® Independent and Integrated Prompts. Research Report. ETS RR-12-06

Peer reviewed

Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012

Scoring models for the "e-rater"® system were built and evaluated for the "TOEFL"® exam's independent and integrated writing prompts. Prompt-specific and generic scoring models were built, and evaluation statistics, such as weighted kappas, Pearson correlations, standardized differences in mean scores, and correlations with…

Descriptors: Scoring, Prompting, Evaluators, Computer Software

A Directory of Writing Assessment Consultants.

Bridgeford, Nancy J., Comp. – 1981

This monograph is intended to assist educators faced with the task of selecting and developing sound writing assessment procedures. Provided are: (1) practical background information on direct writing assessment procedures, alternative approaches, time requirements, advantages and disadvantages of each assessment approach, and (2) identification…

Descriptors: Consultants, Evaluation Methods, Evaluators, Scoring

Critical Monism, Critical Pluralism, and the Ideal of Inter-Rater Reliability.

Lees, Elaine O. – 1981

Given the concern for reliability in essay evaluation and the prospect of "error" variance in its absence, methods to promote interrater reliability in the evaluation of written compositions have been developed. These methods reduce variation in the value systems being applied by readers to texts, either by limiting the group of readers…

Descriptors: Elementary Secondary Education, Evaluation Criteria, Evaluation Methods, Evaluative Thinking

Learning To Rate Essays: A Study of Scorer Cognition.

Wolfe, Edward W.; Feltovich, Brian – 1994

This paper presents a model of scorer cognition that incorporates two types of mental models: models of performance (i.e., the criteria for judging performance) and models of scoring (i.e., the procedural scripts for scoring an essay). In Study 1, six novice and five experienced scorers wrote definitions of three levels of a 6-point holistic…

Descriptors: Cognitive Processes, Criteria, Essays, Evaluation Methods

Writing Portfolios: Potential for Large Scale Assessment. Project 2.4: Design Theory and Psychometrics for Complex Performance Assessment. Design and Analysis of Portfolio and Performance Measures.

Baker, Eva L.; Linn, Robert L. – 1992

The use of portfolio assessment as a method of evaluating the writing competence of elementary school students was studied. The study contained two components: (1) an empirical study of the utility and meaningfulness of using an analytic rubric developed for the evaluation of traditional writing samples to score student portfolios; and (2) a…

Descriptors: Educational Assessment, Elementary Education, Elementary School Students, Evaluation Methods
