ERIC - Search Results

Showing 1 to 15 of 30 results

Triangulating Natural Language Processing (NLP)-Based Analysis of Rater Comments and Many-Facet Rasch Measurement (MFRM): An Innovative Approach to Investigating Raters' Application of Rating Scales in Writing Assessment

Peer reviewed

Huiying Cai; Xun Yan – Language Testing, 2024

Rater comments tend to be qualitatively analyzed to indicate raters' application of rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful, behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The…

Descriptors: Natural Language Processing, Item Response Theory, Rating Scales, Writing Evaluation

Assessing the Content Quality of Essays in Content and Language Integrated Learning: Exploring the Construct from Subject Specialists' Perspectives

Peer reviewed

Takanori Sato – Language Testing, 2024

Assessing the content of learners' compositions is a common practice in second language (L2) writing assessment. However, the construct definition of content in L2 writing assessment potentially underrepresents the target competence in content and language integrated learning (CLIL), which aims to foster not only L2 proficiency but also critical…

Descriptors: Language Tests, Content and Language Integrated Learning, Writing Evaluation, Writing Tests

Making Each Point Count: Revising a Local Adaptation of the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE Rubric

Peer reviewed

Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024

In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…

Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)

Rater Cognitive Processes in Integrated Writing Tasks: From the Perspective of Problem-Solving

Peer reviewed

Jia, Wenfeng; Zhang, Peixin – Language Testing in Asia, 2023

It is widely believed that raters' cognition is an important aspect of writing assessment, as it has both logical and temporal priority over scores. Based on a critical review of previous research in this area, it is found that raters' cognition can be boiled down to two fundamental issues: building text images and strategies for articulating scores.…

Descriptors: Problem Solving, Cognitive Processes, Writing Evaluation, Evaluators

Development and Validation of a Rating Scale for Summarization as an Integrated Task

Peer reviewed

Li, Jiuliang; Wang, Qian – Asian-Pacific Journal of Second and Foreign Language Education, 2021

Summary writing is essential for academic success, and has attracted renewed interest in academic research and large-scale language tests. However, less attention has been paid to the development and evaluation of the scoring scales of summary writing. This study reports on the validation of a summary rubric that represented an approach to scale…

Descriptors: Validity, Rating Scales, Writing Skills, Writing Evaluation

The Intersection of AI and Language Assessment: A Study on the Reliability of ChatGPT in Grading IELTS Writing Task 2

Peer reviewed
PDF on ERIC

Osama Koraishi – Language Teaching Research Quarterly, 2024

This study conducts a comprehensive quantitative evaluation of OpenAI's language model, ChatGPT 4, for grading Task 2 writing of the IELTS exam. The objective is to assess the alignment between ChatGPT's grading and that of official human raters. The analysis encompassed a multifaceted approach, including a comparison of means and reliability…

Descriptors: Second Language Learning, English (Second Language), Language Tests, Artificial Intelligence

Artificial Intelligence as an Automated Essay Scoring Tool: A Focus on ChatGPT

Peer reviewed
PDF on ERIC

Ahmet Can Uyar; Dilek Büyükahiska – International Journal of Assessment Tools in Education, 2025

This study explores the effectiveness of using ChatGPT, an Artificial Intelligence (AI) language model, as an Automated Essay Scoring (AES) tool for grading English as a Foreign Language (EFL) learners' essays. The corpus consists of 50 essays representing various types including analysis, compare and contrast, descriptive, narrative, and opinion…

Descriptors: Artificial Intelligence, Computer Software, Technology Uses in Education, Teaching Methods

A Sequential Approach to Detecting Differential Rater Functioning in Sparse Rater-Mediated Assessment Networks

Peer reviewed

Wind, Stefanie A. – Language Testing, 2023

Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting…

Descriptors: Evaluators, Decision Making, Student Characteristics, Performance Based Assessment

Raters' Perceptions of Rating Scales Criteria and Its Effect on the Process and Outcome of Their Rating

Peer reviewed

Heidari, Nasim; Ghanbari, Nasim; Abbasi, Abbas – Language Testing in Asia, 2022

It is widely believed that human rating performance is influenced by an array of different factors. Among these, rater-related variables such as experience, language background, perceptions, and attitudes have been mentioned. One of the important rater-related factors is the way the raters interact with the rating scales. In particular, how raters…

Descriptors: Evaluators, Rating Scales, Language Tests, English (Second Language)

Investigating the Impact of Rater Training on Rater Errors in the Process of Assessing Writing Skill

Peer reviewed
PDF on ERIC

Sata, Mehmet; Karakaya, Ismail – International Journal of Assessment Tools in Education, 2022

In the process of measuring and assessing high-level cognitive skills, interference of rater errors in measurements brings about a constant concern and low objectivity. The main purpose of this study was to investigate the impact of rater training on rater errors in the process of assessing individual performance. The study was conducted with a…

Descriptors: Evaluators, Training, Comparative Analysis, Academic Language

'Remark or Retake'? A Study of Candidate Performance in IELTS and Perceptions towards Test Failure

Peer reviewed

Pearson, William S. – Language Testing in Asia, 2019

It is becoming increasingly important for individuals for whom English is a second language to demonstrate their linguistic credentials for academic, work and employment purposes. One option is to undertake International English Language Testing System (IELTS), which involves attempting to meet the linguistic entrance criteria set by a gatekeeping…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Cutting Scores

Establishing an Operational Model of Rating Scale Construction for English Writing Assessment

Peer reviewed
PDF on ERIC

Wu, Xuefeng – English Language Teaching, 2022

Rating scales for writing assessment are critical in that they determine directly the quality and fairness of such performance tests. However, in many EFL contexts, rating scales are made, to a certain extent, based on the intuition of teachers who strongly need a feasible and scientific route to guide their construction of rating scales. This study…

Descriptors: Writing Evaluation, Rating Scales, Second Language Learning, Second Language Instruction

The Effect of Using Machine Translation on Linguistic Features in L2 Writing across Proficiency Levels and Text Genres

Peer reviewed

Chung, Eun Seon; Ahn, Soojin – Computer Assisted Language Learning, 2022

Many studies that have investigated the educational value of online machine translation (MT) in second language (L2) writing generally report significant improvements after MT use, but no study as of yet has comprehensively analyzed the effectiveness of MT use in terms of various measures in syntactic complexity, accuracy, lexical complexity, and…

Descriptors: Translation, Computational Linguistics, English (Second Language), Second Language Learning

Automated Essay Scoring at Scale: A Case Study in Switzerland and Germany. TOEFL® Research Report. RR-86. ETS RR-19-12

Peer reviewed
PDF on ERIC

Rupp, André A.; Casabianca, Jodi M.; Krüger, Maleika; Keller, Stefan; Köller, Olaf – ETS Research Report Series, 2019

In this research report, we describe the design and empirical findings for a large-scale study of essay writing ability with approximately 2,500 high school students in Germany and Switzerland on the basis of 2 tasks with 2 associated prompts, each from a standardized writing assessment whose scoring involved both human and automated components.…

Descriptors: Automation, Foreign Countries, English (Second Language), Language Tests

For a Greater Good: Bias Analysis in Writing Assessment

Peer reviewed

Ahmadi Shirazi, Masoumeh – SAGE Open, 2019

Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…

Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests
