Huiying Cai; Xun Yan – Language Testing, 2024
Rater comments tend to be qualitatively analyzed to indicate raters' application of rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful, behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The…
Descriptors: Natural Language Processing, Item Response Theory, Rating Scales, Writing Evaluation
Takanori Sato – Language Testing, 2024
Assessing the content of learners' compositions is a common practice in second language (L2) writing assessment. However, the construct definition of content in L2 writing assessment potentially underrepresents the target competence in content and language integrated learning (CLIL), which aims to foster not only L2 proficiency but also critical…
Descriptors: Language Tests, Content and Language Integrated Learning, Writing Evaluation, Writing Tests
Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024
In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…
Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)
Jia, Wenfeng; Zhang, Peixin – Language Testing in Asia, 2023
It is widely believed that raters' cognition is an important aspect of writing assessment, as it has both logical and temporal priority over scores. Based on a critical review of previous research in this area, it is found that raters' cognition can be boiled down to two fundamental issues: building text images and strategies for articulating scores.…
Descriptors: Problem Solving, Cognitive Processes, Writing Evaluation, Evaluators
Li, Jiuliang; Wang, Qian – Asian-Pacific Journal of Second and Foreign Language Education, 2021
Summary writing is essential for academic success and has attracted renewed interest in academic research and large-scale language tests. However, less attention has been paid to the development and evaluation of the scoring scales of summary writing. This study reports on the validation of a summary rubric that represented an approach to scale…
Descriptors: Validity, Rating Scales, Writing Skills, Writing Evaluation
Osama Koraishi – Language Teaching Research Quarterly, 2024
This study conducts a comprehensive quantitative evaluation of OpenAI's language model, ChatGPT 4, for grading Task 2 writing of the IELTS exam. The objective is to assess the alignment between ChatGPT's grading and that of official human raters. The analysis encompassed a multifaceted approach, including a comparison of means and reliability…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Artificial Intelligence
Ahmet Can Uyar; Dilek Büyükahiska – International Journal of Assessment Tools in Education, 2025
This study explores the effectiveness of using ChatGPT, an Artificial Intelligence (AI) language model, as an Automated Essay Scoring (AES) tool for grading English as a Foreign Language (EFL) learners' essays. The corpus consists of 50 essays representing various types including analysis, compare and contrast, descriptive, narrative, and opinion…
Descriptors: Artificial Intelligence, Computer Software, Technology Uses in Education, Teaching Methods
Wind, Stefanie A. – Language Testing, 2023
Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting…
Descriptors: Evaluators, Decision Making, Student Characteristics, Performance Based Assessment
Heidari, Nasim; Ghanbari, Nasim; Abbasi, Abbas – Language Testing in Asia, 2022
It is widely believed that human rating performance is influenced by an array of different factors. Among these, rater-related variables such as experience, language background, perceptions, and attitudes have been mentioned. One of the important rater-related factors is the way the raters interact with the rating scales. In particular, how raters…
Descriptors: Evaluators, Rating Scales, Language Tests, English (Second Language)
Investigating the Impact of Rater Training on Rater Errors in the Process of Assessing Writing Skill
Sata, Mehmet; Karakaya, Ismail – International Journal of Assessment Tools in Education, 2022
In the process of measuring and assessing high-level cognitive skills, interference of rater errors in measurements brings about a constant concern and low objectivity. The main purpose of this study was to investigate the impact of rater training on rater errors in the process of assessing individual performance. The study was conducted with a…
Descriptors: Evaluators, Training, Comparative Analysis, Academic Language
Pearson, William S. – Language Testing in Asia, 2019
It is becoming increasingly important for individuals for whom English is a second language to demonstrate their linguistic credentials for academic, work and employment purposes. One option is to undertake International English Language Testing System (IELTS), which involves attempting to meet the linguistic entrance criteria set by a gatekeeping…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Cutting Scores
Wu, Xuefeng – English Language Teaching, 2022
Rating scales for writing assessment are critical in that they determine directly the quality and fairness of such performance tests. However, in many EFL contexts, rating scales are made, to a certain extent, based on the intuition of teachers who strongly need a feasible and scientific route to guide their construction of rating scales. This study…
Descriptors: Writing Evaluation, Rating Scales, Second Language Learning, Second Language Instruction
Chung, Eun Seon; Ahn, Soojin – Computer Assisted Language Learning, 2022
Many studies that have investigated the educational value of online machine translation (MT) in second language (L2) writing generally report significant improvements after MT use, but no study as of yet has comprehensively analyzed the effectiveness of MT use in terms of various measures in syntactic complexity, accuracy, lexical complexity, and…
Descriptors: Translation, Computational Linguistics, English (Second Language), Second Language Learning
Rupp, André A.; Casabianca, Jodi M.; Krüger, Maleika; Keller, Stefan; Köller, Olaf – ETS Research Report Series, 2019
In this research report, we describe the design and empirical findings for a large-scale study of essay writing ability with approximately 2,500 high school students in Germany and Switzerland on the basis of 2 tasks with 2 associated prompts, each from a standardized writing assessment whose scoring involved both human and automated components.…
Descriptors: Automation, Foreign Countries, English (Second Language), Language Tests
Ahmadi Shirazi, Masoumeh – SAGE Open, 2019
Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…
Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests