Showing 1 to 15 of 41 results
Peer reviewed
Ping-Lin Chuang – Language Testing, 2025
This experimental study explores how source use features impact raters' judgment of argumentation in a second language (L2) integrated writing test. One hundred four experienced and novice raters were recruited to complete a rating task that simulated the scoring assignment of a local English Placement Test (EPT). Sixty written responses were…
Descriptors: Interrater Reliability, Evaluators, Information Sources, Primary Sources
Peer reviewed
Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025
This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics
Peer reviewed
Lian Li; Jiehui Hu; Yu Dai; Ping Zhou; Wanhong Zhang – Reading & Writing Quarterly, 2024
This paper proposes to use depth perception to represent raters' decision in holistic evaluation of ESL essays, as an alternative medium to conventional form of numerical scores. The researchers verified the new method's accuracy and inter/intra-rater reliability by inviting 24 ESL teachers to perform different representations when rating 60…
Descriptors: Essays, Holistic Approach, Writing Evaluation, Accuracy
Peer reviewed
Li, Wentao – Reading and Writing: An Interdisciplinary Journal, 2022
Scoring rubrics are known to be effective for assessing writing for both testing and classroom teaching purposes. How raters interpret the descriptors in a rubric can significantly impact the subsequent final score, and further, the descriptors may also color a rater's judgment of a student's writing quality. Little is known, however, about how…
Descriptors: Scoring Rubrics, Interrater Reliability, Writing Evaluation, Teaching Methods
Peer reviewed
Vasfiye Geçkin; Ebru Kiziltas; Çagatay Çinar – Journal of Educational Technology and Online Learning, 2023
The quality of writing in a second language (L2) is one of the indicators of the level of proficiency for many college students to be eligible for departmental studies. Although certain software programs, such as Intelligent Essay Assessor or IntelliMetric, have been introduced to evaluate second-language writing quality, an overall assessment of…
Descriptors: Writing Evaluation, Second Language Learning, Second Language Instruction, Language Proficiency
Peer reviewed
Apichat Khamboonruang – PASAA: Journal of Language Teaching and Learning in Thailand, 2023
Differential rater severity (DRS), one prevalent case of differential rater functioning (aka rater bias or rater interaction) effects, manifests itself when a rater assigns unusually severe or lenient ratings, threatening the validity and fairness of rater-mediated assessment. Building on a many-facets Rasch measurement (MFRM) approach, this study…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring Rubrics
Peer reviewed
Tsunemoto, Aki; Trofimovich, Pavel; Blanchet, Josée; Bertrand, Juliane; Kennedy, Sara – Foreign Language Annals, 2022
This study examined the effect of benchmarking and peer-assessment activities on second language (L2) French learners' self-assessments of accentedness, comprehensibility, and fluency. The learners, who included 25 L2 French students enrolled in a 15-week university-level French course, recorded two oral presentations at the beginning and the end…
Descriptors: Benchmarking, French, Self Evaluation (Individuals), Second Language Learning
Peer reviewed
Heidari, Nasim; Ghanbari, Nasim; Abbasi, Abbas – Language Testing in Asia, 2022
It is widely believed that human rating performance is influenced by an array of different factors. Among these, rater-related variables such as experience, language background, perceptions, and attitudes have been mentioned. One of the important rater-related factors is the way the raters interact with the rating scales. In particular, how raters…
Descriptors: Evaluators, Rating Scales, Language Tests, English (Second Language)
Peer reviewed
Nakatsuhara, Fumiyo; Inoue, Chihiro; Taylor, Lynda – Language Assessment Quarterly, 2021
This mixed methods study compared IELTS examiners' scores when assessing spoken performances under live and two 'non-live' testing conditions using audio and video recordings. Six IELTS examiners assessed 36 test-takers' performances under the live, audio, and video rating conditions. Scores in the three rating modes were calibrated using the…
Descriptors: Video Technology, Audio Equipment, English (Second Language), Language Tests
Peer reviewed
Kilinc, Kardelen; Yildirim, Ozgur – World Journal of Education, 2020
The present study aims to reveal the effects of test type, pronunciation and proficiency levels of the students on speaking test scores. A total of 147 Turkish EFL students consisting of 38 beginner, 36 elementary, 37 pre-intermediate and 36 intermediate levels participated in the study. Presentation as planned, and paired speaking test as…
Descriptors: Test Format, Pronunciation, Scores, Language Proficiency
Peer reviewed
May, Lyn; Nakatsuhara, Fumiyo; Lam, Daniel; Galaczi, Evelina – Language Testing, 2020
In this paper we report on a project in which we developed tools to support the classroom assessment of learners' interactional competence (IC) and provided learning oriented feedback in the context of preparation for a high-stakes face-to-face speaking test. Six trained examiners provided stimulated verbal reports (n = 72) on 12 paired…
Descriptors: Intercultural Communication, High Stakes Tests, Feedback (Response), Evaluators
Peer reviewed
Han, Qie – Working Papers in TESOL & Applied Linguistics, 2016
This literature review attempts to survey representative studies within the context of L2 speaking assessment that have contributed to the conceptualization of rater cognition. Two types of studies are looked at: 1) studies that examine "how" raters differ (and sometimes agree) in their cognitive processes and rating behaviors, in terms…
Descriptors: Second Language Learning, Student Evaluation, Evaluators, Speech Tests
Peer reviewed
Han, Chao – Language Testing, 2019
Summative assessment of interpretation is widely conducted in interpreting courses/programs to inform high-stakes decision making, such as the selection, certification, and conferral of academic degrees. Yet there has been very limited empirical research to investigate the score dependability of summative interpretation assessment. The present…
Descriptors: Generalization, Decision Making, Summative Evaluation, Evaluators
Peer reviewed
Burton, John Dylan – Language Assessment Quarterly, 2020
An assumption underlying speaking tests is that scores reflect the ability to produce online, non-rehearsed speech. Speech produced in testing situations may, however, be less spontaneous if extensive test preparation takes place, resulting in memorized or rehearsed responses. If raters detect these patterns, they may conceptualize speech as…
Descriptors: Language Tests, Oral Language, Scores, Speech Communication
Peer reviewed
Yalçin-Çolakoglu, Özlem; Selçuk, Merve – Advances in Language and Literary Studies, 2019
Criterion referenced tests of second language speaking performance are administered in different institutions using different procedures. The present study reports raters' practices of second language speaking tests, in particular the correspondence between test-takers' grades when assessed individually and in groups. Data derived from…
Descriptors: Oral Language, Language Tests, Test Validity, Inferences