Publication Date
In 2025: 2
Since 2024: 3
Since 2021 (last 5 years): 9
Since 2016 (last 10 years): 16
Since 2006 (last 20 years): 30
Descriptor
Computer Assisted Testing: 31
Evaluators: 31
Language Tests: 31
English (Second Language): 27
Second Language Learning: 27
Scoring: 17
Scores: 16
Language Proficiency: 14
Oral Language: 12
Computer Software: 11
Foreign Countries: 11
Author
Bridgeman, Brent: 2
Davis, Larry: 2
Mollaun, Pamela: 2
Xi, Xiaoming: 2
Zechner, Klaus: 2
Ahmet Can Uyar: 1
Alegre, Analucia: 1
Alexander James Kwako: 1
Attali, Yigal: 1
Bejar, Isaac I.: 1
Blanchard, Daniel: 1
Publication Type
Journal Articles: 27
Reports - Research: 27
Tests/Questionnaires: 7
Dissertations/Theses -…: 3
Reports - Descriptive: 1
Speeches/Meeting Papers: 1
Education Level
Higher Education: 8
Postsecondary Education: 7
Secondary Education: 3
High Schools: 2
Assessments and Surveys
Test of English as a Foreign…: 18
International English…: 3
ACTFL Oral Proficiency…: 1
Foreign Language Classroom…: 1
Test of English for…: 1
Ahmet Can Uyar; Dilek Büyükahiska – International Journal of Assessment Tools in Education, 2025
This study explores the effectiveness of using ChatGPT, an Artificial Intelligence (AI) language model, as an Automated Essay Scoring (AES) tool for grading English as a Foreign Language (EFL) learners' essays. The corpus consists of 50 essays representing various types including analysis, compare and contrast, descriptive, narrative, and opinion…
Descriptors: Artificial Intelligence, Computer Software, Technology Uses in Education, Teaching Methods
Alexander James Kwako – ProQuest LLC, 2023
Automated assessment using Natural Language Processing (NLP) has the potential to make English speaking assessments more reliable, authentic, and accessible. Yet without careful examination, NLP may exacerbate social prejudices based on gender or native language (L1). Current NLP-based assessments are prone to such biases, yet research and…
Descriptors: Gender Bias, Natural Language Processing, Native Language, Computational Linguistics
Yuko Hayashi; Yusuke Kondo; Yutaka Ishii – Innovation in Language Learning and Teaching, 2024
Purpose: This study builds a new system for automatically assessing learners' speech elicited from an oral discourse completion task (DCT), and evaluates the prediction capability of the system with a view to better understanding factors deemed influential in predicting speaking proficiency scores and the pedagogical implications of the system.…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Japanese
Ockey, Gary J.; Chukharev-Hudilainen, Evgeny – Applied Linguistics, 2021
A challenge of large-scale oral communication assessments is to feasibly assess a broad construct that includes interactional competence. One possible approach in addressing this challenge is to use a spoken dialog system (SDS), with the computer acting as a peer to elicit a ratable speech sample. With this aim, an SDS was built and four trained…
Descriptors: Oral Language, Grammar, Language Fluency, Language Tests
Cox, Troy L.; Brown, Alan V.; Thompson, Gregory L. – Language Testing, 2023
The rating of proficiency tests that use the Interagency Language Roundtable (ILR) and American Council on the Teaching of Foreign Languages (ACTFL) guidelines rests on the claim that each major level is based on hierarchical linguistic functions requiring mastery of multidimensional traits, such that each level subsumes the levels beneath it. These…
Descriptors: Oral Language, Language Fluency, Scoring, Cues
Eckes, Thomas; Jin, Kuan-Yu – International Journal of Testing, 2021
Severity and centrality are two main kinds of rater effects posing threats to the validity and fairness of performance assessments. Adopting Jin and Wang's (2018) extended facets modeling approach, we separately estimated the magnitude of rater severity and centrality effects in the web-based TestDaF (Test of German as a Foreign Language) writing…
Descriptors: Language Tests, German, Second Languages, Writing Tests
Xu, Jing; Jones, Edmund; Laxton, Victoria; Galaczi, Evelina – Assessment in Education: Principles, Policy & Practice, 2021
Recent advances in machine learning have made automated scoring of learner speech widespread, and yet validation research that provides support for applying automated scoring technology to assessment is still in its infancy. Both the educational measurement and language assessment communities have called for greater transparency in describing…
Descriptors: Second Language Learning, Second Language Instruction, English (Second Language), Computer Software
Karim Sadeghi; Neda Bakhshi – International Journal of Language Testing, 2025
Assessing language skills in an integrative form has drawn the attention of assessment experts in recent years. While some research exists on integrative listening/reading-to-write assessment, there is comparatively little on listening-to-speak integrated assessment. Also, little attention has been devoted to the role of…
Descriptors: Language Tests, Second Language Learning, English (Second Language), Computer Assisted Testing
Rupp, André A.; Casabianca, Jodi M.; Krüger, Maleika; Keller, Stefan; Köller, Olaf – ETS Research Report Series, 2019
In this research report, we describe the design and empirical findings for a large-scale study of essay writing ability with approximately 2,500 high school students in Germany and Switzerland on the basis of 2 tasks with 2 associated prompts, each from a standardized writing assessment whose scoring involved both human and automated components.…
Descriptors: Automation, Foreign Countries, English (Second Language), Language Tests
Gu, Lin; Davis, Larry; Tao, Jacob; Zechner, Klaus – Assessment in Education: Principles, Policy & Practice, 2021
Recent technology advancements have increased the prospects for automated spoken language technology to provide feedback on speaking performance. In this study we examined user perceptions of using an automated feedback system for preparing for the TOEFL iBT® test. Test takers and language teachers evaluated three types of machine-generated…
Descriptors: Audio Equipment, Test Preparation, Feedback (Response), Scores
Linlin, Cao – English Language Teaching, 2020
Through Many-Facet Rasch analysis, this study explores the rating differences between one automated computer rater and five expert teacher raters in scoring 119 students on a computerized English listening-speaking test. Results indicate that both the automatic and the teacher raters demonstrate good inter-rater reliability, though the automatic rater…
Descriptors: Language Tests, Computer Assisted Testing, English (Second Language), Second Language Learning
Burton, John Dylan – Language Assessment Quarterly, 2020
An assumption underlying speaking tests is that scores reflect the ability to produce online, non-rehearsed speech. Speech produced in testing situations may, however, be less spontaneous if extensive test preparation takes place, resulting in memorized or rehearsed responses. If raters detect these patterns, they may conceptualize speech as…
Descriptors: Language Tests, Oral Language, Scores, Speech Communication
Li, Xuelian – English Language Teaching, 2019
Based on articles by mainland Chinese scholars published in the most influential Chinese and international journals, the present article analyzes language testing research, compares trends across seven categories between 2000-2009 and 2010-2019, and puts forward future research directions by referring to international hot…
Descriptors: Language Tests, Testing, Educational History, Futures (of Society)
Kang, Okim; Rubin, Don; Kermad, Alyssa – Language Testing, 2019
Because judgments of non-native speech are closely tied to social biases, oral proficiency ratings are susceptible to error arising from rater background and social attitudes. In the present study we seek first to estimate the variance attributable to rater background and attitudinal variables in novice raters' assessments of L2…
Descriptors: Evaluators, Second Language Learning, Language Tests, English (Second Language)
Davis, Larry – Language Testing, 2016
Two factors thought to contribute to consistency in rater scoring judgments were investigated: rater training and experience in scoring. Also considered were the relative effects of scoring rubrics and exemplars on rater performance. Experienced teachers of English (N = 20) scored recorded responses from the TOEFL iBT speaking test prior…
Descriptors: Evaluators, Oral Language, Scores, Language Tests