Publication Date
In 2025 | 1 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 6 |
Since 2016 (last 10 years) | 18 |
Since 2006 (last 20 years) | 36 |
Author
Xi, Xiaoming | 4 |
Bridgeman, Brent | 2 |
Davis, Larry | 2 |
Kang, Okim | 2 |
Kermad, Alyssa | 2 |
Mollaun, Pam | 2 |
Mollaun, Pamela | 2 |
Zechner, Klaus | 2 |
Ahmadi Shirazi, Masoumeh | 1 |
Ahmadi, Alireza | 1 |
Alegre, Analucia | 1 |
Publication Type
Journal Articles | 36 |
Reports - Research | 36 |
Tests/Questionnaires | 10 |
Dissertations/Theses -… | 1 |
Numerical/Quantitative Data | 1 |
Opinion Papers | 1 |
Reports - Evaluative | 1 |
Speeches/Meeting Papers | 1 |
Education Level
Higher Education | 14 |
Postsecondary Education | 13 |
High Schools | 1 |
Secondary Education | 1 |
Location
Iran | 4 |
Australia | 1 |
Europe | 1 |
Germany | 1 |
India | 1 |
Japan (Tokyo) | 1 |
New Zealand | 1 |
Switzerland | 1 |
Thailand | 1 |
United States | 1 |
Assessments and Surveys
Test of English as a Foreign… | 39 |
International English… | 5 |
Test of English for… | 3 |
Foreign Language Classroom… | 1 |
Graduate Record Examinations | 1 |
Michael D. Carey; Stefan Szocs – Language Testing, 2024
This controlled experimental study investigated the interaction of variables associated with rating the pronunciation component of high-stakes English-language-speaking tests such as IELTS and TOEFL iBT. One hundred experienced raters who were all either familiar or unfamiliar with Brazilian-accented English or Papua New Guinean Tok Pisin-accented…
Descriptors: Dialects, Pronunciation, Suprasegmentals, Familiarity
Chan, Sathena; May, Lyn – Language Testing, 2023
Despite the increased use of integrated tasks in high-stakes academic writing assessment, research on rating criteria which reflect the unique construct of integrated summary writing skills is comparatively rare. Using a mixed-method approach of expert judgement, text analysis, and statistical analysis, this study examines writing features that…
Descriptors: Scoring, Writing Evaluation, Reading Tests, Listening Skills
Finn, Bridgid; Arslan, Burcu; Walsh, Matthew – Applied Measurement in Education, 2020
To score an essay response, raters draw on previously trained skills and knowledge about the underlying rubric and score criterion. Cognitive processes such as remembering, forgetting, and skill decay likely influence rater performance. To investigate how forgetting influences scoring, we evaluated raters' scoring accuracy on TOEFL and GRE essays.…
Descriptors: Epistemology, Essay Tests, Evaluators, Cognitive Processes
Kermad, Alyssa; Bogorevich, Valeria – Language Teaching Research Quarterly, 2022
The practice of second language (L2) speech perception has traditionally relied on equal-interval perceptual scales and novice listeners' (NLs) impressionistic judgments of constructs such as accentedness and comprehensibility (Munro & Derwing, 2011). However, issues have surfaced with respect to how well NLs can use these scales, whether they…
Descriptors: Speech Communication, Second Language Learning, Intelligibility, Rating Scales
Pang, Alvin – RELC Journal: A Journal of Language Teaching and Research, 2019
John Read is about to retire as Professor in Applied Language Studies at the University of Auckland. He previously taught applied linguistics, Teaching English to Speakers of Other Languages (TESOL) and English for Academic Purposes (EAP) at Victoria University of Wellington, the SEAMEO Regional Language Centre, the University of Texas El Paso,…
Descriptors: Language Tests, Testing, English (Second Language), Second Language Learning
Ahmadi, Alireza – Taiwan Journal of TESOL, 2020
Rater subjectivity has long been an intriguing topic. The use of discussion as a resolution method is a practical way to reduce this subjectivity. However, the efficacy of discussion depends on whether different raters get equally engaged in it or one rater tends to dominate others. This study investigated whether and how rater dominance occurs in…
Descriptors: Evaluators, Interrater Reliability, Discussion, Discourse Analysis
Karim Sadeghi; Neda Bakhshi – International Journal of Language Testing, 2025
Assessing language skills in an integrative form has drawn the attention of assessment experts in recent years. While some research data exists on integrative listening/reading-to-write assessment, there is comparatively little research literature on listening-to-speak integrated assessment. Also, little attention has been devoted to the role of…
Descriptors: Language Tests, Second Language Learning, English (Second Language), Computer Assisted Testing
Choi, Jin Soo – Applied Language Learning, 2021
This study examined the impact of the manipulated task complexity (Robinson 2001a, 2001b, 2007, 2011; Robinson & Gilabert, 2007) on second language (L2) speech comprehensibility. I examined whether manipulated task complexity (a) impacts L2 speech comprehensibility, (b) aligns with L2 speakers' perception of task difficulty (cognitive…
Descriptors: Task Analysis, Second Language Learning, Second Language Instruction, Pronunciation
Rupp, André A.; Casabianca, Jodi M.; Krüger, Maleika; Keller, Stefan; Köller, Olaf – ETS Research Report Series, 2019
In this research report, we describe the design and empirical findings for a large-scale study of essay writing ability with approximately 2,500 high school students in Germany and Switzerland on the basis of 2 tasks with 2 associated prompts, each from a standardized writing assessment whose scoring involved both human and automated components.…
Descriptors: Automation, Foreign Countries, English (Second Language), Language Tests
Ahmadi Shirazi, Masoumeh – SAGE Open, 2019
Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…
Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests
Gu, Lin; Davis, Larry; Tao, Jacob; Zechner, Klaus – Assessment in Education: Principles, Policy & Practice, 2021
Recent technology advancements have increased the prospects for automated spoken language technology to provide feedback on speaking performance. In this study we examined user perceptions of using an automated feedback system for preparing for the TOEFL iBT® test. Test takers and language teachers evaluated three types of machine-generated…
Descriptors: Audio Equipment, Test Preparation, Feedback (Response), Scores
Wudthayagorn, Jirada – LEARN Journal: Language Education and Acquisition Research Network, 2018
The purpose of this study was to map the Chulalongkorn University Test of English Proficiency, or the CU-TEP, to the Common European Framework of Reference (CEFR) by employing a standard setting methodology. Thirteen experts judged 120 items of the CU-TEP using the Yes/No Angoff technique. The experts decided whether or not a borderline student at…
Descriptors: Guidelines, Rating Scales, English (Second Language), Language Tests
Ockey, Gary J.; Papageorgiou, Spiros; French, Robert – International Journal of Listening, 2016
This article reports on a study which aimed to determine the effect of strength of accent on listening comprehension of interactive lectures. Test takers (N = 21,726) listened to an interactive lecture given by one of nine speakers and responded to six comprehension items. The test taker responses were analyzed with the Rasch computer program…
Descriptors: Pronunciation, Listening Comprehension, Lecture Method, Computer Software
Kang, Okim; Rubin, Don; Kermad, Alyssa – Language Testing, 2019
As a result of the fact that judgments of non-native speech are closely tied to social biases, oral proficiency ratings are susceptible to error because of rater background and social attitudes. In the present study we seek first to estimate the variance attributable to rater background and attitudinal variables on novice raters' assessments of L2…
Descriptors: Evaluators, Second Language Learning, Language Tests, English (Second Language)
Davis, Larry – Language Testing, 2016
Two factors were investigated that are thought to contribute to consistency in rater scoring judgments: rater training and experience in scoring. Also considered were the relative effects of scoring rubrics and exemplars on rater performance. Experienced teachers of English (N = 20) scored recorded responses from the TOEFL iBT speaking test prior…
Descriptors: Evaluators, Oral Language, Scores, Language Tests