Publication Date
In 2025 | 0 |
Since 2024 | 3 |
Since 2021 (last 5 years) | 6 |
Since 2016 (last 10 years) | 15 |
Since 2006 (last 20 years) | 22 |
Descriptor
Evaluators | 23 |
Scores | 23 |
Language Tests | 17 |
Second Language Learning | 17 |
English (Second Language) | 13 |
Foreign Countries | 10 |
Language Proficiency | 8 |
Essays | 7 |
Writing Evaluation | 7 |
Comparative Analysis | 6 |
Rating Scales | 6 |
More ▼ |
Source
Language Testing | 23 |
Author
Sanders, Ted | 2 |
van den Bergh, Huub | 2 |
Albert Weideman | 1 |
Ann Tai Choe | 1 |
Attali, Yigal | 1 |
Barkaoui, Khaled | 1 |
Bart Deygers | 1 |
Bond, Trevor | 1 |
Bouwer, Renske | 1 |
Béguin, Anton | 1 |
Chalhoub-Deville, Micheline | 1 |
More ▼ |
Publication Type
Journal Articles | 23 |
Reports - Research | 22 |
Tests/Questionnaires | 2 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
International English… | 2 |
Test of English as a Foreign… | 2 |
What Works Clearinghouse Rating
Laura Schildt; Bart Deygers; Albert Weideman – Language Testing, 2024
In the context of policy-driven language testing for citizenship, a growing body of research examines the political justifications and ethical implications of language requirements and test use. However, virtually no studies have looked at the role that language testers play in the evolution of language requirements. Critical gaps remain in our…
Descriptors: Language Tests, Citizenship, Educational Policy, Assessment Literacy
Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024
In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…
Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)
Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021
Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Chan, Kinnie Kin Yee; Bond, Trevor; Yan, Zi – Language Testing, 2023
We investigated the relationship between the scores assigned by an Automated Essay Scoring (AES) system, the Intelligent Essay Assessor (IEA), and grades allocated by trained, professional human raters to English essay writing by instigating two procedures novel to written-language assessment: the logistic transformation of AES raw scores into…
Descriptors: Computer Assisted Testing, Essays, Scoring, Scores
J. Dylan Burton – Language Testing, 2024
Nonverbal behavior can impact language proficiency scores in speaking tests, but there is little empirical information of the size or consistency of its effects or whether language proficiency may be a moderating variable. In this study, 100 novice raters watched and scored 30 recordings of test takers taking an international, high stakes…
Descriptors: Nonverbal Ability, Language Fluency, Second Language Learning, Language Proficiency
Khabbazbashi, Nahal; Galaczi, Evelina D. – Language Testing, 2020
This mixed methods study examined holistic, analytic, and part marking models (MMs) in terms of their measurement properties and impact on candidate CEFR classifications in a semi-direct online speaking test. Speaking performances of 240 candidates were first marked holistically and by part (phase 1). On the basis of phase 1 findings--which…
Descriptors: Holistic Approach, Classification, Grading, Language Tests
Ma, Wenyue – Language Testing, 2022
Second-language (L2) testing researchers have explored the relationship between speakers' overall speaking ability, reflected by holistic scores, and the speakers' performance on speaking subcomponents, reflected by analytic scores (e.g., McNamara, 1990; Sato, 2011). These research studies have advanced applied linguists' understanding of how…
Descriptors: Language Tests, Teaching Assistants, Second Language Learning, Second Language Instruction
Sahan, Özgür; Razi, Salim – Language Testing, 2020
This study examines the decision-making behaviors of raters with varying levels of experience while assessing EFL essays of distinct qualities. The data were collected from 28 raters with varying levels of rating experience and working at the English language departments of different universities in Turkey. Using a 10-point analytic rubric, each…
Descriptors: Decision Making, Essays, Writing Evaluation, Evaluators
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
Han, Chao – Language Testing, 2019
Summative assessment of interpretation is widely conducted in interpreting courses/programs to inform high-stakes decision making, such as the selection, certification, and conferral of academic degrees. Yet there has been very limited empirical research to investigate the score dependability of summative interpretation assessment. The present…
Descriptors: Generalization, Decision Making, Summative Evaluation, Evaluators
The Impact of Pre-Task Planning on Speaking Test Performance for English-Medium University Admission
O'Grady, Stefan – Language Testing, 2019
This study investigated the impact of different lengths of pre-task planning time on performance in a test of second language speaking ability for university admission. In the study, 47 Turkish-speaking learners of English took a test of English language speaking ability. The participants were divided into two groups according to their language…
Descriptors: Language of Instruction, English (Second Language), Second Language Learning, Student Placement
Trace, Jonathan; Janssen, Gerriet; Meier, Valerie – Language Testing, 2017
Previous research in second language writing has shown that when scoring performance assessments even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require…
Descriptors: Performance Based Assessment, Second Language Learning, Scoring, Evaluators
Davis, Larry – Language Testing, 2016
Two factors were investigated that are thought to contribute to consistency in rater scoring judgments: rater training and experience in scoring. Also considered were the relative effects of scoring rubrics and exemplars on rater performance. Experienced teachers of English (N = 20) scored recorded responses from the TOEFL iBT speaking test prior…
Descriptors: Evaluators, Oral Language, Scores, Language Tests
Attali, Yigal – Language Testing, 2016
A short training program for evaluating responses to an essay writing task consisted of scoring 20 training essays with immediate feedback about the correct score. The same scoring session also served as a certification test for trainees. Participants with little or no previous rating experience completed this session and 14 trainees who passed an…
Descriptors: Writing Evaluation, Writing Tests, Standardized Tests, Evaluators
Bouwer, Renske; Béguin, Anton; Sanders, Ted; van den Bergh, Huub – Language Testing, 2015
In the present study, aspects of the measurement of writing are disentangled in order to investigate the validity of inferences made on the basis of writing performance and to describe implications for the assessment of writing. To include genre as a facet in the measurement, we obtained writing scores of 12 texts in four different genres for each…
Descriptors: Writing Tests, Generalization, Scores, Writing Instruction
Previous Page | Next Page »
Pages: 1 | 2