Publication Date
In 2025 | 2 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 6 |
Since 2016 (last 10 years) | 12 |
Since 2006 (last 20 years) | 18 |
Source
Language Testing | 19 |
Author
Attali, Yigal | 1 |
Audeoud, Mireille | 1 |
August, Diane | 1 |
Batty, Aaron Olaf | 1 |
Brunfaut, Tineke | 1 |
Carey, Michael D. | 1 |
Carlo, Maria | 1 |
Chan, Stephanie W. Y. | 1 |
Chapelle, Carol A. | 1 |
Cheung, Wai Ming | 1 |
Chung, Yoo-Ree | 1 |
Publication Type
Journal Articles | 19 |
Reports - Research | 17 |
Reports - Descriptive | 1 |
Reports - Evaluative | 1 |
Education Level
Secondary Education | 4 |
Elementary Education | 3 |
Higher Education | 2 |
Postsecondary Education | 2 |
Early Childhood Education | 1 |
High Schools | 1 |
Junior High Schools | 1 |
Kindergarten | 1 |
Middle Schools | 1 |
Primary Education | 1 |
Assessments and Surveys
Graduate Record Examinations | 1 |
Peabody Picture Vocabulary… | 1 |
Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025
Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…
Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction
Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025
This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…
Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction
Lestari, Santi B.; Brunfaut, Tineke – Language Testing, 2023
Assessing integrated reading-into-writing task performances is known to be challenging, and analytic rating scales have been found to better facilitate the scoring of these performances than other common types of rating scales. However, little is known about how specific operationalizations of the reading-into-writing construct in analytic rating…
Descriptors: Reading Writing Relationship, Writing Tests, Rating Scales, Writing Processes
Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021
Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Latifi, Syed; Gierl, Mark – Language Testing, 2021
An automated essay scoring (AES) program is a software system that uses techniques from corpus and computational linguistics and machine learning to grade essays. In this study, we aimed to describe and evaluate particular language features of Coh-Metrix for a novel AES program that would score junior and senior high school students' essays from…
Descriptors: Writing Evaluation, Computer Assisted Testing, Scoring, Essays
Olson, Daniel J. – Language Testing, 2023
Measuring language dominance, broadly defined as the relative strength of each of a bilingual's two languages, remains a crucial methodological issue in bilingualism research. While various methods have been proposed, the Bilingual Language Profile (BLP) has been one of the most widely used tools for measuring language dominance. While previous…
Descriptors: Bilingualism, Language Dominance, Native Language, Second Language Learning
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
Wang, Zhen; Zechner, Klaus; Sun, Yu – Language Testing, 2018
As automated scoring systems for spoken responses are increasingly used in language assessments, testing organizations need to analyze their performance, as compared to human raters, across several dimensions, for example, on individual items or based on subgroups of test takers. In addition, there is a need in testing organizations to establish…
Descriptors: Automation, Scoring, Speech Tests, Language Tests
Haug, Tobias; Batty, Aaron Olaf; Venetz, Martin; Notter, Christa; Girard-Groeber, Simone; Knoch, Ute; Audeoud, Mireille – Language Testing, 2020
In this study we seek evidence of validity according to the socio-cognitive framework (Weir, 2005) for a new sentence repetition test (SRT) for young Deaf L1 Swiss German Sign Language (DSGS) users. SRTs have been developed for various purposes for both spoken and sign languages to assess language development in children. In order to address the…
Descriptors: Foreign Countries, Language Tests, Sentences, Repetition
Kleijn, Suzanne; Pander Maat, Henk; Sanders, Ted – Language Testing, 2019
Although there are many methods available for assessing text comprehension, the cloze test is not widely acknowledged as one of them. Critiques on cloze testing center on its supposedly limited ability to measure comprehension beyond the sentence. However, these critiques do not hold for all types of cloze tests; the particular configuration of a…
Descriptors: Cloze Procedure, Language Tests, Semantics, Scoring
Chan, Stephanie W. Y.; Cheung, Wai Ming; Huang, Yanli; Lam, Wai-Ip; Lin, Chin-Hsi – Language Testing, 2020
Demand for second-language (L2) Chinese education for kindergarteners has grown rapidly, but little is known about these kindergarteners' L2 skills, with existing studies focusing on school-age populations and alphabetic languages. Accordingly, we developed a six-subtest Chinese character acquisition assessment to measure L2 kindergarteners'…
Descriptors: Chinese, Second Language Learning, Second Language Instruction, Written Language
Attali, Yigal; Lewis, Will; Steier, Michael – Language Testing, 2013
Automated essay scoring can produce reliable scores that are highly correlated with human scores, but is limited in its evaluation of content and other higher-order aspects of writing. The increased use of automated essay scoring in high-stakes testing underscores the need for human scoring that is focused on higher-order aspects of writing. This…
Descriptors: Scoring, Essay Tests, Reliability, High Stakes Tests
Trace, Jonathan; Janssen, Gerriet; Meier, Valerie – Language Testing, 2017
Previous research in second language writing has shown that when scoring performance assessments even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require…
Descriptors: Performance Based Assessment, Second Language Learning, Scoring, Evaluators
Deygers, Bart; Van Gorp, Koen – Language Testing, 2015
Considering scoring validity as encompassing both reliable rating scale use and valid descriptor interpretation, this study reports on the validation of a CEFR-based scale that was co-constructed and used by novice raters. The research questions this paper wishes to answer are (a) whether it is possible to construct a CEFR-based rating scale with…
Descriptors: Rating Scales, Scoring, Validity, Interrater Reliability
Katzenberger, Irit; Meilijson, Sara – Language Testing, 2014
The Katzenberger Hebrew Language Assessment for Preschool Children (henceforth: the KHLA) is the first comprehensive, standardized language assessment tool developed in Hebrew specifically for older preschoolers (4;0-5;11 years). The KHLA is a norm-referenced, Hebrew specific assessment, based on well-established psycholinguistic principles, as…
Descriptors: Semitic Languages, Preschool Children, Language Impairments, Language Tests