Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 2 |
| Since 2017 (last 10 years) | 4 |
| Since 2007 (last 20 years) | 6 |
Descriptor
| Error of Measurement | 6 |
| Language Tests | 4 |
| Foreign Countries | 3 |
| Interrater Reliability | 3 |
| Reliability | 3 |
| Data Analysis | 2 |
| English (Second Language) | 2 |
| Evaluators | 2 |
| Generalizability Theory | 2 |
| Language Skills | 2 |
| Scores | 2 |
Source
| Language Testing | 6 |
Author
| Lin, Chih-Kai | 2 |
| Deygers, Bart | 1 |
| Hartig, Johannes | 1 |
| Lamprianou, Iasonas | 1 |
| Klinger, Thorsten | 1 |
| Li, Minzi | 1 |
| Naumann, Alexander | 1 |
| Neittaanmäki, Reeta | 1 |
| Schnoor, Birger | 1 |
| Usanova, Irina | 1 |
| Van Gorp, Koen | 1 |
Publication Type
| Journal Articles | 6 |
| Reports - Research | 6 |
Education Level
| Elementary Secondary Education | 1 |
| Higher Education | 1 |
| Postsecondary Education | 1 |
| Secondary Education | 1 |
Location
| Finland | 1 |
| Germany | 1 |
| Netherlands | 1 |
Li, Minzi; Zhang, Xian – Language Testing, 2021
This meta-analysis explores the correlation between self-assessment (SA) and language performance. Sixty-seven studies with 97 independent samples involving more than 68,500 participants were included in our analysis. It was found that the overall correlation between SA and language performance was 0.466 (p < 0.01). Moderator analysis was…
Descriptors: Meta Analysis, Self Evaluation (Individuals), Likert Scales, Research Reports
Neittaanmäki, Reeta; Lamprianou, Iasonas – Language Testing, 2024
This article focuses on rater severity and consistency and their relation to different types of rater experience over a long period of time. The article is based on longitudinal data collected from 2009 to 2019 from the second language Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. The study investigated…
Descriptors: Foreign Countries, Interrater Reliability, Error of Measurement, Experience
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
Schnoor, Birger; Hartig, Johannes; Klinger, Thorsten; Naumann, Alexander; Usanova, Irina – Language Testing, 2023
Research on assessing English as a foreign language (EFL) development has been growing recently. However, empirical evidence from longitudinal analyses based on substantial samples is still needed. In such settings, tests for measuring language development must meet high standards of test quality such as validity, reliability, and objectivity, as…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Longitudinal Studies
Lin, Chih-Kai; Zhang, Jinming – Language Testing, 2014
Research on the relationship between English language proficiency standards and academic content standards serves to provide information about the extent to which English language learners (ELLs) are expected to encounter academic language use that facilitates their content learning, such as in mathematics and science. Standards-to-standards…
Descriptors: Language Proficiency, Academic Standards, Generalizability Theory, English Language Learners
Deygers, Bart; Van Gorp, Koen – Language Testing, 2015
Considering scoring validity as encompassing both reliable rating scale use and valid descriptor interpretation, this study reports on the validation of a CEFR-based scale that was co-constructed and used by novice raters. The research questions this paper wishes to answer are (a) whether it is possible to construct a CEFR-based rating scale with…
Descriptors: Rating Scales, Scoring, Validity, Interrater Reliability

Peer reviewed
