Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 2 |
Since 2006 (last 20 years) | 5 |
Source
Language Testing | 5 |
Author
Deygers, Bart | 1 |
Hartig, Johannes | 1 |
He, Lianzhen | 1 |
Iasonas Lamprianou | 1 |
Klinger, Thorsten | 1 |
Min, Shangchao | 1 |
Naumann, Alexander | 1 |
Reeta Neittaanmäki | 1 |
Schnoor, Birger | 1 |
Suzuki, Yuichi | 1 |
Usanova, Irina | 1 |
Publication Type
Journal Articles | 5 |
Reports - Research | 5 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 2 |
Postsecondary Education | 2 |
Secondary Education | 1 |
Location
China | 1 |
Finland | 1 |
Germany | 1 |
Japan | 1 |
Netherlands | 1 |
Neittaanmäki, Reeta; Lamprianou, Iasonas – Language Testing, 2024
This article focuses on rater severity and consistency and their relation to different types of rater experience over a long period of time. The article is based on longitudinal data collected from 2009 to 2019 from the second language Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. The study investigated…
Descriptors: Foreign Countries, Interrater Reliability, Error of Measurement, Experience
Schnoor, Birger; Hartig, Johannes; Klinger, Thorsten; Naumann, Alexander; Usanova, Irina – Language Testing, 2023
Research on assessing English as a foreign language (EFL) development has been growing recently. However, empirical evidence from longitudinal analyses based on substantial samples is still needed. In such settings, tests for measuring language development must meet high standards of test quality such as validity, reliability, and objectivity, as…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Longitudinal Studies
Deygers, Bart; Van Gorp, Koen – Language Testing, 2015
Considering scoring validity as encompassing both reliable rating scale use and valid descriptor interpretation, this study reports on the validation of a CEFR-based scale that was co-constructed and used by novice raters. The research questions this paper wishes to answer are (a) whether it is possible to construct a CEFR-based rating scale with…
Descriptors: Rating Scales, Scoring, Validity, Interrater Reliability
Min, Shangchao; He, Lianzhen – Language Testing, 2014
This study examined the relative effectiveness of the multidimensional bi-factor model and multidimensional testlet response theory (TRT) model in accommodating local dependence in testlet-based reading assessment with both dichotomously and polytomously scored items. The data used were 14,089 test-takers' item-level responses to the testlet-based…
Descriptors: Foreign Countries, Item Response Theory, Reading Tests, Test Items
Suzuki, Yuichi – Language Testing, 2015
Self-assessment has been used to assess second language proficiency; however, as sources of measurement errors vary, they may threaten the validity and reliability of the tools. The present paper investigated the role of experiences in using Japanese as a second language in the naturalistic acquisition context on the accuracy of the…
Descriptors: Self Evaluation (Individuals), Error of Measurement, Japanese, Second Language Learning