Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 3 |
Since 2016 (last 10 years) | 7 |
Since 2006 (last 20 years) | 11 |
Descriptor
Error of Measurement | 11 |
Foreign Countries | 11 |
Language Tests | 11 |
English (Second Language) | 7 |
Second Language Learning | 7 |
Comparative Analysis | 5 |
Item Response Theory | 4 |
Scores | 4 |
Second Language Instruction | 4 |
Statistical Analysis | 4 |
Validity | 4 |
Author
Afghari, Akbar | 1 |
Beglar, David | 1 |
Deygers, Bart | 1 |
Ghafournia, Narjes | 1 |
Han, Chao | 1 |
Hartig, Johannes | 1 |
Holster, Trevor A. | 1 |
Hsieh, Mingchuan | 1 |
Karakaya, Ismail | 1 |
Klinger, Thorsten | 1 |
Kramer, Brandon | 1 |
Publication Type
Journal Articles | 11 |
Reports - Research | 11 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 7 |
Postsecondary Education | 7 |
Elementary Education | 1 |
Grade 6 | 1 |
Intermediate Grades | 1 |
Secondary Education | 1 |
Location
Iran | 2 |
Japan | 2 |
Turkey | 2 |
China (Beijing) | 1 |
Europe | 1 |
Germany | 1 |
Netherlands | 1 |
Taiwan | 1 |
Thailand | 1 |
Assessments and Surveys
Test of English as a Foreign… | 2 |
Test of English for… | 2 |
International English… | 1 |
Polat, Murat – International Online Journal of Education and Teaching, 2022
Foreign language testing is a multi-dimensional phenomenon and obtaining objective and error-free scores on learners' language skills is often problematic. While assessing foreign language performance on high-stakes tests, using different testing approaches including Classical Test Theory (CTT), Generalizability Theory (GT) and/or Item Response…
Descriptors: Second Language Learning, Second Language Instruction, Item Response Theory, Language Tests
Investigating the Impact of Rater Training on Rater Errors in the Process of Assessing Writing Skill
Sata, Mehmet; Karakaya, Ismail – International Journal of Assessment Tools in Education, 2022
In the process of measuring and assessing high-level cognitive skills, interference of rater errors in measurements brings about a constant concern and low objectivity. The main purpose of this study was to investigate the impact of rater training on rater errors in the process of assessing individual performance. The study was conducted with a…
Descriptors: Evaluators, Training, Comparative Analysis, Academic Language
Schnoor, Birger; Hartig, Johannes; Klinger, Thorsten; Naumann, Alexander; Usanova, Irina – Language Testing, 2023
Research on assessing English as a foreign language (EFL) development has been growing recently. However, empirical evidence from longitudinal analyses based on substantial samples is still needed. In such settings, tests for measuring language development must meet high standards of test quality such as validity, reliability, and objectivity, as…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Longitudinal Studies
Wudthayagorn, Jirada – LEARN Journal: Language Education and Acquisition Research Network, 2018
The purpose of this study was to map the Chulalongkorn University Test of English Proficiency, or the CU-TEP, to the Common European Framework of Reference (CEFR) by employing a standard setting methodology. Thirteen experts judged 120 items of the CU-TEP using the Yes/No Angoff technique. The experts decided whether or not a borderline student at…
Descriptors: Guidelines, Rating Scales, English (Second Language), Language Tests
Han, Chao – Language Assessment Quarterly, 2016
As a property of test scores, reliability/dependability constitutes an important psychometric consideration, and it underpins the validity of measurement results. A review of interpreter certification performance tests (ICPTs) reveals that (a) although reliability/dependability checking has been recognized as an important concern, its theoretical…
Descriptors: Foreign Countries, Scores, English, Chinese
Deygers, Bart; Van Gorp, Koen – Language Testing, 2015
Considering scoring validity as encompassing both reliable rating scale use and valid descriptor interpretation, this study reports on the validation of a CEFR-based scale that was co-constructed and used by novice raters. The research questions this paper wishes to answer are (a) whether it is possible to construct a CEFR-based rating scale with…
Descriptors: Rating Scales, Scoring, Validity, Interrater Reliability
Holster, Trevor A.; Lake, J. – Language Assessment Quarterly, 2016
Stewart questioned Beglar's use of Rasch analysis of the Vocabulary Size Test (VST) and advocated the use of 3-parameter logistic item response theory (3PLIRT) on the basis that it models a non-zero lower asymptote for items, often called a "guessing" parameter. In support of this theory, Stewart presented fit statistics derived from…
Descriptors: Guessing (Tests), Item Response Theory, Vocabulary, Language Tests
Rastegar, Behnaz; Safari, Fatemeh – International Journal of Education and Literacy Studies, 2017
Language learners' productive role in teaching and learning processes has recently been the focus of attention. Therefore, this study aimed at investigating the effect of oral vs. written output-based instruction on English as a foreign language (EFL) learners' vocabulary learning with a focus on reflective vs. impulsive learning styles. To this…
Descriptors: Cognitive Style, English (Second Language), Second Language Learning, Foreign Countries
McLean, Stuart; Kramer, Brandon; Beglar, David – Language Teaching Research, 2015
An important gap in the field of second language vocabulary assessment concerns the lack of validated tests measuring aural vocabulary knowledge. The primary purpose of this study is to introduce and provide preliminary validity evidence for the Listening Vocabulary Levels Test (LVLT), which has been designed as a diagnostic tool to measure…
Descriptors: Test Construction, Test Validity, English (Second Language), Second Language Learning
Hsieh, Mingchuan – Language Assessment Quarterly, 2013
The Yes/No Angoff and Bookmark methods for setting standards on educational assessments are currently two of the most popular standard-setting methods. However, there is no research into the comparability of these two methods in the context of language assessment. This study compared results from the Yes/No Angoff and Bookmark methods as applied to…
Descriptors: Standard Setting (Scoring), Comparative Analysis, Language Tests, Multiple Choice Tests
Ghafournia, Narjes; Afghari, Akbar – English Language Teaching, 2013
The study scrutinized the probable interaction between using cognitive test-taking strategies, reading proficiency, and reading comprehension test performance of Iranian postgraduate students, who studied English as a foreign language. The study also probed the extent to which the participants' test performance was related to the use of certain…
Descriptors: Foreign Countries, Reading Comprehension, Reading Tests, English (Second Language)