Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
Hojung Kim; Changkyung Song; Jiyoung Kim; Hyeyun Jeong; Jisoo Park – Language Testing in Asia, 2024
This study presents a modified version of the Korean Elicited Imitation (EI) test, designed to resemble natural spoken language, and validates its reliability as a measure of proficiency. The study assesses the correlation between average test scores and Test of Proficiency in Korean (TOPIK) levels, examining score distributions among beginner,…
Descriptors: Korean, Test Validity, Test Reliability, Imitation
Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025
This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…
Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction
Rafatbakhsh, Elaheh; Ahmadi, Alireza – Practical Assessment, Research & Evaluation, 2022
The purpose of this study was to investigate the validity of the vocabulary subsection of a high-stakes university entrance exam for Ph.D. programs using the argument-based approach. All three versions of the test administered over a period of five years and the responses of 12,500 test-takers were studied. The study focused on four…
Descriptors: Vocabulary, College Entrance Examinations, Doctoral Programs, Test Validity
Dhyaaldian, Safa Mohammed Abdulridah; Kadhim, Qasim Khlaif; Mutlak, Dhameer A.; Neamah, Nour Raheem; Kareem, Zaidoon Hussein; Hamad, Doaa A.; Tuama, Jassim Hassan; Qasim, Mohammed Saad – International Journal of Language Testing, 2022
A C-Test is a gap-filling test for measuring language competence in the first and second language. C-Tests are usually analyzed with polytomous Rasch models by considering each passage as a super-item or testlet. This strategy helps overcome the local dependence inherent in C-Test gaps. However, there is little research on the best polytomous…
Descriptors: Item Response Theory, Cloze Procedure, Reading Tests, Language Tests
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Budi Waluyo; Ali Zahabi; Luksika Ruangsung – rEFLections, 2024
The increasing popularity of the Common European Framework of Reference (CEFR) in non-native English-speaking countries has generated a demand for concrete examples in the creation of CEFR-based tests that assess the four main English skills. In response, this research endeavors to provide insight into the development and validation of a…
Descriptors: Language Tests, Language Proficiency, Undergraduate Students, Language Skills
Cheewasukthaworn, Kanchana – PASAA: Journal of Language Teaching and Learning in Thailand, 2022
In 2016, the Office of the Higher Education Commission issued a directive requiring all higher education institutions in Thailand to have their students take a standardized English proficiency test. According to the directive, the test's results had to align with the Common European Framework of Reference for Languages (CEFR). In response to this…
Descriptors: Test Construction, Standardized Tests, Language Tests, English (Second Language)
Ji-young Shin – ProQuest LLC, 2021
The present dissertation investigated the impact of scales/scoring methods and prompt linguistic features on the measurement quality of L2 English elicited imitation (EI). Scales/scoring methods are an important feature for the validity and reliability of L2 EI tests, but less is known about them (Yan et al., 2016). Prompt linguistic features are also known…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Semantics
Sridhanyarat, Kietnawin; Pathong, Supakarn; Suranakkharin, Todsapon; Ammaralikit, Amornrat – English Language Teaching, 2021
This study aimed at developing the Silpakorn Test of English Proficiency (STEP), in alignment with the Common European Framework of Reference for Languages (CEFR), and in accordance with the theoretical framework established by Alderson et al. (2006). Four major steps were involved in the test construction. First, English language lecturers who…
Descriptors: Language Tests, Language Proficiency, Second Language Learning, Second Language Instruction
Adding Value to Second-Language Listening and Reading Subscores: Using a Score Augmentation Approach
Papageorgiou, Spiros; Choi, Ikkyu – International Journal of Testing, 2018
This study examined whether reporting subscores for groups of items within a test section assessing a second-language modality (specifically reading or listening comprehension) added value from a measurement perspective to the information already provided by the section scores. We analyzed the responses of 116,489 test takers to reading and…
Descriptors: Second Language Learning, Second Language Instruction, English (Second Language), Language Tests
Aviad-Levitzky, Tami; Laufer, Batia; Goldstein, Zahava – Language Assessment Quarterly, 2019
This article describes the development and validation of the new CATSS (Computer Adaptive Test of Size and Strength), which measures vocabulary knowledge in four modalities -- productive recall, receptive recall, productive recognition, and receptive recognition. In the first part of the paper we present the assumptions that underlie the test --…
Descriptors: Foreign Countries, Test Construction, Test Validity, Test Reliability
Roche, Thomas; Harrington, Michael – Journal of Further and Higher Education, 2018
English language programmes provide established pathways for international students seeking university admission in countries such as Australia and the United Kingdom. In order to refer international applicants to appropriate levels and durations of English language support prior to matriculation into their main course of study, pathway providers…
Descriptors: Student Placement, College Admission, College Students, Foreign Students
Teker, Gulsen Tasdelen; Dogan, Nuri – Educational Sciences: Theory and Practice, 2015
Reliability and differential item functioning (DIF) analyses were conducted on testlets displaying local item dependence in this study. The data set employed in the research was obtained from the answers given by 1,500 students to the 20 items included in six testlets given in English Proficiency Exam by the School of Foreign Languages of a state…
Descriptors: Foreign Countries, Test Items, Test Bias, Item Response Theory
Chapelle, Carol A.; Chung, Yoo-Ree; Hegelheimer, Volker; Pendar, Nick; Xu, Jing – Language Testing, 2010
This study piloted test items that will be used in a computer-delivered and scored test of productive grammatical ability in English as a second language (ESL). Findings from research on learners' development of morphosyntactic, syntactic, and functional knowledge were synthesized to create a framework of grammatical features. We outline the…
Descriptors: Test Items, Grammar, Developmental Stages, Computer Assisted Testing