Showing 1 to 15 of 404 results
Peer reviewed
Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods
Peer reviewed
Full-text PDF on ERIC
Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
Peer reviewed
Anja Riemenschneider; Zarah Weiss; Pauline Schröter; Detmar Meurers – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2024
The linguistic characteristics of text productions depend on various factors, including individual language proficiency as well as the tasks used to elicit the production. To date, little attention has been paid to whether some writing tasks are more suitable than others to represent and differentiate students' proficiency levels. This issue is…
Descriptors: English (Second Language), Writing (Composition), Difficulty Level, Language Proficiency
Peer reviewed
Apichat Khamboonruang – Language Testing in Asia, 2025
The Chulalongkorn University Language Institute (CULI) test was developed as a local standardised test of English for professional and international communication. To ensure that the CULI test fulfils its intended purposes, this study employed Kane's argument-based validation and Rasch measurement approaches to construct the validity argument for the…
Descriptors: Universities, Second Language Learning, Second Language Instruction, Language Tests
Peer reviewed
Ludewig, Ulrich; Schwerter, Jakob; McElvany, Nele – Journal of Psychoeducational Assessment, 2023
A better understanding of how distractor features influence the plausibility of distractors is essential for an efficient multiple-choice (MC) item construction in educational assessment. The plausibility of distractors has a major influence on the psychometric characteristics of MC items. Our analysis utilizes the nominal categories model to…
Descriptors: Vocabulary, Language Tests, German, Grade 4
Peer reviewed
Yu, Qiaona – Applied Linguistics, 2021
Language complexity reveals the ability to use a wide and varied range of sophisticated structures and vocabulary. Although different languages compose complexity differently, complexity measures such as the T-unit have typically been based on clause subordination, which may underrepresent complexity and threaten the validity of studies. This…
Descriptors: Chinese, Difficulty Level, Syntax, Language Proficiency
Peer reviewed
Full-text PDF on ERIC
Thirakunkovit, Suthathip; Rhee, Seongha – THAITESOL Journal, 2021
This study explores the extent to which the difficulty levels of grammar items in an English test can be predicted by the complexity of grammatical structures. The researchers carried out two sets of analyses. In the first analysis, the item facility and item discrimination indices of 175 multiple-choice items were examined. In the second…
Descriptors: Grammar, Test Items, Difficulty Level, English (Second Language)
Peer reviewed
De Cat, Cécile; Melia, Tara – Journal of Child Language, 2022
The Sentence Structure sub-test (SST) of the Clinical Evaluation of Language Fundamentals (CELF) aims to "measure the acquisition of grammatical (structural) rules at the sentence level". Although originally designed for clinical practice with monolingual children, components of the CELF, such as the SST, are often used to inform…
Descriptors: Sentence Structure, Language Tests, Reading Comprehension, Cognitive Processes
Peer reviewed
Full-text PDF on ERIC
Alan Shaw – PASAA: Journal of Language Teaching and Learning in Thailand, 2023
Although the TOEFL iBT Listening test is sometimes used for other purposes, it was designed primarily for use as a college entrance examination. Item difficulty in TOEFL iBT Listening tests is the product of interactions between two sets of complex relationships: 1) relationships among numerous item characteristics themselves, and 2) relationships…
Descriptors: English (Second Language), Second Language Instruction, Listening Skills, Language Tests
Peer reviewed
Yoshiki Fujiwara; Hiroyuki Shimada – Language Acquisition: A Journal of Developmental Linguistics, 2024
The goal of this paper is to tease apart two approaches to the source of children's consistent scope assignment in negative sentences containing logical connectives: the Semantic Subset Principle and the Semantic Subset Maxim. Previous developmental work has observed that four- to six-year-old children across languages have difficulty with…
Descriptors: Semantics, Language Acquisition, Form Classes (Languages), Morphemes
Peer reviewed
Full-text PDF on ERIC
Noboru Sakai – Journal of Educators Online, 2025
This study aims to investigate ChatGPT's ability to comprehend input from nonnative speakers, specifically those learning English as a second language, with Japanese speakers serving as the model population. The experiment examines how ChatGPT evaluates the difficulty levels of the Test of English for International Communication (TOEIC), which is…
Descriptors: Foreign Countries, Artificial Intelligence, Native Speakers, English (Second Language)
Peer reviewed
Kuo-Zheng Feng – Language Testing in Asia, 2024
This study addressed a gap in existing research on Multiple-Choice (MC) cloze tests by focusing on the learners' perspective, specifically examining the difficulties faced by vocational high school students (VHSs). A nationwide sample of 293 VHSs participated, providing both quantitative and qualitative data through a self-developed questionnaire.…
Descriptors: Language Tests, Multiple Choice Tests, Cloze Procedure, Student Attitudes
Peer reviewed
Full-text PDF on ERIC
Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025
The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…
Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction
Peer reviewed
Jin, Kuan-Yu; Eckes, Thomas – Measurement: Interdisciplinary Research and Perspectives, 2022
Recent research on rater effects in performance assessments has increasingly focused on rater centrality, the tendency to assign scores clustering around the rating scale's middle categories. In the present paper, we adopted Jin and Wang's (2018) extended facets modeling approach and constructed a centrality continuum, ranging from raters…
Descriptors: Performance Based Assessment, Evaluators, Scoring, Sample Size
Peer reviewed
Full-text PDF on ERIC
Ali Akbar Boori; Mohammad Ghazanfari; Behzad Ghonsooly; Purya Baghaei – International Journal of Language Testing, 2024
The purpose of this study was to compare the functioning of five restrictive CDMs, including DINA, DINO, A-CDM, LLM, and RRUM, against the G-DINA model to identify the best-fitting CDM which can better explain the interaction underlying the attributes of the reading comprehension section of an Iranian high-stakes language proficiency test. To this…
Descriptors: Foreign Countries, Doctoral Students, Reading Comprehension, Language Tests