Showing 1 to 15 of 42 results
Peer reviewed
Direct link
Jiawei Xiong; George Engelhard; Allan S. Cohen – Measurement: Interdisciplinary Research and Perspectives, 2025
Mixed-format data commonly result from the use of both multiple-choice (MC) and constructed-response (CR) questions on assessments. Dealing with these mixed response types involves understanding what the assessment is measuring and using suitable measurement models to estimate latent abilities. Past research in educational…
Descriptors: Responses, Test Items, Test Format, Grade 8
Peer reviewed
PDF on ERIC (download full text)
Alallo, Hajir Mahmood Ibrahim; Mohammed, Aisha; Hamid, Zayad Khalaf; Hassan, Aalaa Yaseen; Kadhim, Qasim Khlaif – International Journal of Language Testing, 2023
Diagnostic classification models (DCMs) have recently become very popular both for research purposes and for operational student assessment. A plethora of DCMs gives researchers and practitioners a wide range of options for student diagnosis and classification. One intriguing option that some DCMs offer is the possibility…
Descriptors: Language Tests, Diagnostic Tests, Classification, Clinical Diagnosis
Peer reviewed
Direct link
Alpizar, David; Li, Tongyun; Norris, John M.; Gu, Lixiong – Language Testing, 2023
The C-test is a type of gap-filling test designed to efficiently measure second language proficiency. The typical C-test consists of several short paragraphs with the second half of every second word deleted. The words with deleted parts are treated as items nested within the corresponding paragraph. Given this testlet structure, it is commonly…
Descriptors: Psychometrics, Language Tests, Second Language Learning, Test Items
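The deletion rule described in the Alpizar et al. abstract (removing the second half of every second word) is mechanical enough to sketch in code. The Python snippet below is only an illustration of that rule, not the authors' materials; the length of the intact lead-in and the handling of odd-length words are assumptions.

```python
def make_ctest_paragraph(text, lead_in_words=1):
    """Turn a short paragraph into a set of C-test gaps by deleting the
    second half of every second word after a short intact lead-in.

    Assumptions (not from the article): the first `lead_in_words` words
    stay intact, and for odd-length words the shorter first half is kept.
    """
    words = text.split()
    gapped = []
    for i, word in enumerate(words):
        if i >= lead_in_words and (i - lead_in_words) % 2 == 1:
            keep = len(word) // 2                      # keep the first half
            gapped.append(word[:keep] + "_" * (len(word) - keep))
        else:
            gapped.append(word)
    return " ".join(gapped)

print(make_ctest_paragraph("Testing researchers often build short paragraphs for practice"))
```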
Peer reviewed
PDF on ERIC (download full text)
Dhyaaldian, Safa Mohammed Abdulridah; Kadhim, Qasim Khlaif; Mutlak, Dhameer A.; Neamah, Nour Raheem; Kareem, Zaidoon Hussein; Hamad, Doaa A.; Tuama, Jassim Hassan; Qasim, Mohammed Saad – International Journal of Language Testing, 2022
A C-Test is a gap-filling test for measuring language competence in the first and second language. C-Tests are usually analyzed with polytomous Rasch models by considering each passage as a super-item or testlet. This strategy helps overcome the local dependence inherent in C-Test gaps. However, there is little research on the best polytomous…
Descriptors: Item Response Theory, Cloze Procedure, Reading Tests, Language Tests
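Treating each passage as a super-item, as the abstract describes, amounts to summing the dichotomous gap scores within a passage before fitting a polytomous Rasch model. A minimal sketch of that aggregation step with invented dimensions (the Rasch estimation itself is not shown):

```python
import numpy as np

# Invented data: 200 examinees by 20 dichotomous C-test gaps (0/1),
# laid out as 4 passages with 5 gaps each.
rng = np.random.default_rng(0)
gap_scores = rng.integers(0, 2, size=(200, 20))

n_passages, gaps_per_passage = 4, 5
# Summing the gaps within a passage yields polytomous super-item scores
# (0-5 per passage); these absorb the local dependence among gaps and
# can then be fed to a partial-credit or rating-scale Rasch model.
superitem_scores = (
    gap_scores.reshape(-1, n_passages, gaps_per_passage).sum(axis=2)
)
print(superitem_scores.shape)  # (200, 4)
```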
Peer reviewed
Direct link
Chung, Seungwon; Cai, Li – Journal of Educational and Behavioral Statistics, 2021
In the research reported here, we propose a new method for scale alignment and test scoring in the context of supporting students with disabilities. In educational assessment, students from these special populations take modified tests because of a demonstrated disability that requires more assistance than standard testing accommodation. Updated…
Descriptors: Students with Disabilities, Scoring, Achievement Tests, Test Items
Peer reviewed
PDF on ERIC (download full text)
Ketabi, Somaye; Alavi, Seyyed Mohammed; Ravand, Hamdollah – International Journal of Language Testing, 2021
Although Diagnostic Classification Models (DCMs) were introduced to the education system decades ago, it seems that these models have not been employed for the original aims for which they were designed. DCMs have mostly been used to analyze large-scale non-diagnostic tests, and these models have rarely been used in developing Cognitive…
Descriptors: Diagnostic Tests, Test Construction, Goodness of Fit, Classification
Peer reviewed
Direct link
Shafipoor, Mahdieh; Ravand, Hamdollah; Maftoon, Parviz – Language Testing in Asia, 2021
The current study compared the model fit indices, skill mastery probabilities, and classification accuracy of six Diagnostic Classification Models (DCMs): a general model (G-DINA) against five specific models (LLM, RRUM, ACDM, DINA, and DINO). To do so, the response data to the grammar and vocabulary sections of a General English Achievement Test,…
Descriptors: Goodness of Fit, Models, Classification, Grammar
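For readers unfamiliar with the specific models named above, the DINA model is the simplest of them: each item has a slip and a guess parameter, and a respondent is expected to answer correctly only when every attribute required by the item's Q-matrix row is mastered. A minimal numeric sketch of that response function (the parameter values and Q-matrix entries are invented for illustration and are not taken from the study):

```python
import numpy as np

def dina_prob(alpha, q, slip, guess):
    """P(correct answer) under the DINA model.

    alpha: examinee's attribute mastery pattern (0/1 vector)
    q:     the item's Q-matrix row (0/1 vector of required attributes)
    eta is 1 only when every required attribute is mastered; the examinee
    then answers correctly with probability 1 - slip, and otherwise
    succeeds only with the guessing probability.
    """
    eta = int(np.all(alpha[q == 1] == 1))
    return (1 - slip) ** eta * guess ** (1 - eta)

alpha = np.array([1, 0, 1])   # invented: masters attributes 1 and 3
q = np.array([1, 0, 1])       # invented: item requires attributes 1 and 3
print(dina_prob(alpha, q, slip=0.10, guess=0.20))   # 0.9
```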
Peer reviewed
PDF on ERIC (download full text)
Tatarinova, Galiya; Neamah, Nour Raheem; Mohammed, Aisha; Hassan, Aalaa Yaseen; Obaid, Ali Abdulridha; Ismail, Ismail Abdulwahhab; Maabreh, Hatem Ghaleb; Afif, Al Khateeb Nashaat Sultan; Viktorovna, Shvedova Irina – International Journal of Language Testing, 2023
Unidimensionality is an important assumption of measurement, but it is frequently violated. Most of the time, tests are deliberately constructed to be multidimensional to cover all aspects of the intended construct. In such situations, the application of unidimensional item response theory (IRT) models is not justified due to poor model fit and…
Descriptors: Item Response Theory, Test Items, Language Tests, Correlation
Peer reviewed
PDF on ERIC (download full text)
Ehara, Yo – International Educational Data Mining Society, 2022
Language learners are underserved when a word they think they have already learned has meanings they have not yet learned. For example, "circle" as a noun is well known, whereas its use as a verb is not. For artificial-intelligence-based support systems for learning vocabulary, assessing each learner's knowledge of such atypical but common…
Descriptors: Language Tests, Vocabulary Development, Second Language Learning, Second Language Instruction
Peer reviewed
Direct link
Min, Shangchao; Cai, Hongwen; He, Lianzhen – Language Assessment Quarterly, 2022
The present study examined the performance of the bi-factor multidimensional item response theory (MIRT) model and higher-order (HO) cognitive diagnostic models (CDM) in providing diagnostic information and general ability estimation simultaneously in a listening test. The data used were 1,611 examinees' item-level responses to an in-house EFL…
Descriptors: Listening Comprehension Tests, English (Second Language), Second Language Learning, Foreign Countries
Peer reviewed
Direct link
Geramipour, Masoud – Language Testing in Asia, 2021
Rasch testlet and bifactor models are two measurement models that can deal with local item dependency (LID) in assessing the dimensionality of reading comprehension testlets. This study aimed to apply these measurement models to real item response data from Iranian EFL reading comprehension tests and compare the validity of the bifactor models…
Descriptors: Foreign Countries, Second Language Learning, English (Second Language), Reading Tests
Peer reviewed
PDF on ERIC (download full text)
Panahi, Ali; Mohebbi, Hassan – Language Teaching Research Quarterly, 2022
High-stakes tests, such as IELTS, are designed to select individuals for decision-making purposes (Fulcher, 2013b). Hence, there is a slow-growing stream of research investigating the subskills of IELTS listening and, in feedback terms, its effects on individuals and educational programs. Here, cognitive diagnostic assessment (CDA) performs it…
Descriptors: Decision Making, Listening Comprehension Tests, Multiple Choice Tests, Diagnostic Tests
Peer reviewed
Direct link
Hashimoto, Brett James – Language Assessment Quarterly, 2021
Modern vocabulary size tests are generally based on the notion that the more frequent a word is in a language, the more likely a learner is to know that word. However, this assumption has seldom been questioned in the literature concerning vocabulary size tests. Using the Vocabulary of American-English Size Test (VAST) based on the Corpus of…
Descriptors: Word Frequency, Vocabulary Development, Second Language Learning, Second Language Instruction
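The frequency assumption questioned in the Hashimoto abstract is also what lets a size test extrapolate from a sample of tested words to an overall estimate: the proportion correct within each frequency band is multiplied by the band size, and the products are summed. A minimal sketch of that extrapolation with invented numbers (the VAST's actual sampling design and bands are not reproduced here):

```python
# Invented setup: words sampled from five 1,000-word frequency bands;
# each value is the learner's proportion correct within that band.
band_size = 1000
proportion_known = [0.95, 0.85, 0.60, 0.35, 0.10]

# Extrapolate by assuming the sampled words represent their whole band.
estimated_size = sum(p * band_size for p in proportion_known)
print(f"Estimated vocabulary size: {estimated_size:.0f} words")  # 2850
```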
Peer reviewed
PDF on ERIC (download full text)
Afsharrad, Mohammad; Pishghadam, Reza; Baghaei, Purya – International Journal of Language Testing, 2023
Testing organizations face increasing demand to provide subscores in addition to the total test score. However, psychometricians argue that most subscores do not have enough added value to be worth reporting. To have added value, subscores need to meet a number of criteria: they should be reliable, distinctive, and distinct from each other and…
Descriptors: Comparative Analysis, Scores, Value Added Models, Psychometrics
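Two of the added-value criteria mentioned in the Afsharrad et al. abstract, reliability and distinctness from other subscores, can be checked with very little code. The sketch below uses invented random data purely to show the computations; it is not the analysis in the article, and a formal added-value evaluation (for example, Haberman's method) is not shown.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for an examinee-by-item matrix of scores."""
    item_scores = np.asarray(item_scores, dtype=float)
    k = item_scores.shape[1]
    item_var = item_scores.var(axis=0, ddof=1).sum()
    total_var = item_scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Invented data: two 10-item subscales answered by 300 examinees; with
# random responses the statistics are meaningless and only show usage.
rng = np.random.default_rng(1)
grammar = rng.integers(0, 2, size=(300, 10))
vocabulary = rng.integers(0, 2, size=(300, 10))

print(cronbach_alpha(grammar))                                 # subscore reliability
print(np.corrcoef(grammar.sum(1), vocabulary.sum(1))[0, 1])    # distinctness check
```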
Al-Jarf, Reima – Online Submission, 2023
This article aims to give a comprehensive guide to planning and designing vocabulary tests, which includes identifying the skills to be covered by the test; outlining the course content covered; preparing a table of specifications that shows the skills, content topics, and number of questions allocated to each; and preparing the test instructions. The…
Descriptors: Vocabulary Development, Learning Processes, Test Construction, Course Content