Jiawei Xiong; George Engelhard; Allan S. Cohen – Measurement: Interdisciplinary Research and Perspectives, 2025
It is common to find mixed-format data resulting from the use of both multiple-choice (MC) and constructed-response (CR) questions on assessments. Dealing with these mixed response types involves understanding what the assessment is measuring, and the use of suitable measurement models to estimate latent abilities. Past research in educational…
Descriptors: Responses, Test Items, Test Format, Grade 8
Xiaoyan Zhang; Min Wang – Language Teaching Research, 2025
This study examines the effects of the continuation task and the model-as-feedback writing task (MAFW) on English as a foreign language (EFL) vocabulary learning. Three classes of intermediate-level Chinese EFL learners were randomly assigned to a continuation group, a MAFW group, and a control group. Three aspects of vocabulary knowledge --…
Descriptors: Task Analysis, Models, Feedback (Response), Second Language Learning
Liu, Ren; Liu, Haiyan; Shi, Dexin; Jiang, Zhehan – Educational and Psychological Measurement, 2022
Assessments with a large amount of small, similar, or often repetitive tasks are being used in educational, neurocognitive, and psychological contexts. For example, respondents are asked to recognize numbers or letters from a large pool of those and the number of correct answers is a count variable. In 1960, George Rasch developed the Rasch…
Descriptors: Classification, Models, Statistical Distributions, Scores
Alpizar, David; Li, Tongyun; Norris, John M.; Gu, Lixiong – Language Testing, 2023
The C-test is a type of gap-filling test designed to efficiently measure second language proficiency. The typical C-test consists of several short paragraphs with the second half of every second word deleted. The words with deleted parts are considered as items nested within the corresponding paragraph. Given this testlet structure, it is commonly…
Descriptors: Psychometrics, Language Tests, Second Language Learning, Test Items
Dhyaaldian, Safa Mohammed Abdulridah; Kadhim, Qasim Khlaif; Mutlak, Dhameer A.; Neamah, Nour Raheem; Kareem, Zaidoon Hussein; Hamad, Doaa A.; Tuama, Jassim Hassan; Qasim, Mohammed Saad – International Journal of Language Testing, 2022
A C-Test is a gap-filling test for measuring language competence in the first and second language. C-Tests are usually analyzed with polytomous Rasch models by considering each passage as a super-item or testlet. This strategy helps overcome the local dependence inherent in C-Test gaps. However, there is little research on the best polytomous…
Descriptors: Item Response Theory, Cloze Procedure, Reading Tests, Language Tests
Liu, Ren; Huggins-Manley, Anne Corinne; Bulut, Okan – Educational and Psychological Measurement, 2018
Developing a diagnostic tool within the diagnostic measurement framework is the optimal approach to obtain multidimensional and classification-based feedback on examinees. However, end users may seek to obtain diagnostic feedback from existing item responses to assessments that have been designed under either the classical test theory or item…
Descriptors: Models, Item Response Theory, Psychometrics, Test Construction
Forthmann, Boris; Grotjahn, Rüdiger; Doebler, Philipp; Baghaei, Purya – Journal of Psychoeducational Assessment, 2020
As measures of general language proficiency, C-tests are ubiquitous in language testing. Speeded C-tests are quite recent developments in the field and are deemed to be more discriminatory and provide more accurate diagnostic information than power C-tests especially with high-ability participants. Item response theory modeling of speeded C-tests…
Descriptors: Item Response Theory, Timed Tests, Language Tests, Goodness of Fit
Boxuan Ma; Sora Fukui; Yuji Ando; Shinichi Konomi – Journal of Educational Data Mining, 2024
Language proficiency diagnosis is essential to extract fine-grained information about the linguistic knowledge states and skill mastery levels of test takers based on their performance on language tests. Different from comprehensive standardized tests, many language learning apps often revolve around word-level questions. Therefore, knowledge…
Descriptors: Language Proficiency, Brain Hemisphere Functions, Language Processing, Task Analysis
Tatarinova, Galiya; Neamah, Nour Raheem; Mohammed, Aisha; Hassan, Aalaa Yaseen; Obaid, Ali Abdulridha; Ismail, Ismail Abdulwahhab; Maabreh, Hatem Ghaleb; Afif, Al Khateeb Nashaat Sultan; Viktorovna, Shvedova Irina – International Journal of Language Testing, 2023
Unidimensionality is an important assumption of measurement but it is violated very often. Most of the time, tests are deliberately constructed to be multidimensional to cover all aspects of the intended construct. In such situations, the application of unidimensional item response theory (IRT) models is not justified due to poor model fit and…
Descriptors: Item Response Theory, Test Items, Language Tests, Correlation
Ehara, Yo – International Educational Data Mining Society, 2022
Language learners are underserved if there are unlearned meanings of a word that they think they have already learned. For example, "circle" as a noun is well known, whereas its use as a verb is not. For artificial-intelligence-based support systems for learning vocabulary, assessing each learner's knowledge of such atypical but common…
Descriptors: Language Tests, Vocabulary Development, Second Language Learning, Second Language Instruction
Min, Shangchao; Cai, Hongwen; He, Lianzhen – Language Assessment Quarterly, 2022
The present study examined the performance of the bi-factor multidimensional item response theory (MIRT) model and higher-order (HO) cognitive diagnostic models (CDM) in providing diagnostic information and general ability estimation simultaneously in a listening test. The data used were 1,611 examinees' item-level responses to an in-house EFL…
Descriptors: Listening Comprehension Tests, English (Second Language), Second Language Learning, Foreign Countries
Geramipour, Masoud – Language Testing in Asia, 2021
Rasch testlet and bifactor models are two measurement models that could deal with local item dependency (LID) in assessing the dimensionality of reading comprehension testlets. This study aimed to apply the measurement models to real item response data of the Iranian EFL reading comprehension tests and compare the validity of the bifactor models…
Descriptors: Foreign Countries, Second Language Learning, English (Second Language), Reading Tests
Panahi, Ali; Mohebbi, Hassan – Language Teaching Research Quarterly, 2022
High stakes testing, such as IELTS, is designed to select individuals for decision-making purposes (Fulcher, 2013b). Hence, there is a slow-growing stream of research investigating the subskills of IELTS listening and, in feedback terms, its effects on individuals and educational programs. Here, cognitive diagnostic assessment (CDA) performs it…
Descriptors: Decision Making, Listening Comprehension Tests, Multiple Choice Tests, Diagnostic Tests
Chu, Wei; Pavlik, Philip I., Jr. – International Educational Data Mining Society, 2023
In adaptive learning systems, various models are employed to obtain the optimal learning schedule and review for a specific learner. Models of learning are used to estimate the learner's current recall probability by incorporating features or predictors proposed by psychological theory or empirically relevant to learners' performance. Logistic…
Descriptors: Reaction Time, Accuracy, Models, Predictor Variables
van Rijn, Peter W.; Ali, Usama S. – ETS Research Report Series, 2018
A computer program was developed to estimate speed-accuracy response models for dichotomous items. This report describes how the models are estimated and how to specify data and input files. An example using data from a listening section of an international language test is described to illustrate the modeling approach and features of the computer…
Descriptors: Computer Software, Computation, Reaction Time, Timed Tests

