Publication Date: In 2025 (1); Since 2024 (4); Since 2021, last 5 years (17); Since 2016, last 10 years (42); Since 2006, last 20 years (79)
Descriptor: Goodness of Fit (94); Models (94); Test Items (94); Item Response Theory (41); Foreign Countries (23); Statistical Analysis (21); Psychometrics (19); Item Analysis (17); Test Construction (16); Test Reliability (16); Factor Analysis (15)
Author: Ravand, Hamdollah (3); Sinharay, Sandip (3); Wang, Chun (3); Akbay, Lokman (2); Baghaei, Purya (2); Cai, Li (2); Champagne, Zachary M. (2); Chang, Hua-Hua (2); Farina, Kristy (2); He, Lianzhen (2); LaVenia, Mark (2)
Audience: Researchers (2); Practitioners (1); Students (1)
Location: China (4); Iran (3); South Korea (3); Turkey (3); Germany (2); United Kingdom (2); Argentina (1); Australia (1); Belgium (1); Canada (1); France (1)
Gorney, Kylie; Wollack, James A. – Journal of Educational Measurement, 2023
In order to detect a wide range of aberrant behaviors, it can be useful to incorporate information beyond the dichotomous item scores. In this paper, we extend the l_z and l*_z person-fit statistics so that unusual behavior in item scores and unusual behavior in item distractors can be used as indicators of aberrance. Through…
Descriptors: Test Items, Scores, Goodness of Fit, Statistics
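For orientation, the classical l_z statistic that this extension builds on is a standardized log-likelihood of the score pattern under a dichotomous IRT model. The sketch below assumes a 2PL model and does not show the distractor-based extension described in the abstract.

```python
import numpy as np

def lz_statistic(responses, theta, a, b):
    """Standardized log-likelihood person-fit statistic (l_z) under a 2PL model.

    responses : 0/1 item scores for one examinee
    theta     : the examinee's ability estimate
    a, b      : item discrimination and difficulty parameters
    """
    responses = np.asarray(responses, dtype=float)
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))   # P(correct) for each item
    q = 1.0 - p

    # Observed log-likelihood of the response pattern
    l0 = np.sum(responses * np.log(p) + (1.0 - responses) * np.log(q))
    # Expected value and variance of l0 under the model
    expected = np.sum(p * np.log(p) + q * np.log(q))
    variance = np.sum(p * q * np.log(p / q) ** 2)

    return (l0 - expected) / np.sqrt(variance)

# Example: a 5-item pattern; large negative values suggest aberrant responding
print(lz_statistic([1, 0, 1, 1, 0], theta=0.2,
                   a=np.array([1.0, 1.2, 0.8, 1.5, 1.1]),
                   b=np.array([-0.5, 0.0, 0.3, 1.0, -1.2])))
```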
Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025
Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies that the shared stimulus induces among the items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…
Descriptors: Models, Test Items, Educational Assessment, Scores
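As a point of reference, a common (non-directional) testlet parameterization adds a person-by-testlet effect to a 2PL item response function; the DTE studied above further lets earlier responses within a testlet enter the model, which is not shown in this sketch.

```latex
% Standard testlet model for item i in testlet d(i), examinee j:
% gamma_{jd(i)} is the person-specific testlet effect.
P(U_{ij} = 1 \mid \theta_j, \gamma_{jd(i)})
  = \frac{\exp\{a_i(\theta_j - b_i - \gamma_{jd(i)})\}}
         {1 + \exp\{a_i(\theta_j - b_i - \gamma_{jd(i)})\}}
```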
Stephanie M. Bell; R. Philip Chalmers; David B. Flora – Educational and Psychological Measurement, 2024
Coefficient omega indices are model-based composite reliability estimates that have become increasingly popular. A coefficient omega index estimates how reliably an observed composite score measures a target construct as represented by a factor in a factor-analysis model; as such, the accuracy of omega estimates is likely to depend on correct…
Descriptors: Influences, Models, Measurement Techniques, Reliability
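As a rough illustration of what a coefficient omega index computes, the sketch below assumes a fitted one-factor model with the factor variance fixed to 1 and loadings and residual variances supplied by the user; it is not the authors' procedure.

```python
import numpy as np

def coefficient_omega(loadings, residual_variances):
    """Coefficient omega for a unit-weighted composite under a one-factor
    (congeneric) model with the factor variance fixed to 1."""
    loadings = np.asarray(loadings, dtype=float)
    residual_variances = np.asarray(residual_variances, dtype=float)
    true_var = np.sum(loadings) ** 2          # variance due to the common factor
    error_var = np.sum(residual_variances)    # unique/error variance
    return true_var / (true_var + error_var)

# Example: four items with moderate loadings
print(coefficient_omega([0.7, 0.6, 0.8, 0.5], [0.51, 0.64, 0.36, 0.75]))
```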
Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024
Item response theory (IRT) models the relationship between the possible scores on a test item and a test taker's level of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…
Descriptors: Item Response Theory, Test Items, Models, Scoring
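For context on the parametric side of such comparisons, a minimal sketch of category probabilities under Samejima's graded response model, one common parametric model for polytomous items, is given below; the optimal scoring model itself is nonparametric and is not sketched here.

```python
import numpy as np

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model:
    P(X >= k) is a 2PL curve with threshold b_k, and category probabilities
    are differences of adjacent cumulative curves."""
    thresholds = np.asarray(thresholds, dtype=float)
    cum = 1.0 / (1.0 + np.exp(-a * (theta - thresholds)))   # P(X >= 1), ..., P(X >= K-1)
    cum = np.concatenate(([1.0], cum, [0.0]))               # add P(X >= 0) = 1 and P(X >= K) = 0
    return cum[:-1] - cum[1:]                               # P(X = 0), ..., P(X = K-1)

# Example: a 4-category item with ordered thresholds
print(grm_category_probs(theta=0.5, a=1.3, thresholds=[-1.0, 0.0, 1.2]))
```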
Emily A. Brown – ProQuest LLC, 2024
Previous research has been limited regarding the measurement of computational thinking, particularly as a learning progression in K-12. This study proposes to apply a multidimensional item response theory (IRT) model to a newly developed measure of computational thinking utilizing both selected-response and open-ended polytomous items to establish…
Descriptors: Models, Computation, Thinking Skills, Item Response Theory
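As background, one common form of a multidimensional IRT model for dichotomous items is the compensatory multidimensional 2PL shown below; the study's exact model, including its treatment of open-ended polytomous items, is not specified in the abstract.

```latex
% Compensatory multidimensional 2PL: a_i is the item's vector of
% discriminations, theta_j the examinee's latent trait vector, d_i an intercept.
P(U_{ij} = 1 \mid \boldsymbol{\theta}_j)
  = \frac{\exp(\mathbf{a}_i^{\top}\boldsymbol{\theta}_j + d_i)}
         {1 + \exp(\mathbf{a}_i^{\top}\boldsymbol{\theta}_j + d_i)}
```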
Carol Eckerly; Yue Jia; Paul Jewsbury – ETS Research Report Series, 2022
Testing programs have explored the use of technology-enhanced items alongside traditional item types (e.g., multiple-choice and constructed-response items) as measurement evidence of latent constructs modeled with item response theory (IRT). In this report, we discuss considerations in applying IRT models to a particular type of adaptive testlet…
Descriptors: Computer Assisted Testing, Test Items, Item Response Theory, Scoring
Haimiao Yuan – ProQuest LLC, 2022
The application of diagnostic classification models (DCMs) in the field of educational measurement has been receiving more attention in recent years. To make valid inferences from a model, it is important to ensure that the model fits the data. The purpose of the present study was to investigate the performance of the limited information…
Descriptors: Goodness of Fit, Educational Assessment, Educational Diagnosis, Models
Su, Shiyang; Wang, Chun; Weiss, David J. – Educational and Psychological Measurement, 2021
S-X² is a popular item fit index that is available in commercial software packages such as flexMIRT. However, no research has systematically examined the performance of S-X² for detecting item misfit within the context of the multidimensional graded response model (MGRM). The primary goal of this study was…
Descriptors: Statistics, Goodness of Fit, Test Items, Models
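A minimal sketch of the S-X² computation is given below, assuming that observed and model-implied proportions correct at each summed-score level have already been obtained; in practice the expected proportions come from the fitted IRT model via a score-distribution recursion, which is not shown here.

```python
import numpy as np

def s_x2(observed_prop, expected_prop, group_sizes):
    """S-X^2 item-fit statistic for a dichotomous item: groups are
    summed-score levels; observed_prop and expected_prop are the observed
    and model-implied proportions correct within each group."""
    o = np.asarray(observed_prop, dtype=float)
    e = np.asarray(expected_prop, dtype=float)
    n = np.asarray(group_sizes, dtype=float)
    return np.sum(n * (o - e) ** 2 / (e * (1.0 - e)))

# Example with five summed-score groups (toy numbers)
print(s_x2([0.20, 0.35, 0.55, 0.70, 0.90],
           [0.25, 0.40, 0.50, 0.68, 0.85],
           [30, 45, 60, 45, 30]))
```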
Lozano, José H.; Revuelta, Javier – Educational and Psychological Measurement, 2023
The present paper introduces a general multidimensional model to measure individual differences in learning within a single administration of a test. Learning is assumed to result from practicing the operations involved in solving the items. The model accounts for the possibility that the ability to learn may manifest differently for correct and…
Descriptors: Bayesian Statistics, Learning Processes, Test Items, Item Analysis
Dhyaaldian, Safa Mohammed Abdulridah; Kadhim, Qasim Khlaif; Mutlak, Dhameer A.; Neamah, Nour Raheem; Kareem, Zaidoon Hussein; Hamad, Doaa A.; Tuama, Jassim Hassan; Qasim, Mohammed Saad – International Journal of Language Testing, 2022
A C-Test is a gap-filling test for measuring language competence in a first or second language. C-Tests are usually analyzed with polytomous Rasch models by considering each passage as a super-item or testlet. This strategy helps overcome the local dependence inherent in C-Test gaps. However, there is little research on the best polytomous…
Descriptors: Item Response Theory, Cloze Procedure, Reading Tests, Language Tests
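For reference, the partial credit model is one of the polytomous Rasch models commonly applied to C-Test passages scored as super-items; a minimal sketch of its category probabilities, assuming known step difficulties, follows.

```python
import numpy as np

def pcm_category_probs(theta, deltas):
    """Category probabilities for a super-item (e.g., one C-Test passage
    scored 0..m) under the partial credit model; deltas are the m step
    difficulties."""
    deltas = np.asarray(deltas, dtype=float)
    # Cumulative sums of (theta - delta_j), with 0 for category 0
    steps = np.concatenate(([0.0], np.cumsum(theta - deltas)))
    probs = np.exp(steps - steps.max())       # stabilize before normalizing
    return probs / probs.sum()

# Example: a passage with 4 gaps, so scores 0..4 and 4 step difficulties
print(pcm_category_probs(theta=0.3, deltas=[-0.8, -0.2, 0.4, 1.1]))
```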
Ketabi, Somaye; Alavi, Seyyed Mohammed; Ravand, Hamdollah – International Journal of Language Testing, 2021
Although Diagnostic Classification Models (DCMs) were introduced into educational measurement decades ago, it seems that these models have not been employed for the original aims for which they were designed. DCMs have mostly been used to analyze large-scale non-diagnostic tests, and they have rarely been used in developing Cognitive…
Descriptors: Diagnostic Tests, Test Construction, Goodness of Fit, Classification
Shafipoor, Mahdieh; Ravand, Hamdollah; Maftoon, Parviz – Language Testing in Asia, 2021
The current study compared the model fit indices, skill mastery probabilities, and classification accuracy of six Diagnostic Classification Models (DCMs): a general model (G-DINA) and five specific models (LLM, RRUM, ACDM, DINA, and DINO). To do so, the response data to the grammar and vocabulary sections of a General English Achievement Test,…
Descriptors: Goodness of Fit, Models, Classification, Grammar
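As an illustration of one of the specific models in this comparison, the DINA item response function can be sketched as below, assuming a known Q-matrix row and guess/slip parameters; the general G-DINA model relaxes this all-or-nothing structure.

```python
import numpy as np

def dina_prob(alpha, q_item, guess, slip):
    """Probability of a correct response under the DINA model: an examinee
    with skill profile `alpha` answers correctly with probability 1 - slip
    if all skills required by the item's Q-matrix row are mastered, and
    with probability `guess` otherwise."""
    alpha = np.asarray(alpha, dtype=int)
    q_item = np.asarray(q_item, dtype=int)
    eta = int(np.all(alpha[q_item == 1] == 1))   # 1 if all required skills mastered
    return (1.0 - slip) if eta else guess

# Example: item requires skills 1 and 3; examinee has mastered skills 1, 2, 3
print(dina_prob(alpha=[1, 1, 1, 0], q_item=[1, 0, 1, 0], guess=0.2, slip=0.1))
```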
Akbay, Lokman; Kilinç, Mustafa – International Journal of Assessment Tools in Education, 2018
Measurement models need to represent examinees' actual response processes properly if measurement is to be accurate. To avoid invalid inferences, the fit of examinees' response data to the model is studied through "person-fit" statistics. Misfit between the examinee response data and the measurement model may be due to invalid…
Descriptors: Reliability, Goodness of Fit, Cognitive Measurement, Models
Torre, Jimmy de la; Akbay, Lokman – Eurasian Journal of Educational Research, 2019
Purpose: Well-designed assessment methodologies and various cognitive diagnosis models (CDMs) have been developed to extract diagnostic information about examinees' individual strengths and weaknesses. Due to this novelty, as well as educational specialists' lack of familiarity with CDMs, their applications are not widespread. This article aims at…
Descriptors: Cognitive Measurement, Models, Computer Software, Testing
Maseko, Jeremiah; Luneta, Kakoma; Long, Caroline – Pythagoras, 2019
The rational number knowledge of student teachers, in particular the equivalence of fractions, decimals, and percentages, and their comparison and ordering, is the focus of this article. An instrument comprising multiple-choice, short-answer, and constructed-response formats was designed to test conceptual and procedural understanding. Application…
Descriptors: Mathematics Instruction, Number Concepts, Test Validity, Models