Publication Date
  In 2025: 2
  Since 2024: 6
  Since 2021 (last 5 years): 12
  Since 2016 (last 10 years): 21
  Since 2006 (last 20 years): 41
Descriptor
  Generalization: 50
  Item Response Theory: 50
  Models: 22
  Test Items: 16
  Foreign Countries: 14
  Simulation: 14
  Scores: 11
  Difficulty Level: 9
  Error of Measurement: 9
  Statistical Analysis: 9
  Correlation: 8
Publication Type
  Journal Articles: 39
  Reports - Research: 25
  Reports - Evaluative: 13
  Dissertations/Theses -…: 4
  Reports - Descriptive: 4
  Collected Works - Proceedings: 2
  Books: 1
  Collected Works - General: 1
  Opinion Papers: 1
Education Level
  Higher Education: 8
  Secondary Education: 7
  Postsecondary Education: 6
  Elementary Secondary Education: 5
  Elementary Education: 4
  Junior High Schools: 4
  Middle Schools: 4
  Intermediate Grades: 2
  Grade 4: 1
  Grade 6: 1
  Grade 8: 1
Audience
  Researchers: 1
Location
  United States: 4
  Australia: 2
  California: 2
  Finland: 2
  France: 2
  Hong Kong: 2
  Singapore: 2
  Turkey: 2
  Afghanistan: 1
  China: 1
  Illinois (Chicago): 1
Assessments and Surveys
  Trends in International…: 5
  Program for International…: 3
  Big Five Inventory: 1
  California Psychological…: 1
  Graduate Record Examinations: 1
  National Assessment of…: 1
  Test of English as a Foreign…: 1
Jean-Paul Fox – Journal of Educational and Behavioral Statistics, 2025
Popular item response theory (IRT) models are considered complex, mainly due to the inclusion of a random factor variable (latent variable). The random factor variable gives rise to the incidental parameter problem, since the number of parameters grows as data from new persons are included. Therefore, IRT models require a specific estimation method…
Descriptors: Sample Size, Item Response Theory, Accuracy, Bayesian Statistics
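The abstract above refers to IRT models built around a latent person variable. As a minimal, hypothetical illustration (not Fox's model or his estimation method), the two-parameter logistic (2PL) response function links a person's latent ability to the probability of a correct item response:

```python
import math

def p_2pl(theta, a, b):
    """2PL IRT model: probability of a correct response for a person with
    ability theta on an item with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A person at the item's difficulty has a 50% chance; higher ability raises it.
low = p_2pl(theta=-1.0, a=1.2, b=0.0)
high = p_2pl(theta=1.0, a=1.2, b=0.0)
```

Estimating `theta` for every person alongside the item parameters is exactly where the incidental parameter problem appears, which is why marginal (integrating the latent variable out) or Bayesian methods are typically used.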
Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
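Among the classical reliability measures the study compares, Cronbach's alpha is the most common. A self-contained sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), on made-up binary data:

```python
def cronbach_alpha(scores):
    """Cronbach's alpha from a persons-by-items score matrix (list of rows)."""
    n_items = len(scores[0])

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[i] for row in scores]) for i in range(n_items)]
    total_var = var([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses of 5 persons to 4 items (1 = correct, 0 = incorrect).
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(data)  # 0.8 for this toy matrix
```

The Rasch and Mokken approaches mentioned in the abstract model each item separately rather than summarizing reliability in a single coefficient like this.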
Justin L. Kern – Journal of Educational and Behavioral Statistics, 2024
Given the frequent presence of slipping and guessing in item responses, models for the inclusion of their effects are highly important. Unfortunately, the most common model for their inclusion, the four-parameter item response theory model, potentially has severe deficiencies related to its possible unidentifiability. With this issue in mind, the…
Descriptors: Item Response Theory, Models, Bayesian Statistics, Generalization
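The four-parameter model discussed above extends the logistic IRT curve with a guessing floor and a slipping ceiling. A minimal sketch of the standard 4PL response function (the identifiability issues Kern analyzes are not addressed here):

```python
import math

def p_4pl(theta, a, b, c, d):
    """4PL IRT model: c is the lower asymptote (guessing),
    d is the upper asymptote (1 minus the slipping probability)."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

# Even very low-ability persons succeed at rate ~c (guessing),
# and very high-ability persons still fail at rate ~1-d (slipping).
floor = p_4pl(theta=-50.0, a=1.0, b=0.0, c=0.2, d=0.9)
ceiling = p_4pl(theta=50.0, a=1.0, b=0.0, c=0.2, d=0.9)
```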
Chalmers, R. Philip; Zheng, Guoguo – Applied Measurement in Education, 2023
This article presents generalizations of SIBTEST and crossing-SIBTEST statistics for differential item functioning (DIF) investigations involving more than two groups. After reviewing the original two-group setup for these statistics, a set of multigroup generalizations that support contrast matrices for joint tests of DIF are presented. To…
Descriptors: Test Bias, Test Items, Item Response Theory, Error of Measurement
Yongze Xu – Educational and Psychological Measurement, 2024
The questionnaire method has always been an important research method in psychology. The increasing prevalence of multidimensional trait measures in psychological research has led researchers to use longer questionnaires. However, questionnaires that are too long will inevitably reduce the quality of the completed questionnaires and the efficiency…
Descriptors: Item Response Theory, Questionnaires, Generalization, Simulation
Kevin Hirschi; Okim Kang – Language Teaching Research Quarterly, 2023
This paper extends the use of Generalizability Theory to the measurement of extemporaneous L2 speech through the lens of speech perception. Using six datasets of previous studies, it reports on "G studies"--a method of breaking down measurement variance--and "D studies"--a predictive study of the impact on reliability when…
Descriptors: Evaluators, Generalization, Evaluation Methods, Speech Communication
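The "D studies" mentioned above predict how reliability changes as measurement conditions change. As a simplified, hypothetical one-facet sketch (not the authors' design), the generalizability coefficient improves as the residual variance is averaged over more raters:

```python
def g_coefficient(var_person, var_residual, n_raters):
    """One-facet D-study generalizability coefficient:
    person variance over person variance plus residual variance / n_raters."""
    return var_person / (var_person + var_residual / n_raters)

# With equal person and residual variance, adding raters raises reliability.
one_rater = g_coefficient(0.5, 0.5, 1)    # 0.5
five_raters = g_coefficient(0.5, 0.5, 5)  # ~0.83
```

Real G studies decompose variance into more facets (raters, tasks, occasions, and their interactions); this collapses everything but persons into a single residual term.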
Arslan Mancar, Sinem; Gulleroglu, H. Deniz – International Journal of Assessment Tools in Education, 2022
The aim of this study is to analyse the importance of the number of raters and compare the results obtained by techniques based on Classical Test Theory (CTT) and Generalizability (G) Theory. The Kappa and Krippendorff alpha techniques based on CTT were used to determine inter-rater reliability. In this descriptive study, the data consist of…
Descriptors: Comparative Analysis, Interrater Reliability, Advanced Placement, Scoring Rubrics
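Of the CTT techniques named above, Cohen's kappa is the two-rater case: chance-corrected agreement between two sets of categorical ratings. A self-contained sketch on made-up ratings:

```python
def cohen_kappa(r1, r2):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(r1)
    cats = sorted(set(r1) | set(r2))
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n
    p_exp = sum((r1.count(c) / n) * (r2.count(c) / n) for c in cats)
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical ratings: two raters agree on 3 of 4 essays.
kappa = cohen_kappa(["pass", "pass", "fail", "fail"],
                    ["pass", "pass", "fail", "pass"])
```

Krippendorff's alpha generalizes this idea to multiple raters, missing data, and non-nominal scales; G theory instead partitions the score variance into components.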
Qi Huang; Daniel M. Bolt; Weicong Lyu – Large-scale Assessments in Education, 2024
Large scale international assessments depend on invariance of measurement across countries. An important consideration when observing cross-national differential item functioning (DIF) is whether the DIF actually reflects a source of bias, or might instead be a methodological artifact reflecting item response theory (IRT) model misspecification.…
Descriptors: Test Items, Item Response Theory, Test Bias, Test Validity
Soysal, Sümeyra – Participatory Educational Research, 2023
Applying a measurement instrument developed in one country to other countries raises a critical and important question of interest, especially in cross-cultural studies. Confirmatory factor analysis (CFA) is the most widely preferred method for examining the cross-cultural applicability of measurement tools. Although CFA is a sophisticated…
Descriptors: Generalization, Cross Cultural Studies, Measurement Techniques, Factor Analysis
Clauser, Brian E.; Kane, Michael; Clauser, Jerome C. – Journal of Educational Measurement, 2020
An Angoff standard setting study generally yields judgments on a number of items by a number of judges (who may or may not be nested in panels). Variability associated with judges (and possibly panels) contributes error to the resulting cut score. The variability associated with items plays a more complicated role. To the extent that the mean item…
Descriptors: Cutting Scores, Generalization, Decision Making, Standard Setting
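In the Angoff procedure described above, each judge estimates, per item, the probability that a minimally competent examinee answers correctly; the cut score is typically the mean of the judges' totals, and judge-to-judge variability contributes error. A simplified, hypothetical sketch (ignoring the panel nesting and the item-variance subtleties the article analyzes):

```python
def angoff_cut_score(ratings):
    """ratings[j][i]: judge j's probability estimate for item i.
    Returns the cut score (mean of judge totals) and its standard error
    based on judge-to-judge variability alone."""
    judge_totals = [sum(row) for row in ratings]
    m = len(judge_totals)
    cut = sum(judge_totals) / m
    var_judges = sum((t - cut) ** 2 for t in judge_totals) / (m - 1)
    se = (var_judges / m) ** 0.5
    return cut, se

# Three hypothetical judges rating a two-item test.
cut, se = angoff_cut_score([[0.6, 0.7], [0.5, 0.7], [0.7, 0.8]])
```

A generalizability-theory treatment, as in the article, would separate judge, panel, and item variance components rather than folding them into one standard error.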
Chengyu Cui; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Multidimensional item response theory (MIRT) models have generated increasing interest in the psychometrics literature. Efficient approaches for estimating MIRT models with dichotomous responses have been developed, but constructing an equally efficient and robust algorithm for polytomous models has received limited attention. To address this gap,…
Descriptors: Item Response Theory, Accuracy, Simulation, Psychometrics
Gin, Brian; Sim, Nicholas; Skrondal, Anders; Rabe-Hesketh, Sophia – Grantee Submission, 2020
We propose a dyadic Item Response Theory (dIRT) model for measuring interactions of pairs of individuals when the responses to items represent the actions (or behaviors, perceptions, etc.) of each individual (actor) made within the context of a dyad formed with another individual (partner). Examples of its use include the assessment of…
Descriptors: Item Response Theory, Generalization, Item Analysis, Problem Solving
Alatli, Betül – International Journal of Education and Literacy Studies, 2020
This study aimed to investigate cross-cultural measurement invariance of the PISA (Programme for International Student Assessment, 2015) science literacy test and items and to carry out a bias study on the items which violate measurement invariance. The study used a descriptive review model. The sample of the study consisted of 2224 students…
Descriptors: Secondary School Students, Foreign Countries, International Assessment, Achievement Tests
Chen, Jianan; van Laar, Saskia; Braeken, Johan – Large-scale Assessments in Education, 2023
A general validity and survey quality concern with student questionnaires under low-stakes assessment conditions is that some responders will not genuinely engage with the questionnaire, often with more random response patterns as a result. Using a mixture IRT approach and a meta-analytic lens across 22 educational systems participating in TIMSS…
Descriptors: Elementary Secondary Education, International Assessment, Achievement Tests, Foreign Countries
Kim, Nana; Bolt, Daniel M. – Educational and Psychological Measurement, 2021
This paper presents a mixture item response tree (IRTree) model for extreme response style. Unlike traditional applications of single IRTree models, a mixture approach provides a way of representing the mixture of respondents following different underlying response processes (between individuals), as well as the uncertainty present at the…
Descriptors: Item Response Theory, Response Style (Tests), Models, Test Items