ERIC - Search Results

Publication Date

In 2025	1
Since 2024	4

Descriptor

Hierarchical Linear Modeling	4
Item Response Theory	4
Algorithms	2
Evaluation Methods	2
Models	2
Accuracy	1
Achievement Tests	1
Bias	1
Classification	1
Computation	1
Correlation	1
Data Analysis	1
Data Collection	1
Equated Scores	1
Error of Measurement	1
Evaluation Criteria	1
Evaluators	1
Foreign Countries	1
International Assessment	1
Item Analysis	1
Measurement Techniques	1
Measures (Individuals)	1
Performance	1
Robustness (Statistics)	1
Secondary School Students	1

Source

Journal of Educational…	2
Educational and Psychological…	1
Journal of Educational and…	1

Author

Sijia Huang	2
Artur Pokropek	1
Carl Westine	1
Carmen Köhler	1
Dubravka Svetina Valdivia	1
Johannes Hartig	1
Lale Khorramdel	1
Li Cai	1
Michelle Boyer	1
Stella Y. Kim	1
Tong Wu	1

Publication Type

Journal Articles	4
Reports - Research	3
Reports - Evaluative	1

Education Level

Higher Education	1
Postsecondary Education	1
Secondary Education	1

Assessments and Surveys

Program for International…

Showing all 4 results

Wald χ² Test for Differential Item Functioning Detection with Polytomous Items in Multilevel Data

Peer reviewed

Sijia Huang; Dubravka Svetina Valdivia – Educational and Psychological Measurement, 2024

Identifying items with differential item functioning (DIF) in an assessment is a crucial step for achieving equitable measurement. One critical issue that has not been fully addressed with existing studies is how DIF items can be detected when data are multilevel. In the present study, we introduced a Lord's Wald χ² test-based…

Descriptors: Item Analysis, Item Response Theory, Algorithms, Accuracy

IRT Observed-Score Equating for Rater-Mediated Assessments Using a Hierarchical Rater Model

Peer reviewed

Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025

While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…

Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity

Cross-Classified Item Response Theory Modeling with an Application to Student Evaluation of Teaching

Peer reviewed

Sijia Huang; Li Cai – Journal of Educational and Behavioral Statistics, 2024

The cross-classified data structure is ubiquitous in education, psychology, and health outcome sciences. In these areas, assessment instruments that are made up of multiple items are frequently used to measure latent constructs. The presence of both the cross-classified structure and multivariate categorical outcomes leads to the so-called…

Descriptors: Classification, Data Collection, Data Analysis, Item Response Theory

DIF Detection for Multiple Groups: Comparing Three-Level GLMMs and Multiple-Group IRT Models

Peer reviewed

Carmen Köhler; Lale Khorramdel; Artur Pokropek; Johannes Hartig – Journal of Educational Measurement, 2024

For assessment scales applied to different groups (e.g., students from different states; patients in different countries), multigroup differential item functioning (MG-DIF) needs to be evaluated in order to ensure that respondents with the same trait level but from different groups have equal response probabilities on a particular item. The…

Descriptors: Measures (Individuals), Test Bias, Models, Item Response Theory