Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 1
Since 2006 (last 20 years): 6
Descriptor
Evaluation Methods: 7
Probability: 7
Simulation: 4
Data Analysis: 3
Item Response Theory: 3
Error Patterns: 2
Error of Measurement: 2
Goodness of Fit: 2
Measurement Techniques: 2
Test Bias: 2
Test Items: 2
Source
Educational and Psychological Measurement: 7
Author
Zumbo, Bruno D.: 2
Beretvas, S. Natasha: 1
Berry, Kenneth J.: 1
Carstensen, Claus H.: 1
Chen, Michelle Y.: 1
Drasgow, Fritz: 1
Kim, Eun Sook: 1
Köhler, Carmen: 1
Lee, HwaYoung: 1
Lee, Taehun: 1
Liu, Yan: 1
Publication Type
Journal Articles: 7
Reports - Research: 4
Book/Product Reviews: 1
Reports - Descriptive: 1
Reports - Evaluative: 1
Education Level
Grade 9: 1
High Schools: 1
Junior High Schools: 1
Middle Schools: 1
Secondary Education: 1
Location
Germany: 1
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Chen, Michelle Y.; Liu, Yan; Zumbo, Bruno D. – Educational and Psychological Measurement, 2020
This study introduces a novel differential item functioning (DIF) method based on propensity score matching that tackles two challenges in analyzing performance assessment data: continuous task scores and the lack of a reliable internal variable to serve as a proxy for ability or aptitude. The proposed DIF method consists of two main stages. First,…
Descriptors: Probability, Scores, Evaluation Methods, Test Items
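The two-stage design sketched in this abstract lends itself to a short illustration. The following is a minimal sketch on simulated data, assuming external covariates are available for the propensity model; the logistic propensity model, one-to-one nearest-neighbor matching, the paired t-test, and all variable names are illustrative choices, not the authors' exact procedure.

```python
# Hypothetical sketch: a propensity-score-matched DIF check for a
# continuous task score, assuming external covariates are available.
import numpy as np
from scipy import stats
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

n = 400
group = rng.integers(0, 2, n)                         # 0 = reference, 1 = focal
covs = rng.normal(group[:, None] * 0.5, 1.0, (n, 3))  # groups differ on covariates
score = covs.sum(axis=1) + rng.normal(0, 1, n)        # continuous score, no true DIF

# Stage 1 (illustrative): propensity of focal-group membership given covariates.
ps = LogisticRegression().fit(covs, group).predict_proba(covs)[:, 1]

# Stage 2 (illustrative): nearest-neighbor matching on the propensity score,
# then a matched-pairs comparison of the continuous task scores.
ref = np.where(group == 0)[0]
foc = np.where(group == 1)[0]
matched = ref[np.abs(ps[ref][None, :] - ps[foc][:, None]).argmin(axis=1)]
t, p = stats.ttest_rel(score[foc], score[matched])
print(f"matched-pairs t = {t:.2f}, p = {p:.3f}")
```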
Köhler, Carmen; Pohl, Steffi; Carstensen, Claus H. – Educational and Psychological Measurement, 2015
When competence tests are administered, subjects frequently omit items. These missing responses threaten the correct estimation of the proficiency level. Newer model-based approaches aim to account for nonignorable missing-data processes by incorporating a latent missing propensity into the measurement model. Two assumptions are typically…
Descriptors: Competence, Tests, Evaluation Methods, Adults
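As context for the latent-propensity idea, the sketch below only generates data under a nonignorable omission process of the kind described: omission is driven by a latent missing propensity correlated with ability. The correlation value, omission threshold, and item difficulties are assumptions for illustration; the article's models would estimate such quantities.

```python
# Data-generating sketch of nonignorable omissions driven by a latent
# missing propensity correlated with ability (parameter values assumed).
import numpy as np

rng = np.random.default_rng(4)
n, items = 2000, 15
rho = -0.4                        # assumption: abler examinees omit fewer items

# Ability and latent missing propensity drawn jointly.
cov = np.array([[1.0, rho], [rho, 1.0]])
ability, omit_prop = rng.multivariate_normal([0.0, 0.0], cov, n).T

b_item = np.linspace(-1, 1, items)                       # item difficulties
p_correct = 1 / (1 + np.exp(-(ability[:, None] - b_item)))
p_omit = 1 / (1 + np.exp(-(omit_prop[:, None] - 1.5)))   # constant omission threshold

resp = (rng.random((n, items)) < p_correct).astype(float)
resp[rng.random((n, items)) < p_omit] = np.nan           # nonignorable omissions
print(f"omission rate: {np.isnan(resp).mean():.2%}")
```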
Lee, HwaYoung; Beretvas, S. Natasha – Educational and Psychological Measurement, 2014
Conventional differential item functioning (DIF) detection methods (e.g., the Mantel-Haenszel test) can be used to detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is not typically fully explained by an observed variable. True sources of DIF may include unobserved, latent variables, such as…
Descriptors: Item Analysis, Factor Structure, Bayesian Statistics, Goodness of Fit
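Since the abstract takes the conventional Mantel-Haenszel (MH) test as its point of departure, here is a minimal MH DIF check on simulated data, matching on the rest score. The data-generating model and the continuity-corrected statistic follow the textbook formulation; nothing here reflects the authors' latent-variable extension.

```python
# Minimal Mantel-Haenszel DIF test, stratified on the rest score.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)

n, items = 1000, 20
group = rng.integers(0, 2, n)                  # 0 = reference, 1 = focal
theta = rng.normal(0, 1, n)
prob = 1 / (1 + np.exp(-theta[:, None]))       # Rasch-like, b = 0 for all items
resp = (rng.random((n, items)) < prob).astype(int)
studied = resp[:, 0]                           # item tested for DIF
total = resp[:, 1:].sum(axis=1)                # matching variable (rest score)

num = den = var = 0.0
for s in np.unique(total):                     # one 2x2 table per score stratum
    m = total == s
    nk = m.sum()
    r0, r1 = np.sum(m & (group == 0)), np.sum(m & (group == 1))
    c1 = np.sum(m & (studied == 1))
    if nk < 2 or r0 == 0 or r1 == 0:
        continue
    a = np.sum(m & (group == 0) & (studied == 1))   # reference-group correct
    num += a
    den += r0 * c1 / nk                        # E[a] under no DIF
    var += r0 * r1 * c1 * (nk - c1) / (nk**2 * (nk - 1))

mh = (abs(num - den) - 0.5) ** 2 / var         # continuity-corrected MH chi-square
print(f"MH chi-square = {mh:.2f}, p = {chi2.sf(mh, df=1):.3f}")
```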
Kim, Eun Sook; Yoon, Myeongsun; Lee, Taehun – Educational and Psychological Measurement, 2012
Multiple-indicators multiple-causes (MIMIC) modeling is often used to test a latent group mean difference while assuming the equivalence of factor loadings and intercepts over groups. However, this study demonstrated that MIMIC was insensitive to the presence of factor loading noninvariance, which implies that factor loading invariance should be…
Descriptors: Test Items, Simulation, Testing, Statistical Analysis
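A MIMIC sketch helps make the setup concrete: a latent factor measured by three indicators is regressed on a group dummy, while the data are generated with deliberately noninvariant loadings, the condition the study reports MIMIC tends to miss. This assumes the third-party semopy package and its lavaan-style model syntax; the population values are arbitrary.

```python
# Hedged MIMIC sketch (assumes the semopy package): latent mean difference
# tested via a group covariate, with noninvariant loadings built into the data.
import numpy as np
import pandas as pd
from semopy import Model

rng = np.random.default_rng(5)
n = 500
group = rng.integers(0, 2, n)
eta = rng.normal(0, 1, n)         # equal latent means across groups by design

# Noninvariant loadings: the focal group's second and third loadings differ.
lam = np.where(group[:, None] == 1, [0.9, 0.5, 0.7], [0.9, 0.9, 0.9])
y = lam * eta[:, None] + rng.normal(0, 0.5, (n, 3))

data = pd.DataFrame(y, columns=["y1", "y2", "y3"]).assign(group=group)
model = Model("eta =~ y1 + y2 + y3\neta ~ group")
model.fit(data)
print(model.inspect())            # inspect the eta ~ group estimate
```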
Tay, Louis; Drasgow, Fritz – Educational and Psychological Measurement, 2012
Two Monte Carlo simulation studies investigated the effectiveness of the mean adjusted χ²/df statistic proposed by Drasgow and colleagues; because of problems with that method, a new approach for assessing the goodness of fit of an item response theory model was developed. It has been previously recommended that mean adjusted…
Descriptors: Test Length, Monte Carlo Methods, Goodness of Fit, Item Response Theory
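For orientation, the following sketch computes a pairwise χ²/df fit value for two 2PL items, comparing observed joint response-pattern frequencies against model-implied ones obtained by Gauss-Hermite quadrature. It is a simplified stand-in: the published adjusted statistic also rescales to a reference sample size and accounts for parameter estimation, both omitted here.

```python
# Simplified pairwise chi-square/df fit check for two 2PL items.
import numpy as np

rng = np.random.default_rng(2)

def p2pl(theta, a, b):
    """2PL correct-response probability."""
    return 1 / (1 + np.exp(-a * (theta - b)))

a = np.array([1.0, 1.4])
b = np.array([-0.5, 0.5])                        # a two-item example
N = 5000
theta = rng.normal(0, 1, N)
resp = (rng.random((N, 2)) < p2pl(theta[:, None], a, b)).astype(int)

# Observed joint frequencies of the four (item 1, item 2) response patterns.
obs = np.array([[np.sum((resp[:, 0] == i) & (resp[:, 1] == j))
                 for j in (0, 1)] for i in (0, 1)], dtype=float)

# Model-implied pattern probabilities, integrating theta ~ N(0, 1) by
# Gauss-Hermite quadrature (probabilists' version).
q, w = np.polynomial.hermite_e.hermegauss(41)
w = w / w.sum()
p1, p2 = p2pl(q, a[0], b[0]), p2pl(q, a[1], b[1])
exp = np.array([[np.sum(w * (p1 if i else 1 - p1) * (p2 if j else 1 - p2))
                 for j in (0, 1)] for i in (0, 1)]) * N

chi_sq = np.sum((obs - exp) ** 2 / exp)
df = 3                                           # 4 cells - 1, parameters known
print(f"pairwise chi-square/df = {chi_sq / df:.2f}")
```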
Rupp, Andre A.; Zumbo, Bruno D. – Educational and Psychological Measurement, 2006
One theoretical feature that makes item response theory (IRT) models those of choice for many psychometric data analysts is parameter invariance, the equality of item and examinee parameters from different examinee populations or measurement conditions. In this article, using the well-known fact that item and examinee parameters are identical only…
Descriptors: Psychometrics, Probability, Simulation, Item Response Theory
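The identification fact the abstract alludes to is easy to verify numerically: a 2PL item response function is unchanged when the trait metric is linearly rescaled, provided the item parameters are transformed accordingly. The specific constants below are arbitrary.

```python
# Numeric check: 2PL probabilities are invariant under theta' = A*theta + B
# with a' = a / A and b' = A*b + B.
import numpy as np

def p2pl(theta, a, b):
    """2PL correct-response probability."""
    return 1 / (1 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)
a, b = 1.2, 0.4
A, B = 2.0, -1.0                  # arbitrary linear rescaling of the trait metric

print(np.allclose(p2pl(theta, a, b),
                  p2pl(A * theta + B, a / A, A * b + B)))   # True
```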

Berry, Kenneth J.; Mielke, Paul W., Jr. – Educational and Psychological Measurement, 1997
Describes a FORTRAN software program that calculates the probability of an observed difference between agreement measures obtained from two independent sets of raters. An example illustrates the use of the DIFFER program in evaluating undergraduate essays. (Author/SLD)
Descriptors: Comparative Analysis, Computer Software, Evaluation Methods, Higher Education
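DIFFER itself is a FORTRAN program; as a rough modern analogue, the sketch below estimates the probability of an observed difference between two independent agreement coefficients by permutation, using Cohen's kappa. The choice of kappa, the rating scale, and the resampling scheme are assumptions for illustration and may differ from DIFFER's exact computation.

```python
# Permutation analogue: probability of the observed difference between
# kappa coefficients from two independent rater pairs (simulated ratings).
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(3)

# Simulated ratings on a 4-point scale from two independent rater pairs.
set1 = rng.integers(0, 4, (50, 2))
set2 = rng.integers(0, 4, (60, 2))

def kappa(pairs):
    return cohen_kappa_score(pairs[:, 0], pairs[:, 1])

obs = kappa(set1) - kappa(set2)                # observed kappa difference

pooled = np.vstack([set1, set2])
n1 = len(set1)
diffs = []
for _ in range(2000):                          # permutation null distribution
    idx = rng.permutation(len(pooled))
    diffs.append(kappa(pooled[idx[:n1]]) - kappa(pooled[idx[n1:]]))

p = np.mean(np.abs(diffs) >= abs(obs))         # two-sided probability
print(f"kappa difference = {obs:.3f}, permutation p = {p:.3f}")
```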