Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 1
Since 2006 (last 20 years): 7
Descriptor
Error of Measurement: 10
Item Response Theory: 6
Test Bias: 4
Accuracy: 3
Models: 3
Scores: 3
Statistical Analysis: 3
Test Items: 3
Comparative Analysis: 2
Computation: 2
Computer Software: 2
Source
Educational and Psychological Measurement: 5
Applied Psychological Measurement: 1
International Journal of Testing: 1
Journal of Educational and Behavioral Statistics: 1
Structural Equation Modeling: A Multidisciplinary Journal: 1
Author
DeMars, Christine E.: 10
Lau, Abigail: 1
Phan, Ha: 1
Socha, Alan: 1
Zilberberg, Anna: 1
Publication Type
Journal Articles: 9
Reports - Research: 7
Reports - Evaluative: 2
Reports - Descriptive: 1
Speeches/Meeting Papers: 1
DeMars, Christine E. – Educational and Psychological Measurement, 2019
Previous work showing that revised parallel analysis can be effective with dichotomous items has used a two-parameter model and normally distributed abilities. In this study, both two- and three-parameter models were used with normally distributed and skewed ability distributions. Relatively minor skew and kurtosis in the underlying ability…
Descriptors: Item Analysis, Models, Error of Measurement, Item Response Theory
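The entry above concerns revised parallel analysis for dichotomous items. As a point of reference only, the sketch below implements ordinary (Horn's) parallel analysis on a binary response matrix using Pearson correlations of the items; it is not the revised procedure the article evaluates, and the item parameters and 200-replication default are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def parallel_analysis(data, n_sims=200, quantile=0.95):
    """Horn's parallel analysis on a persons-by-items 0/1 matrix.

    Illustrative only: uses Pearson correlations of the binary items,
    not the tetrachoric or revised variants studied in the article.
    """
    n, k = data.shape
    obs_eig = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    sim_eig = np.empty((n_sims, k))
    p = data.mean(axis=0)                               # keep each item's difficulty
    for s in range(n_sims):
        fake = (rng.random((n, k)) < p).astype(float)   # independent items
        sim_eig[s] = np.linalg.eigvalsh(np.corrcoef(fake, rowvar=False))[::-1]
    threshold = np.quantile(sim_eig, quantile, axis=0)
    # Count observed eigenvalues that exceed the simulated threshold.
    return int(np.sum(obs_eig > threshold)), obs_eig, threshold

# Toy data: one latent dimension, 2PL-type responses with hypothetical parameters.
theta = rng.normal(size=1000)
a, b = rng.uniform(0.8, 1.6, 12), rng.normal(0, 1, 12)
prob = 1 / (1 + np.exp(-a * (theta[:, None] - b)))
responses = (rng.random(prob.shape) < prob).astype(float)

n_dims, _, _ = parallel_analysis(responses)
print("retained dimensions:", n_dims)
```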
Socha, Alan; DeMars, Christine E.; Zilberberg, Anna; Phan, Ha – International Journal of Testing, 2015
The Mantel-Haenszel (MH) procedure is commonly used to detect items that function differentially for groups of examinees from various demographic and linguistic backgrounds--for example, in international assessments. As in some other DIF methods, the total score is used to match examinees on ability. In thin matching, each of the total score…
Descriptors: Test Items, Educational Testing, Evaluation Methods, Ability Grouping
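The abstract above turns on the Mantel-Haenszel procedure with examinees thin-matched on total score (one stratum per score value). Below is a minimal sketch of the MH common odds ratio and the ETS delta transformation, run on hypothetical 2PL data with no DIF built in; the function name and simulated parameters are illustrative, not taken from the article.

```python
import numpy as np

def mantel_haenszel_dif(item, total, group):
    """MH common odds ratio for one item, thin-matched on total score.

    item  : 0/1 responses to the studied item
    total : matching variable (here the raw total score)
    group : 0 = reference, 1 = focal
    """
    num = den = 0.0
    for s in np.unique(total):
        m = total == s
        a = np.sum((group == 0) & (item == 1) & m)   # reference, correct
        b = np.sum((group == 0) & (item == 0) & m)   # reference, incorrect
        c = np.sum((group == 1) & (item == 1) & m)   # focal, correct
        d = np.sum((group == 1) & (item == 0) & m)   # focal, incorrect
        n_s = a + b + c + d
        if n_s == 0:
            continue
        num += a * d / n_s
        den += b * c / n_s
    alpha = num / den                  # >1 favors the reference group
    delta = -2.35 * np.log(alpha)      # ETS delta metric
    return alpha, delta

# Hypothetical no-DIF data: identical item parameters for both groups.
rng = np.random.default_rng(1)
n, k = 2000, 20
group = rng.integers(0, 2, n)
theta = rng.normal(size=n)
b_items = rng.normal(0, 1, k)
resp = (rng.random((n, k)) < 1 / (1 + np.exp(-(theta[:, None] - b_items)))).astype(int)
total = resp.sum(axis=1)
print(mantel_haenszel_dif(resp[:, 0], total, group))
```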
DeMars, Christine E.; Lau, Abigail – Educational and Psychological Measurement, 2011
There is a long history of differential item functioning (DIF) detection methods for known, manifest grouping variables, such as sex or ethnicity. But if the experiences or cognitive processes leading to DIF are not perfectly correlated with the manifest groups, it would be more informative to uncover the latent groups underlying DIF. The use of…
Descriptors: Test Bias, Accuracy, Item Response Theory, Models
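To make the argument concrete: if the process producing DIF follows a latent class that only partly overlaps the manifest groups, splitting examinees by the manifest variable dilutes the observed effect. The small simulation sketch below uses assumed values (70% overlap, a 0.8-logit difficulty shift on one item); it illustrates the motivation only and does not fit the mixture IRT models the article studies.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 2000, 15

# Manifest group (e.g., sex) and a latent class (e.g., instructional history)
# that only partially overlaps it: 70% agreement, not 100% (assumed value).
manifest = rng.integers(0, 2, n)
latent = np.where(rng.random(n) < 0.70, manifest, 1 - manifest)

theta = rng.normal(size=n)
b = rng.normal(0, 1, k)
dif_shift = np.zeros(k)
dif_shift[0] = 0.8            # item 0 is harder for the latent class

# Rasch-type responses; the DIF depends on the LATENT class, not the manifest one.
b_person = b + np.outer(latent, dif_shift)
resp = (rng.random((n, k)) < 1 / (1 + np.exp(-(theta[:, None] - b_person)))).astype(int)

# The proportion-correct gap on item 0 is diluted when examinees are
# split by the manifest group instead of the latent class.
for label, g in [("manifest", manifest), ("latent", latent)]:
    gap = resp[g == 0, 0].mean() - resp[g == 1, 0].mean()
    print(f"proportion-correct gap on item 0 by {label} grouping: {gap:.3f}")
```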
DeMars, Christine E. – Educational and Psychological Measurement, 2010
In this brief explication, two challenges for using differential item functioning (DIF) measures when there are large group differences in true proficiency are illustrated. Each of these difficulties may lead to inflated Type I error rates, for very different reasons. One problem is that groups matched on observed score are not necessarily well…
Descriptors: Test Bias, Error of Measurement, Regression (Statistics), Scores
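The first problem the article illustrates, that groups matched on observed score are not necessarily matched on true score, follows from regression toward each group's own mean. A brief simulation sketch with assumed values (a one-SD gap in true proficiency, an error SD of 0.7) shows the true-score gap that remains after conditioning on an observed score near zero.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Two groups differing by one SD in true proficiency (hypothetical values).
group = rng.integers(0, 2, n)
true = rng.normal(loc=np.where(group == 1, 0.0, 1.0), scale=1.0)
observed = true + rng.normal(scale=0.7, size=n)      # measurement error

# Within a narrow observed-score band, the groups still differ in TRUE score:
# each group's true scores regress toward that group's own mean.
band = (observed > -0.1) & (observed < 0.1)
for g in (0, 1):
    m = band & (group == g)
    print(f"group {g}: mean true score given observed near 0 -> {true[m].mean():.3f}")
```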
DeMars, Christine E. – Structural Equation Modeling: A Multidisciplinary Journal, 2012
In structural equation modeling software, either limited-information (bivariate proportions) or full-information item parameter estimation routines could be used for the 2-parameter item response theory (IRT) model. Limited-information methods assume the continuous variable underlying an item response is normally distributed. For skewed and…
Descriptors: Item Response Theory, Structural Equation Models, Computation, Computer Software
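For the limited-information, normal-ogive case referenced above, standardized factor loadings and thresholds map onto 2PL discriminations and difficulties through a standard conversion. The sketch below applies that conversion, assuming a single factor, a unit-variance latent trait, and the usual 1.7 scaling to approximate the logistic metric; the input values are hypothetical.

```python
import numpy as np

def loadings_to_irt(loadings, thresholds, logistic=True):
    """Convert standardized loadings and thresholds for binary items into
    2PL discrimination (a) and difficulty (b) parameters.

    Uses the normal-ogive relations a = lambda / sqrt(1 - lambda^2) and
    b = tau / lambda, with an optional 1.7 scaling toward the logistic
    metric. Assumes one factor and a unit-variance latent trait.
    """
    lam = np.asarray(loadings, dtype=float)
    tau = np.asarray(thresholds, dtype=float)
    a = lam / np.sqrt(1.0 - lam**2)
    b = tau / lam
    if logistic:
        a = 1.7 * a
    return a, b

# Hypothetical limited-information estimates (e.g., from tetrachoric correlations).
a, b = loadings_to_irt(loadings=[0.5, 0.7, 0.8], thresholds=[-0.3, 0.0, 0.4])
print("a:", np.round(a, 3))
print("b:", np.round(b, 3))
```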
DeMars, Christine E. – Journal of Educational and Behavioral Statistics, 2009
The Mantel-Haenszel (MH) and logistic regression (LR) differential item functioning (DIF) procedures have inflated Type I error rates when there are large mean group differences, short tests, and large sample sizes. When there are large group differences in mean score, groups matched on the observed number-correct score differ on true score,…
Descriptors: Regression (Statistics), Test Bias, Error of Measurement, True Scores
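As a companion to the MH sketch earlier, the snippet below runs the logistic regression test for uniform DIF: the studied item is regressed on the matching score, then group is added and the 1-df likelihood-ratio statistic is examined. The data are hypothetical (a large mean group difference with no DIF generated), so this mimics the Type I error setting described above rather than the article's simulation design.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: item response, raw total score, and group membership.
rng = np.random.default_rng(4)
n = 3000
group = rng.integers(0, 2, n)
theta = rng.normal(loc=np.where(group == 1, -0.5, 0.5))   # large mean difference
total = np.clip(np.round(10 + 4 * theta + rng.normal(scale=2, size=n)), 0, 20)
item = (rng.random(n) < 1 / (1 + np.exp(-(theta - 0.2)))).astype(int)  # no DIF built in
df = pd.DataFrame({"item": item, "total": total, "group": group})

# Uniform DIF: does group add anything once the matching score is controlled?
base = smf.logit("item ~ total", data=df).fit(disp=0)
dif = smf.logit("item ~ total + group", data=df).fit(disp=0)
lr_stat = 2 * (dif.llf - base.llf)          # 1-df likelihood-ratio statistic
print(f"LR chi-square for uniform DIF: {lr_stat:.2f}")
```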
DeMars, Christine E. – Educational and Psychological Measurement, 2008
The graded response (GR) and generalized partial credit (GPC) models do not imply that examinees ordered by raw observed score will necessarily be ordered on the expected value of the latent trait (OEL). Factors were manipulated to assess whether increased violations of OEL also produced increased Type I error rates in differential item…
Descriptors: Test Items, Raw Scores, Test Theory, Error of Measurement
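For readers unfamiliar with the generalized partial credit model named above, the sketch below computes its category probabilities and the expected item score at a few trait values, using hypothetical parameters. Checking the ordering property the article investigates (expected latent trait given raw score) would require integrating over the ability distribution, which is not done here.

```python
import numpy as np

def gpc_probs(theta, a, b):
    """Category probabilities for one generalized partial credit item.

    theta : array of latent trait values
    a     : item discrimination
    b     : step parameters (length m); the item is scored 0..m
    """
    theta = np.atleast_1d(theta).astype(float)
    b = np.asarray(b, dtype=float)
    # Numerators: 0 for category 0, then sum_{j<=k} a*(theta - b_j) for k = 1..m.
    z = np.cumsum(a * (theta[:, None] - b), axis=1)
    z = np.concatenate([np.zeros((len(theta), 1)), z], axis=1)
    z -= z.max(axis=1, keepdims=True)        # numerical stability
    expz = np.exp(z)
    return expz / expz.sum(axis=1, keepdims=True)

# Hypothetical item: four categories (0-3), moderate discrimination.
theta = np.array([-1.0, 0.0, 1.0])
probs = gpc_probs(theta, a=1.2, b=[-0.8, 0.1, 0.9])
expected_score = probs @ np.arange(probs.shape[1])
print(np.round(probs, 3))
print("expected item score:", np.round(expected_score, 3))
```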
DeMars, Christine E. – Educational and Psychological Measurement, 2005
Type I error rates for PARSCALE's fit statistic were examined. Data were generated to fit the partial credit or graded response model, with test lengths of 10 or 20 items. The ability distribution was simulated to be either normal or uniform. Type I error rates were inflated for the shorter test length and, for the graded-response model, also for…
Descriptors: Test Length, Item Response Theory, Psychometrics, Error of Measurement
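The simulation design described above generates polytomous data from the graded response model. Below is a minimal data-generating sketch for that model with a normal ability distribution and hypothetical item parameters; it stops at generating responses and does not reproduce PARSCALE's fit statistic.

```python
import numpy as np

rng = np.random.default_rng(5)

def grm_probs(theta, a, b):
    """Samejima graded response model category probabilities for one item.

    b must be strictly increasing; the item is scored 0..len(b).
    """
    theta = np.atleast_1d(theta).astype(float)
    b = np.asarray(b, dtype=float)
    cum = 1 / (1 + np.exp(-a * (theta[:, None] - b)))           # P(X >= k), k = 1..m
    cum = np.hstack([np.ones((len(theta), 1)), cum, np.zeros((len(theta), 1))])
    return cum[:, :-1] - cum[:, 1:]                             # P(X = k)

# Short test, normal abilities, hypothetical item parameters.
n_persons, n_items = 500, 10
theta = rng.normal(size=n_persons)
items = [dict(a=rng.uniform(1.0, 2.0), b=np.sort(rng.normal(0, 1, 3)))
         for _ in range(n_items)]
data = np.column_stack([
    [rng.choice(4, p=p) for p in grm_probs(theta, it["a"], it["b"])]
    for it in items
])
print(data.shape, data[:3])
```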
DeMars, Christine E. – 2002
When students are nested within course sections, the assumption of independence of residuals is unlikely to be met, unless the course section is explicitly included in the model. Hierarchical linear modeling (HLM) allows for modeling the course section as a random effect, leading to more accurate standard errors. In this study, students chose one…
Descriptors: College Entrance Examinations, College Students, Course Organization, Error of Measurement
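The random-effects analysis described above can be sketched with a random-intercept mixed model, here via statsmodels' MixedLM. The variable names, effect sizes, and sample sizes are assumptions for illustration, not the study's data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: students nested in course sections, with a section-level
# random intercept so residuals within a section are not independent.
rng = np.random.default_rng(6)
n_sections, per_section = 40, 25
section = np.repeat(np.arange(n_sections), per_section)
section_effect = rng.normal(scale=0.4, size=n_sections)[section]
sat = rng.normal(size=section.size)                  # e.g., an entrance-exam score
outcome = 0.5 * sat + section_effect + rng.normal(scale=1.0, size=section.size)
df = pd.DataFrame({"outcome": outcome, "sat": sat, "section": section})

# Random-intercept model: course section treated as a random effect, as in HLM.
model = smf.mixedlm("outcome ~ sat", data=df, groups=df["section"]).fit()
print(model.summary())
```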
DeMars, Christine E. – Applied Psychological Measurement, 2004
Type I error rates were examined for several fit indices available in GGUM2000: extensions of Infit, Outfit, Andrich's X², and the log-likelihood ratio X². Infit and Outfit had Type I error rates much lower than nominal alpha. Andrich's X² had Type I error rates much higher than nominal alpha, particularly for shorter tests or larger sample…
Descriptors: Likert Scales, Error of Measurement, Goodness of Fit, Psychological Studies
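Infit and Outfit, two of the indices named above, are mean-square statistics built from model-based residuals. The sketch below computes the standard dichotomous versions and, purely for illustration, plugs in Rasch-type probabilities; the article's GGUM2000 extensions and the polytomous case are not reproduced.

```python
import numpy as np

def infit_outfit(responses, probs):
    """Item-level infit and outfit mean squares for dichotomous responses.

    responses : persons-by-items 0/1 matrix
    probs     : model-implied probabilities of a correct response, same shape
                (any IRT model's probabilities can be plugged in here)
    """
    resid_sq = (responses - probs) ** 2
    var = probs * (1 - probs)                        # binomial variance under the model
    outfit = np.mean(resid_sq / var, axis=0)         # unweighted mean square
    infit = resid_sq.sum(axis=0) / var.sum(axis=0)   # information-weighted mean square
    return infit, outfit

# Hypothetical example with Rasch-type probabilities standing in for the model.
rng = np.random.default_rng(7)
theta = rng.normal(size=800)
b = rng.normal(0, 1, 12)
probs = 1 / (1 + np.exp(-(theta[:, None] - b)))
responses = (rng.random(probs.shape) < probs).astype(float)
infit, outfit = infit_outfit(responses, probs)
print("infit :", np.round(infit, 2))
print("outfit:", np.round(outfit, 2))
```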