Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 1
Since 2006 (last 20 years): 7
Descriptor
Error of Measurement: 10
Item Response Theory: 6
Test Bias: 4
Accuracy: 3
Models: 3
Scores: 3
Statistical Analysis: 3
Test Items: 3
Comparative Analysis: 2
Computation: 2
Computer Software: 2
Source
Educational and Psychological Measurement: 5
Applied Psychological Measurement: 1
International Journal of Testing: 1
Journal of Educational and Behavioral Statistics: 1
Structural Equation Modeling: A Multidisciplinary Journal: 1
Author
DeMars, Christine E.: 10
Lau, Abigail: 1
Phan, Ha: 1
Socha, Alan: 1
Zilberberg, Anna: 1
Publication Type
Journal Articles: 9
Reports - Research: 7
Reports - Evaluative: 2
Reports - Descriptive: 1
Speeches/Meeting Papers: 1
DeMars, Christine E. – Educational and Psychological Measurement, 2019
Previous work showing that revised parallel analysis can be effective with dichotomous items has used a two-parameter model and normally distributed abilities. In this study, both two- and three-parameter models were used with normally distributed and skewed ability distributions. Relatively minor skew and kurtosis in the underlying ability…
Descriptors: Item Analysis, Models, Error of Measurement, Item Response Theory
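The entry above concerns revised parallel analysis for dichotomous items. As a point of reference only, the sketch below implements ordinary (Horn's) parallel analysis on a binary response matrix using Pearson correlations of the items; it is not the revised procedure the article evaluates, and the item parameters and 200-replication default are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def parallel_analysis(data, n_sims=200, quantile=0.95):
    """Horn's parallel analysis on a persons-by-items 0/1 matrix.

    Illustrative only: uses Pearson correlations of the binary items,
    not the tetrachoric or revised variants studied in the article.
    """
    n, k = data.shape
    obs_eig = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    sim_eig = np.empty((n_sims, k))
    p = data.mean(axis=0)                               # keep each item's difficulty
    for s in range(n_sims):
        fake = (rng.random((n, k)) < p).astype(float)   # independent items
        sim_eig[s] = np.linalg.eigvalsh(np.corrcoef(fake, rowvar=False))[::-1]
    threshold = np.quantile(sim_eig, quantile, axis=0)
    # Count observed eigenvalues that exceed the simulated threshold.
    return int(np.sum(obs_eig > threshold)), obs_eig, threshold

# Toy data: one latent dimension, 2PL-type responses with hypothetical parameters.
theta = rng.normal(size=1000)
a, b = rng.uniform(0.8, 1.6, 12), rng.normal(0, 1, 12)
prob = 1 / (1 + np.exp(-a * (theta[:, None] - b)))
responses = (rng.random(prob.shape) < prob).astype(float)

n_dims, _, _ = parallel_analysis(responses)
print("retained dimensions:", n_dims)
```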
Socha, Alan; DeMars, Christine E.; Zilberberg, Anna; Phan, Ha – International Journal of Testing, 2015
The Mantel-Haenszel (MH) procedure is commonly used to detect items that function differentially for groups of examinees from various demographic and linguistic backgrounds--for example, in international assessments. As in some other DIF methods, the total score is used to match examinees on ability. In thin matching, each of the total score…
Descriptors: Test Items, Educational Testing, Evaluation Methods, Ability Grouping
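The abstract above turns on the Mantel-Haenszel procedure with examinees thin-matched on total score (one stratum per score value). Below is a minimal sketch of the MH common odds ratio and the ETS delta transformation, run on hypothetical 2PL data with no DIF built in; the function name and simulated parameters are illustrative, not taken from the article.

```python
import numpy as np

def mantel_haenszel_dif(item, total, group):
    """MH common odds ratio for one item, thin-matched on total score.

    item  : 0/1 responses to the studied item
    total : matching variable (here the raw total score)
    group : 0 = reference, 1 = focal
    """
    num = den = 0.0
    for s in np.unique(total):
        m = total == s
        a = np.sum((group == 0) & (item == 1) & m)   # reference, correct
        b = np.sum((group == 0) & (item == 0) & m)   # reference, incorrect
        c = np.sum((group == 1) & (item == 1) & m)   # focal, correct
        d = np.sum((group == 1) & (item == 0) & m)   # focal, incorrect
        n_s = a + b + c + d
        if n_s == 0:
            continue
        num += a * d / n_s
        den += b * c / n_s
    alpha = num / den                  # >1 favors the reference group
    delta = -2.35 * np.log(alpha)      # ETS delta metric
    return alpha, delta

# Hypothetical no-DIF data: identical item parameters for both groups.
rng = np.random.default_rng(1)
n, k = 2000, 20
group = rng.integers(0, 2, n)
theta = rng.normal(size=n)
b_items = rng.normal(0, 1, k)
resp = (rng.random((n, k)) < 1 / (1 + np.exp(-(theta[:, None] - b_items)))).astype(int)
total = resp.sum(axis=1)
print(mantel_haenszel_dif(resp[:, 0], total, group))
```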
DeMars, Christine E.; Lau, Abigail – Educational and Psychological Measurement, 2011
There is a long history of differential item functioning (DIF) detection methods for known, manifest grouping variables, such as sex or ethnicity. But if the experiences or cognitive processes leading to DIF are not perfectly correlated with the manifest groups, it would be more informative to uncover the latent groups underlying DIF. The use of…
Descriptors: Test Bias, Accuracy, Item Response Theory, Models
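To make the argument concrete: if the process producing DIF follows a latent class that only partly overlaps the manifest groups, splitting examinees by the manifest variable dilutes the observed effect. The small simulation sketch below uses assumed values (70% overlap, a 0.8-logit difficulty shift on one item); it illustrates the motivation only and does not fit the mixture IRT models the article studies.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 2000, 15

# Manifest group (e.g., sex) and a latent class (e.g., instructional history)
# that only partially overlaps it: 70% agreement, not 100% (assumed value).
manifest = rng.integers(0, 2, n)
latent = np.where(rng.random(n) < 0.70, manifest, 1 - manifest)

theta = rng.normal(size=n)
b = rng.normal(0, 1, k)
dif_shift = np.zeros(k)
dif_shift[0] = 0.8            # item 0 is harder for the latent class

# Rasch-type responses; the DIF depends on the LATENT class, not the manifest one.
b_person = b + np.outer(latent, dif_shift)
resp = (rng.random((n, k)) < 1 / (1 + np.exp(-(theta[:, None] - b_person)))).astype(int)

# The proportion-correct gap on item 0 is diluted when examinees are
# split by the manifest group instead of the latent class.
for label, g in [("manifest", manifest), ("latent", latent)]:
    gap = resp[g == 0, 0].mean() - resp[g == 1, 0].mean()
    print(f"proportion-correct gap on item 0 by {label} grouping: {gap:.3f}")
```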
DeMars, Christine E. – Educational and Psychological Measurement, 2010
In this brief explication, two challenges for using differential item functioning (DIF) measures when there are large group differences in true proficiency are illustrated. Each of these difficulties may lead to inflated Type I error rates, for very different reasons. One problem is that groups matched on observed score are not necessarily well…
Descriptors: Test Bias, Error of Measurement, Regression (Statistics), Scores
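The first problem the article illustrates, that groups matched on observed score are not necessarily matched on true score, follows from regression toward each group's own mean. A brief simulation sketch with assumed values (a one-SD gap in true proficiency, an error SD of 0.7) shows the true-score gap that remains after conditioning on an observed score near zero.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Two groups differing by one SD in true proficiency (hypothetical values).
group = rng.integers(0, 2, n)
true = rng.normal(loc=np.where(group == 1, 0.0, 1.0), scale=1.0)
observed = true + rng.normal(scale=0.7, size=n)      # measurement error

# Within a narrow observed-score band, the groups still differ in TRUE score:
# each group's true scores regress toward that group's own mean.
band = (observed > -0.1) & (observed < 0.1)
for g in (0, 1):
    m = band & (group == g)
    print(f"group {g}: mean true score given observed near 0 -> {true[m].mean():.3f}")
```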
DeMars, Christine E. – Structural Equation Modeling: A Multidisciplinary Journal, 2012
In structural equation modeling software, either limited-information (bivariate proportions) or full-information item parameter estimation routines could be used for the 2-parameter item response theory (IRT) model. Limited-information methods assume the continuous variable underlying an item response is normally distributed. For skewed and…
Descriptors: Item Response Theory, Structural Equation Models, Computation, Computer Software
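For the limited-information, normal-ogive case referenced above, standardized factor loadings and thresholds map onto 2PL discriminations and difficulties through a standard conversion. The sketch below applies that conversion, assuming a single factor, a unit-variance latent trait, and the usual 1.7 scaling to approximate the logistic metric; the input values are hypothetical.

```python
import numpy as np

def loadings_to_irt(loadings, thresholds, logistic=True):
    """Convert standardized loadings and thresholds for binary items into
    2PL discrimination (a) and difficulty (b) parameters.

    Uses the normal-ogive relations a = lambda / sqrt(1 - lambda^2) and
    b = tau / lambda, with an optional 1.7 scaling toward the logistic
    metric. Assumes one factor and a unit-variance latent trait.
    """
    lam = np.asarray(loadings, dtype=float)
    tau = np.asarray(thresholds, dtype=float)
    a = lam / np.sqrt(1.0 - lam**2)
    b = tau / lam
    if logistic:
        a = 1.7 * a
    return a, b

# Hypothetical limited-information estimates (e.g., from tetrachoric correlations).
a, b = loadings_to_irt(loadings=[0.5, 0.7, 0.8], thresholds=[-0.3, 0.0, 0.4])
print("a:", np.round(a, 3))
print("b:", np.round(b, 3))
```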
DeMars, Christine E. – Journal of Educational and Behavioral Statistics, 2009
The Mantel-Haenszel (MH) and logistic regression (LR) differential item functioning (DIF) procedures have inflated Type I error rates when there are large mean group differences, short tests, and large sample sizes. When there are large group differences in mean score, groups matched on the observed number-correct score differ on true score,…
Descriptors: Regression (Statistics), Test Bias, Error of Measurement, True Scores
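As a companion to the MH sketch earlier, the snippet below runs the logistic regression test for uniform DIF: the studied item is regressed on the matching score, then group is added and the 1-df likelihood-ratio statistic is examined. The data are hypothetical (a large mean group difference with no DIF generated), so this mimics the Type I error setting described above rather than the article's simulation design.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: item response, raw total score, and group membership.
rng = np.random.default_rng(4)
n = 3000
group = rng.integers(0, 2, n)
theta = rng.normal(loc=np.where(group == 1, -0.5, 0.5))   # large mean difference
total = np.clip(np.round(10 + 4 * theta + rng.normal(scale=2, size=n)), 0, 20)
item = (rng.random(n) < 1 / (1 + np.exp(-(theta - 0.2)))).astype(int)  # no DIF built in
df = pd.DataFrame({"item": item, "total": total, "group": group})

# Uniform DIF: does group add anything once the matching score is controlled?
base = smf.logit("item ~ total", data=df).fit(disp=0)
dif = smf.logit("item ~ total + group", data=df).fit(disp=0)
lr_stat = 2 * (dif.llf - base.llf)          # 1-df likelihood-ratio statistic
print(f"LR chi-square for uniform DIF: {lr_stat:.2f}")
```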
DeMars, Christine E. – Educational and Psychological Measurement, 2008
The graded response (GR) and generalized partial credit (GPC) models do not imply that examinees ordered by raw observed score will necessarily be ordered on the expected value of the latent trait (OEL). Factors were manipulated to assess whether increased violations of OEL also produced increased Type I error rates in differential item…
Descriptors: Test Items, Raw Scores, Test Theory, Error of Measurement
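For readers unfamiliar with the generalized partial credit model named above, the sketch below computes its category probabilities and the expected item score at a few trait values, using hypothetical parameters. Checking the ordering property the article investigates (expected latent trait given raw score) would require integrating over the ability distribution, which is not done here.

```python
import numpy as np

def gpc_probs(theta, a, b):
    """Category probabilities for one generalized partial credit item.

    theta : array of latent trait values
    a     : item discrimination
    b     : step parameters (length m); the item is scored 0..m
    """
    theta = np.atleast_1d(theta).astype(float)
    b = np.asarray(b, dtype=float)
    # Numerators: 0 for category 0, then sum_{j<=k} a*(theta - b_j) for k = 1..m.
    z = np.cumsum(a * (theta[:, None] - b), axis=1)
    z = np.concatenate([np.zeros((len(theta), 1)), z], axis=1)
    z -= z.max(axis=1, keepdims=True)        # numerical stability
    expz = np.exp(z)
    return expz / expz.sum(axis=1, keepdims=True)

# Hypothetical item: four categories (0-3), moderate discrimination.
theta = np.array([-1.0, 0.0, 1.0])
probs = gpc_probs(theta, a=1.2, b=[-0.8, 0.1, 0.9])
expected_score = probs @ np.arange(probs.shape[1])
print(np.round(probs, 3))
print("expected item score:", np.round(expected_score, 3))
```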
DeMars, Christine E. – Educational and Psychological Measurement, 2005
Type I error rates for PARSCALE's fit statistic were examined. Data were generated to fit the partial credit or graded response model, with test lengths of 10 or 20 items. The ability distribution was simulated to be either normal or uniform. Type I error rates were inflated for the shorter test length and, for the graded-response model, also for…
Descriptors: Test Length, Item Response Theory, Psychometrics, Error of Measurement
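The simulation design described above generates polytomous data from the graded response model. Below is a minimal data-generating sketch for that model with a normal ability distribution and hypothetical item parameters; it stops at generating responses and does not reproduce PARSCALE's fit statistic.

```python
import numpy as np

rng = np.random.default_rng(5)

def grm_probs(theta, a, b):
    """Samejima graded response model category probabilities for one item.

    b must be strictly increasing; the item is scored 0..len(b).
    """
    theta = np.atleast_1d(theta).astype(float)
    b = np.asarray(b, dtype=float)
    cum = 1 / (1 + np.exp(-a * (theta[:, None] - b)))           # P(X >= k), k = 1..m
    cum = np.hstack([np.ones((len(theta), 1)), cum, np.zeros((len(theta), 1))])
    return cum[:, :-1] - cum[:, 1:]                             # P(X = k)

# Short test, normal abilities, hypothetical item parameters.
n_persons, n_items = 500, 10
theta = rng.normal(size=n_persons)
items = [dict(a=rng.uniform(1.0, 2.0), b=np.sort(rng.normal(0, 1, 3)))
         for _ in range(n_items)]
data = np.column_stack([
    [rng.choice(4, p=p) for p in grm_probs(theta, it["a"], it["b"])]
    for it in items
])
print(data.shape, data[:3])
```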
DeMars, Christine E. – 2002
When students are nested within course sections, the assumption of independence of residuals is unlikely to be met, unless the course section is explicitly included in the model. Hierarchical linear modeling (HLM) allows for modeling the course section as a random effect, leading to more accurate standard errors. In this study, students chose one…
Descriptors: College Entrance Examinations, College Students, Course Organization, Error of Measurement
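The random-effects analysis described above can be sketched with a random-intercept mixed model, here via statsmodels' MixedLM. The variable names, effect sizes, and sample sizes are assumptions for illustration, not the study's data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: students nested in course sections, with a section-level
# random intercept so residuals within a section are not independent.
rng = np.random.default_rng(6)
n_sections, per_section = 40, 25
section = np.repeat(np.arange(n_sections), per_section)
section_effect = rng.normal(scale=0.4, size=n_sections)[section]
sat = rng.normal(size=section.size)                  # e.g., an entrance-exam score
outcome = 0.5 * sat + section_effect + rng.normal(scale=1.0, size=section.size)
df = pd.DataFrame({"outcome": outcome, "sat": sat, "section": section})

# Random-intercept model: course section treated as a random effect, as in HLM.
model = smf.mixedlm("outcome ~ sat", data=df, groups=df["section"]).fit()
print(model.summary())
```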
DeMars, Christine E. – Applied Psychological Measurement, 2004
Type I error rates were examined for several fit indices available in GGUM2000: extensions of Infit, Outfit, Andrich's X², and the log-likelihood ratio X². Infit and Outfit had Type I error rates much lower than nominal alpha. Andrich's X² had Type I error rates much higher than nominal alpha, particularly for shorter tests or larger sample…
Descriptors: Likert Scales, Error of Measurement, Goodness of Fit, Psychological Studies
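Infit and Outfit, two of the indices named above, are mean-square statistics built from model-based residuals. The sketch below computes the standard dichotomous versions and, purely for illustration, plugs in Rasch-type probabilities; the article's GGUM2000 extensions and the polytomous case are not reproduced.

```python
import numpy as np

def infit_outfit(responses, probs):
    """Item-level infit and outfit mean squares for dichotomous responses.

    responses : persons-by-items 0/1 matrix
    probs     : model-implied probabilities of a correct response, same shape
                (any IRT model's probabilities can be plugged in here)
    """
    resid_sq = (responses - probs) ** 2
    var = probs * (1 - probs)                        # binomial variance under the model
    outfit = np.mean(resid_sq / var, axis=0)         # unweighted mean square
    infit = resid_sq.sum(axis=0) / var.sum(axis=0)   # information-weighted mean square
    return infit, outfit

# Hypothetical example with Rasch-type probabilities standing in for the model.
rng = np.random.default_rng(7)
theta = rng.normal(size=800)
b = rng.normal(0, 1, 12)
probs = 1 / (1 + np.exp(-(theta[:, None] - b)))
responses = (rng.random(probs.shape) < probs).astype(float)
infit, outfit = infit_outfit(responses, probs)
print("infit :", np.round(infit, 2))
print("outfit:", np.round(outfit, 2))
```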