ERIC - Search Results

Showing all 8 results

Ramsay-Curve Differential Item Functioning

Peer reviewed

Woods, Carol M. – Applied Psychological Measurement, 2011

Differential item functioning (DIF) occurs when an item on a test, questionnaire, or interview has different measurement properties for one group of people versus another, irrespective of true group-mean differences on the constructs being measured. This article is focused on item response theory based likelihood ratio testing for DIF (IRT-LR or…

Descriptors: Simulation, Item Response Theory, Testing, Questionnaires

Empirical Selection of Anchors for Tests of Differential Item Functioning

Peer reviewed

Woods, Carol M. – Applied Psychological Measurement, 2009

Differential item functioning (DIF) occurs when items on a test or questionnaire have different measurement properties for one group of people versus another, irrespective of group-mean differences on the construct. Methods for testing DIF require matching members of different groups on an estimate of the construct. Preferably, the estimate is…

Descriptors: Test Results, Testing, Item Response Theory, Test Bias

Correcting Coefficient Alpha for Correlated Errors: Is α_K a Lower Bound to Reliability?

Peer reviewed

Rae, Gordon – Applied Psychological Measurement, 2006

When errors of measurement are positively correlated, coefficient alpha may overestimate the "true" reliability of a composite. To reduce this inflation bias, Komaroff (1997) has proposed an adjusted alpha coefficient, α_K. This article shows that α_K is only guaranteed to be a lower bound to reliability if the latter does not include correlated…

Descriptors: Correlation, Reliability, Error of Measurement, Evaluation Methods

The Effectiveness of Circular Equating as a Criterion for Evaluating Equating.

Peer reviewed

Wang, Tianyou; Hanson, Bradley A.; Harris, Deborah J. – Applied Psychological Measurement, 2000

Studied whether circular equating could provide an adequate measure of various types of equating error when applied to different equating methods under different equating designs. Analyses and simulations show that circular equating is generally invalid as a criterion to evaluate the adequacy of equating. (SLD)

Descriptors: Criteria, Equated Scores, Error of Measurement, Evaluation Methods

Type I Error Rates for Generalized Graded Unfolding Model Fit Indices

Peer reviewed

DeMars, Christine E. – Applied Psychological Measurement, 2004

Type I error rates were examined for several fit indices available in GGUM2000: extensions of Infit, Outfit, Andrich's χ², and the log-likelihood ratio χ². Infit and Outfit had Type I error rates much lower than nominal alpha. Andrich's χ² had Type I error rates much higher than nominal alpha, particularly for shorter tests or larger sample…

Descriptors: Likert Scales, Error of Measurement, Goodness of Fit, Psychological Studies

Moderated Multiple Regression, Spurious Interaction Effects, and IRT

Peer reviewed

Kang, Sun-Mee; Waller, Niels G. – Applied Psychological Measurement, 2005

Two Monte Carlo studies were conducted to explore the Type I error rates in moderated multiple regression (MMR) of observed scores and estimated latent trait scores from a two-parameter logistic item response theory (IRT) model. The results of both studies showed that MMR Type I error rates were substantially higher than the nominal alpha levels…

Descriptors: Multiple Regression Analysis, Interaction, Monte Carlo Methods, Item Response Theory

Factors Influencing the Mantel and Generalized Mantel-Haenszel Methods for the Assessment of Differential Item Functioning in Polytomous Items

Peer reviewed

Wang, Wen-Chung; Su, Ya-Hui – Applied Psychological Measurement, 2004

Eight independent variables (differential item functioning [DIF] detection method, purification procedure, item response model, mean latent trait difference between groups, test length, DIF pattern, magnitude of DIF, and percentage of DIF items) were manipulated, and two dependent variables (Type I error and power) were assessed through…

Descriptors: Test Length, Test Bias, Simulation, Item Response Theory

Equating Scores from Adaptive to Linear Tests

Peer reviewed

van der Linden, Wim J. – Applied Psychological Measurement, 2006

Two local methods for observed-score equating are applied to the problem of equating an adaptive test to a linear test. In an empirical study, the methods were evaluated against a method based on the test characteristic function (TCF) of the linear test and traditional equipercentile equating applied to the ability estimates on the adaptive test…

Descriptors: Adaptive Testing, Computer Assisted Testing, Test Format, Equated Scores