Showing all 15 results
Peer reviewed
Direct link
Beauducel, André – Applied Psychological Measurement, 2013
The problem of factor score indeterminacy implies that the factor and the error scores cannot be completely disentangled in the factor model. It is therefore proposed to compute Harman's factor score predictor that contains an additive combination of factor and error variance. This additive combination is discussed in the framework of classical…
Descriptors: Factor Analysis, Predictor Variables, Reliability, Error of Measurement
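For readers who want the distinction concretely, the sketch below (not from the article; the loadings and sample size are made up) contrasts Harman's factor score predictor with the regression (Thurstone) predictor in a simulated orthogonal factor model:

```python
# A minimal sketch, assuming an orthogonal two-factor model with
# hypothetical loadings. Compares Harman's factor score predictor,
# (L'L)^-1 L' x, with the regression predictor, L' Sigma^-1 x.
import numpy as np

rng = np.random.default_rng(0)

Lambda = np.array([[0.8, 0.0],
                   [0.7, 0.0],
                   [0.6, 0.0],
                   [0.0, 0.8],
                   [0.0, 0.7],
                   [0.0, 0.6]])                       # hypothetical loadings
Psi = np.diag(1.0 - (Lambda ** 2).sum(axis=1))        # unique variances
Sigma = Lambda @ Lambda.T + Psi                       # model-implied covariance

# Simulate factor and error scores, then observed scores x = Lambda f + e.
n = 10_000
f = rng.standard_normal((n, 2))
e = rng.standard_normal((n, 6)) @ np.sqrt(Psi)
x = f @ Lambda.T + e

# Harman's predictor: ordinary least squares on the loading matrix.
f_harman = x @ Lambda @ np.linalg.inv(Lambda.T @ Lambda)

# Regression (Thurstone) predictor.
f_reg = x @ np.linalg.inv(Sigma) @ Lambda

for k in range(2):
    print(f"factor {k}: r(Harman) = {np.corrcoef(f[:, k], f_harman[:, k])[0, 1]:.3f}, "
          f"r(regression) = {np.corrcoef(f[:, k], f_reg[:, k])[0, 1]:.3f}")
```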
Peer reviewed
Direct link
Finch, W. Holmes – Applied Psychological Measurement, 2012
Increasingly, researchers interested in identifying potentially biased test items are encouraged to use a confirmatory, rather than exploratory, approach. One such method for confirmatory testing is rooted in differential bundle functioning (DBF), where hypotheses regarding potential differential item functioning (DIF) for sets of items (bundles)…
Descriptors: Test Bias, Test Items, Statistical Analysis, Models
Peer reviewed
Direct link
Wang, Wen-Chung; Shih, Ching-Lin – Applied Psychological Measurement, 2010
Three multiple indicators-multiple causes (MIMIC) methods, namely, the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), and the MIMIC method with a pure anchor (M-PA), were developed to assess differential item functioning (DIF) in polytomous items. In a series of simulations, it appeared that all three methods…
Descriptors: Methods, Test Bias, Test Items, Error of Measurement
Peer reviewed
Direct link
Paek, Insu – Applied Psychological Measurement, 2010
Conservative bias in rejection of a null hypothesis from using the continuity correction in the Mantel-Haenszel (MH) procedure was examined through simulation in a differential item functioning (DIF) investigation context in which statistical testing uses a prespecified level α for the decision on an item with respect to DIF. The standard MH…
Descriptors: Test Bias, Statistical Analysis, Sample Size, Error of Measurement
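A minimal sketch of the statistic at issue, assuming score strata have already been formed: the Mantel-Haenszel chi-square with and without the 0.5 continuity correction. The counts below are hypothetical, and the article's simulation design is not reproduced.

```python
# `tables` holds one 2x2 count table [[A, B], [C, D]] per score stratum,
# rows = reference/focal group, columns = correct/incorrect.
from scipy.stats import chi2

def mh_chi_square(tables, continuity_correction=True):
    sum_a, sum_e, sum_v = 0.0, 0.0, 0.0
    for (a, b), (c, d) in tables:
        n_ref, n_foc = a + b, c + d          # group sizes in this stratum
        m1, m0 = a + c, b + d                # correct / incorrect totals
        t = n_ref + n_foc                    # stratum total
        sum_a += a
        sum_e += n_ref * m1 / t              # E(A) under the no-DIF null
        sum_v += n_ref * n_foc * m1 * m0 / (t * t * (t - 1))
    diff = abs(sum_a - sum_e)
    if continuity_correction:
        diff = max(diff - 0.5, 0.0)          # Yates-style correction
    stat = diff ** 2 / sum_v
    return stat, chi2.sf(stat, df=1)

# Hypothetical counts for three score strata.
tables = [([20, 10], [15, 15]), ([30, 5], [25, 10]), ([40, 2], [35, 7])]
print(mh_chi_square(tables, continuity_correction=True))
print(mh_chi_square(tables, continuity_correction=False))
```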
Peer reviewed
Direct link
Roberts, James S.; Thompson, Vanessa M. – Applied Psychological Measurement, 2011
A marginal maximum a posteriori (MMAP) procedure was implemented to estimate item parameters in the generalized graded unfolding model (GGUM). Estimates from the MMAP method were compared with those derived from marginal maximum likelihood (MML) and Markov chain Monte Carlo (MCMC) procedures in a recovery simulation that varied sample size,…
Descriptors: Statistical Analysis, Markov Processes, Computation, Monte Carlo Methods
Peer reviewed
Direct link
Kim, Doyoung; De Ayala, R. J.; Ferdous, Abdullah A.; Nering, Michael L. – Applied Psychological Measurement, 2011
To realize the benefits of item response theory (IRT), one must have model-data fit. One facet of a model-data fit investigation involves assessing the tenability of the conditional item independence (CII) assumption. In this Monte Carlo study, the comparative performance of 10 indices for identifying conditional item dependence is assessed. The…
Descriptors: Item Response Theory, Monte Carlo Methods, Error of Measurement, Statistical Analysis
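The abstract does not list the 10 indices, but one widely used index of conditional item dependence, Yen's Q3 (correlations between IRT residuals), can be sketched as follows; the 2PL parameters and responses here are simulated for illustration only.

```python
import numpy as np

def q3_matrix(responses, theta, a, b):
    """Correlations of IRT residuals between item pairs (Yen's Q3)."""
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))  # 2PL probabilities
    resid = responses - p                                 # person-by-item residuals
    return np.corrcoef(resid, rowvar=False)               # item-by-item Q3

rng = np.random.default_rng(1)
theta = rng.standard_normal(500)
a = rng.uniform(0.8, 2.0, size=10)
b = rng.uniform(-2, 2, size=10)
p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))
responses = (rng.uniform(size=p.shape) < p).astype(float)

# Under conditional item independence, off-diagonal Q3 values hover near
# -1/(n_items - 1); large positive values flag locally dependent pairs.
print(np.round(q3_matrix(responses, theta, a, b), 2))
```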
Peer reviewed
Direct link
Rijmen, Frank; Manalo, Jonathan R.; von Davier, Alina A. – Applied Psychological Measurement, 2009
This article describes two methods for obtaining the standard errors of two commonly used population invariance measures of equating functions: the root mean square difference of the subpopulation equating functions from the overall equating function and the root expected mean square difference. The delta method relies on an analytical…
Descriptors: Error of Measurement, Sampling, Equated Scores, Statistical Analysis
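A rough sketch of the two invariance measures named in the abstract, under simplified assumptions: the equating functions below are hypothetical, and the normalizing constants used in published versions of RMSD/REMSD (e.g., division by a score standard deviation) are omitted. The article's delta-method and grouped-jackknife standard errors are not reproduced.

```python
import numpy as np

def rmsd(sub_equatings, overall, weights):
    """RMSD at each score point: sub_equatings is (n_subpops, n_scores)."""
    sq = (sub_equatings - overall) ** 2      # squared differences per score
    return np.sqrt(weights @ sq)             # weighted over subpopulations

def remsd(sub_equatings, overall, weights, score_probs):
    """Single-number summary: expected mean square difference over the scale."""
    msd = weights @ (sub_equatings - overall) ** 2
    return np.sqrt(msd @ score_probs)

# Hypothetical equating functions on a 0-10 raw-score scale.
scores = np.arange(11)
overall = scores + 0.5
sub = np.vstack([scores + 0.3, scores + 0.8])    # two subpopulations
w = np.array([0.6, 0.4])                         # subpopulation weights
probs = np.full(11, 1 / 11)                      # score distribution

print(rmsd(sub, overall, w))
print(remsd(sub, overall, w, probs))
```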
Peer reviewed
Direct link
Emons, Wilco H. M. – Applied Psychological Measurement, 2008
Person-fit methods are used to uncover atypical test performance as reflected in the pattern of scores on individual items in a test. Unlike parametric person-fit statistics, nonparametric person-fit statistics do not require fitting a parametric test theory model. This study investigates the effectiveness of generalizations of nonparametric…
Descriptors: Simulation, Nonparametric Statistics, Item Response Theory, Goodness of Fit
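As a concrete baseline, the sketch below computes one classical nonparametric person-fit statistic for dichotomous items, the count of Guttman errors; the polytomous generalizations the article investigates are not reproduced, and the data are hypothetical.

```python
import numpy as np

def guttman_errors(pattern, easiest_to_hardest):
    """Count pairs (easier item wrong, harder item right) in one pattern."""
    x = np.asarray(pattern)[easiest_to_hardest]   # reorder: easiest item first
    return sum(int(x[i] == 0 and x[j] == 1)
               for i in range(len(x)) for j in range(i + 1, len(x)))

p_correct = np.array([0.9, 0.8, 0.6, 0.4, 0.2])   # item easiness
order = np.argsort(-p_correct)                     # easiest-to-hardest permutation

print(guttman_errors([1, 1, 1, 0, 0], order))      # Guttman-consistent: 0 errors
print(guttman_errors([0, 0, 1, 1, 1], order))      # aberrant pattern: 6 errors
```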
Peer reviewed
Direct link
Lam, Tony C. M.; Kolic, Mary – Applied Psychological Measurement, 2008
Semantic incompatibility, an error in constructing measuring instruments for rating oneself, others, or objects, refers to the extent to which item wordings are incongruent with, and hence inappropriate for, scale labels and vice versa. This study examines the effects of semantic incompatibility on rating responses. Using a 2 × 2 factorial design…
Descriptors: Semantics, Rating Scales, Statistical Analysis, Academic Ability
Peer reviewed
Forsyth, Robert A. – Applied Psychological Measurement, 1978
This note shows that, under conditions specified by Levin and Subkoviak (TM 503 420), it is not necessary to specify the reliabilities of observed scores when comparing completely randomized designs with randomized block designs. Certain errors in their illustrative example are also discussed. (Author/CTM)
Descriptors: Analysis of Variance, Error of Measurement, Hypothesis Testing, Reliability
Peer reviewed
Levin, Joel R.; Subkoviak, Michael J. – Applied Psychological Measurement, 1978
Comments (TM 503 706) on an earlier article (TM 503 420) concerning the comparison of the completely randomized design and the randomized block design are acknowledged and appreciated. In addition, potentially misleading notions arising from these comments are addressed and clarified. (See also TM 503 708). (Author/CTM)
Descriptors: Analysis of Variance, Error of Measurement, Hypothesis Testing, Reliability
Peer reviewed
Forsyth, Robert A. – Applied Psychological Measurement, 1978
This note continues the discussion of earlier articles (TM 503 420, TM 503 706, and TM 503 707), comparing the completely randomized design with the randomized block design. (CTM)
Descriptors: Analysis of Variance, Error of Measurement, Hypothesis Testing, Reliability
Peer reviewed
Direct link
Sotaridona, Leonardo S.; van der Linden, Wim J.; Meijer, Rob R. – Applied Psychological Measurement, 2006
A statistical test for detecting answer copying on multiple-choice tests based on Cohen's kappa is proposed. The test is free of any assumptions on the response processes of the examinees suspected of copying and having served as the source, except for the usual assumption that these processes are probabilistic. Because the asymptotic null and…
Descriptors: Cheating, Test Items, Simulation, Statistical Analysis
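A minimal sketch of the agreement measure underlying the proposed test: Cohen's kappa computed on the option choices of a suspected copier and the alleged source. The article's null-distribution machinery for turning kappa into a formal test is omitted, and the answer strings are hypothetical.

```python
import numpy as np

def cohens_kappa(answers_1, answers_2, options):
    """Chance-corrected agreement between two multiple-choice answer vectors."""
    a1, a2 = np.asarray(answers_1), np.asarray(answers_2)
    p_obs = np.mean(a1 == a2)                        # observed agreement
    # Expected agreement if the two examinees chose options independently.
    p_exp = sum(np.mean(a1 == o) * np.mean(a2 == o) for o in options)
    return (p_obs - p_exp) / (1.0 - p_exp)

# Hypothetical 10-item responses with options A-D.
source = list("ABCDABCDAB")
copier = list("ABCDABCDCC")   # agrees with the source on 8 of 10 items
print(round(cohens_kappa(source, copier, "ABCD"), 3))
```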
Peer reviewed
Direct link
DeMars, Christine E. – Applied Psychological Measurement, 2004
Type I error rates were examined for several fit indices available in GGUM2000: extensions of Infit, Outfit, Andrich's χ², and the log-likelihood ratio χ². Infit and Outfit had Type I error rates much lower than nominal α. Andrich's χ² had Type I error rates much higher than nominal α, particularly for shorter tests or larger sample…
Descriptors: Likert Scales, Error of Measurement, Goodness of Fit, Psychological Studies
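For reference, the sketch below computes the dichotomous Rasch versions of the Infit and Outfit mean squares; GGUM2000's extensions to the unfolding model differ, and all parameters here are hypothetical.

```python
import numpy as np

def infit_outfit(responses, theta, b):
    """Infit/Outfit mean squares for one Rasch item across N persons."""
    p = 1.0 / (1.0 + np.exp(-(theta - b)))   # model probability of success
    var = p * (1.0 - p)
    resid_sq = (responses - p) ** 2
    outfit = np.mean(resid_sq / var)          # unweighted mean square
    infit = resid_sq.sum() / var.sum()        # information-weighted mean square
    return infit, outfit

rng = np.random.default_rng(2)
theta = rng.standard_normal(1000)
b = 0.3
responses = (rng.uniform(size=theta.size) < 1 / (1 + np.exp(-(theta - b)))).astype(float)
infit, outfit = infit_outfit(responses, theta, b)
print(f"Infit = {infit:.2f}, Outfit = {outfit:.2f}")   # both near 1.0 under fit
```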
Peer reviewed
Mellenbergh, Gideon J.; van der Linden, Wim J. – Applied Psychological Measurement, 1979
For six tests, coefficient delta as an index for internal optimality is computed. Internal optimality is defined as the magnitude of risk of the decision procedure with respect to the true score. Results are compared with an alternative index (coefficient kappa) for assessing the consistency of decisions. (Author/JKS)
Descriptors: Classification, Comparative Analysis, Decision Making, Error of Measurement
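A small sketch of coefficient kappa, the alternative index against which coefficient delta is compared; it treats decision consistency as chance-corrected agreement between pass/fail classifications from two parallel administrations. The data here are simulated for illustration, not taken from the six tests in the article.

```python
import numpy as np

def decision_kappa(decisions_1, decisions_2):
    """Chance-corrected consistency of two rounds of pass/fail decisions."""
    d1, d2 = np.asarray(decisions_1), np.asarray(decisions_2)
    p_obs = np.mean(d1 == d2)                        # consistent classifications
    # Agreement expected by chance from the two marginal passing rates.
    p_exp = d1.mean() * d2.mean() + (1 - d1.mean()) * (1 - d2.mean())
    return (p_obs - p_exp) / (1.0 - p_exp)

rng = np.random.default_rng(3)
true_score = rng.uniform(size=200)
pass_1 = (true_score + rng.normal(0, 0.1, 200) > 0.6).astype(int)
pass_2 = (true_score + rng.normal(0, 0.1, 200) > 0.6).astype(int)
print(round(decision_kappa(pass_1, pass_2), 3))
```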