Showing all 13 results
Peer reviewed
Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023
This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…
Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation
Peer reviewed
Duxbury, Scott W. – Sociological Methods & Research, 2023
This study shows that residual variation can cause problems related to scaling in exponential random graph models (ERGM). Residual variation is likely to exist when there are unmeasured variables in a model--even those uncorrelated with other predictors--or when the logistic form of the model is inappropriate. As a consequence, coefficients cannot…
Descriptors: Graphs, Scaling, Research Problems, Models
Peer reviewed
Kang, Hyeon-Ah; Lu, Ying; Chang, Hua-Hua – Applied Measurement in Education, 2017
Increasing use of item pools in large-scale educational assessments calls for an appropriate scaling procedure to achieve a common metric among field-tested items. The present study examines scaling procedures for developing a new item pool under a spiraled block linking design. Three scaling procedures are considered: (a) concurrent…
Descriptors: Item Response Theory, Accuracy, Educational Assessment, Test Items
Peer reviewed
Sturz, Bradley R.; Bell, Z. Kade; Bodily, Kent D. – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2018
During spatial reorientation, the use of local geometric cues (e.g., corner angles) and global geometric cues (e.g., principal axis) is differentially influenced by enclosure size. Local geometric cues exert more influence in large enclosures compared to small enclosures, whereas the use of global geometric cues is not influenced by changes in…
Descriptors: Spatial Ability, Comparative Analysis, Testing, Classification
Peer reviewed
Full text available as PDF on ERIC
Attali, Yigal; Saldivia, Luis; Jackson, Carol; Schuppan, Fred; Wanamaker, Wilbur – ETS Research Report Series, 2014
Previous investigations of the ability of content experts and test developers to estimate item difficulty have, for the most part, produced disappointing results. These investigations were based on a noncomparative method of independently rating the difficulty of items. In this article, we argue that, by eliciting comparative judgments of…
Descriptors: Test Items, Difficulty Level, Comparative Analysis, College Entrance Examinations
Peer reviewed
Johnson, Emily C.; Meade, Adam W.; DuVernet, Amy M. – Structural Equation Modeling: A Multidisciplinary Journal, 2009
Confirmatory factor analytic tests of measurement invariance (MI) require a referent indicator (RI) for model identification. Although the assumption that the RI is perfectly invariant across groups is acknowledged as problematic, the literature provides relatively little guidance for researchers to identify the conditions under which the practice…
Descriptors: Measurement, Validity, Factor Analysis, Models
Peer reviewed
Eglash, Ron; Krishnamoorthy, Mukkai; Sanchez, Jason; Woodbridge, Andrew – ACM Transactions on Computing Education, 2011
This article describes the use of fractal simulations of African design in a high school computing class. Fractal patterns--repetitions of shape at multiple scales--are a common feature in many aspects of African design. In African architecture we often see circular houses grouped in circular complexes, or rectangular houses in rectangular…
Descriptors: High School Students, Indigenous Knowledge, Ceremonies, African Culture
Tang, K. Linda; And Others – 1993
This study compared the performance of the LOGIST and BILOG computer programs on item response theory (IRT) based scaling and equating for the Test of English as a Foreign Language (TOEFL) using real and simulated data and two calibration structures. Applications of IRT for the TOEFL program are based on the three-parameter logistic (3PL) model.…
Descriptors: Comparative Analysis, Computer Simulation, Equated Scores, Estimation (Mathematics)
Peer reviewed
Burket, George R.; Yen, Wendy M. – Journal of Educational Measurement, 1997
Using simulated data modeled after real tests, a Thurstone method (L. Thurstone, 1925 and later) and three-parameter item response theory were compared for vertical scaling. Neither procedure produced artificial scale shrinkage, and both produced modest scale expansion for one simulated condition. (SLD)
Descriptors: Comparative Analysis, Item Response Theory, Scaling, Simulation
Peer reviewed
Liou, Michelle – Applied Psychological Measurement, 1990
The effect of scale selection on error in calibrating item and ability parameters was investigated, with particular reference to the standardized mean-squared difference (SMSD) statistic. Through simulation, three scaling methods for selecting the common scale were used to demonstrate their effects on SMSD values. (SLD)
Descriptors: Comparative Analysis, Computer Simulation, Equations (Mathematics), Mathematical Models
Peer reviewed
Custer, Michael; Omar, Md Hafidz; Pomplun, Mark – Applied Measurement in Education, 2006
This study compared vertical scaling results for the Rasch model from BILOG-MG and WINSTEPS. The item and ability parameters for the simulated vocabulary tests were scaled across 11 grades: kindergarten through 10th. Data were based on real data and were simulated under normal and skewed distribution assumptions. WINSTEPS and BILOG-MG were each…
Descriptors: Models, Scaling, Computer Software, Vocabulary
Peer reviewed
Fitzpatrick, Anne R.; And Others – Journal of Educational Measurement, 1996
One-parameter (1PPC) and two-parameter partial credit (2PPC) models were compared using real and simulated data with constructed response items present. Results suggest that the more flexible three-parameter logistic-2PPC model combination produces better model fit than the combination of the one-parameter logistic and the 1PPC models. (SLD)
Descriptors: Comparative Analysis, Constructed Response, Goodness of Fit, Performance Based Assessment
Morrison, Carol A.; Fitzpatrick, Steven J. – 1992
An attempt was made to determine which item response theory (IRT) equating method results in the least amount of equating error or "scale drift" when equating scores across one or more test forms. An internal anchor test design was employed with five different test forms, each consisting of 30 items, 10 in common with the base test and 5…
Descriptors: Comparative Analysis, Computer Simulation, Equated Scores, Error of Measurement