ERIC - Search Results

Showing all 13 results

Maintaining Score Scales over Time: A Comparison of Five Scoring Methods

Peer reviewed

Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023

This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…

Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation

The Problem of Scaling in Exponential Random Graph Models

Peer reviewed

Duxbury, Scott W. – Sociological Methods & Research, 2023

This study shows that residual variation can cause problems related to scaling in exponential random graph models (ERGM). Residual variation is likely to exist when there are unmeasured variables in a model--even those uncorrelated with other predictors--or when the logistic form of the model is inappropriate. As a consequence, coefficients cannot…

Descriptors: Graphs, Scaling, Research Problems, Models

IRT Item Parameter Scaling for Developing New Item Pools

Peer reviewed

Kang, Hyeon-Ah; Lu, Ying; Chang, Hua-Hua – Applied Measurement in Education, 2017

Increasing use of item pools in large-scale educational assessments calls for an appropriate scaling procedure to achieve a common metric among field-tested items. The present study examines scaling procedures for developing a new item pool under a spiraled block linking design. Three scaling procedures are considered: (a) concurrent…

Descriptors: Item Response Theory, Accuracy, Educational Assessment, Test Items

Environmental Scaling Influences the Use of Local but Not Global Geometric Cues during Spatial Reorientation

Peer reviewed

Sturz, Bradley R.; Bell, Z. Kade; Bodily, Kent D. – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2018

During spatial reorientation, the use of local geometric cues (e.g., corner angles) and global geometric cues (e.g., principal axis) is differentially influenced by enclosure size. Local geometric cues exert more influence in large enclosures compared to small enclosures, whereas the use of global geometric cues is not influenced by changes in…

Descriptors: Spatial Ability, Comparative Analysis, Testing, Classification

Estimating Item Difficulty with Comparative Judgments. Research Report. ETS RR-14-39

Peer reviewed

Attali, Yigal; Saldivia, Luis; Jackson, Carol; Schuppan, Fred; Wanamaker, Wilbur – ETS Research Report Series, 2014

Previous investigations of the ability of content experts and test developers to estimate item difficulty have, for the most part, produced disappointing results. These investigations were based on a noncomparative method of independently rating the difficulty of items. In this article, we argue that, by eliciting comparative judgments of…

Descriptors: Test Items, Difficulty Level, Comparative Analysis, College Entrance Examinations

The Role of Referent Indicators in Tests of Measurement Invariance

Peer reviewed

Johnson, Emily C.; Meade, Adam W.; DuVernet, Amy M. – Structural Equation Modeling: A Multidisciplinary Journal, 2009

Confirmatory factor analytic tests of measurement invariance (MI) require a referent indicator (RI) for model identification. Although the assumption that the RI is perfectly invariant across groups is acknowledged as problematic, the literature provides relatively little guidance for researchers to identify the conditions under which the practice…

Descriptors: Measurement, Validity, Factor Analysis, Models

Fractal Simulations of African Design in Pre-College Computing Education

Peer reviewed

Eglash, Ron; Krishnamoorthy, Mukkai; Sanchez, Jason; Woodbridge, Andrew – ACM Transactions on Computing Education, 2011

This article describes the use of fractal simulations of African design in a high school computing class. Fractal patterns--repetitions of shape at multiple scales--are a common feature in many aspects of African design. In African architecture we often see circular houses grouped in circular complexes, or rectangular houses in rectangular…

Descriptors: High School Students, Indigenous Knowledge, Ceremonies, African Culture

The Effect of Small Calibration Sample Sizes on TOEFL IRT-Based Equating.

Tang, K. Linda; And Others – 1993

This study compared the performance of the LOGIST and BILOG computer programs on item response theory (IRT) based scaling and equating for the Test of English as a Foreign Language (TOEFL) using real and simulated data and two calibration structures. Applications of IRT for the TOEFL program are based on the three-parameter logistic (3PL) model.…

Descriptors: Comparative Analysis, Computer Simulation, Equated Scores, Estimation (Mathematics)

Comparison of Item Response Theory and Thurstone Methods of Vertical Scaling.

Peer reviewed

Burket, George R.; Yen, Wendy M. – Journal of Educational Measurement, 1997

Using simulated data modeled after real tests, a Thurstone method (L. Thurstone, 1925 and later) and three-parameter item response theory were compared for vertical scaling. Neither procedure produced artificial scale shrinkage, and both produced modest scale expansion for one simulated condition. (SLD)

Descriptors: Comparative Analysis, Item Response Theory, Scaling, Simulation

Effect of Scale Adjustment on the Comparison of Item and Ability Parameters.

Peer reviewed

Liou, Michelle – Applied Psychological Measurement, 1990

The effect of scale selection on error in calibrating item and ability parameters was investigated, with particular reference to the standardized mean-squared difference (SMSD) statistic. Through simulation, three scaling methods for selecting the common scale were used to demonstrate their effects on SMSD values. (SLD)

Descriptors: Comparative Analysis, Computer Simulation, Equations (Mathematics), Mathematical Models

Vertical Scaling with the Rasch Model Utilizing Default and Tight Convergence Settings with WINSTEPS and BILOG-MG

Peer reviewed

Custer, Michael; Omar, Md Hafidz; Pomplun, Mark – Applied Measurement in Education, 2006

This study compared vertical scaling results for the Rasch model from BILOG-MG and WINSTEPS. The item and ability parameters for the simulated vocabulary tests were scaled across 11 grades: kindergarten through 10th. Data were based on real data and were simulated under normal and skewed distribution assumptions. WINSTEPS and BILOG-MG were each…

Descriptors: Models, Scaling, Computer Software, Vocabulary

Scaling Performance Assessments: A Comparison of One-Parameter and Two-Parameter Partial Credit Models.

Peer reviewed

Fitzpatrick, Anne R.; And Others – Journal of Educational Measurement, 1996

One-parameter (1PPC) and two-parameter partial credit (2PPC) models were compared using real and simulated data with constructed response items present. Results suggest that the more flexible three-parameter logistic-2PPC model combination produces better model fit than the combination of the one-parameter logistic and the 1PPC models. (SLD)

Descriptors: Comparative Analysis, Constructed Response, Goodness of Fit, Performance Based Assessment

Direct and Indirect Equating: A Comparison of Four Methods Using the Rasch Model.

Morrison, Carol A.; Fitzpatrick, Steven J. – 1992

An attempt was made to determine which item response theory (IRT) equating method results in the least amount of equating error or "scale drift" when equating scores across one or more test forms. An internal anchor test design was employed with five different test forms, each consisting of 30 items, 10 in common with the base test and 5…

Descriptors: Comparative Analysis, Computer Simulation, Equated Scores, Error of Measurement
