Showing 1 to 15 of 42 results
Peer reviewed
Stefanie A. Wind; Benjamin Lugu; Yurou Wang – International Journal of Testing, 2025
Mokken Scale Analysis (MSA) is a nonparametric approach that offers exploratory tools for understanding the nature of item responses while emphasizing invariance requirements. MSA is often discussed as it relates to Rasch measurement theory, which also emphasizes invariance, but uses parametric models. Researchers who have compared and combined…
Descriptors: Item Response Theory, Scaling, Surveys, Evaluation Methods
Peer reviewed
Tülin Otbiçer Acar – Measurement: Interdisciplinary Research and Perspectives, 2024
The aim of this study is to compare the results of correlation coefficient estimation of reliability with those obtained through the Bland-Altman plot technique. The scale was first divided into two halves using three different approaches. A linear and high-level relationship was found between the scale scores obtained from the halved forms.…
Descriptors: High School Students, Measurement Techniques, Psychometrics, Comparative Testing
Peer reviewed
Ke-Hai Yuan; Ling Ling; Zhiyong Zhang – Grantee Submission, 2024
Data in social and behavioral sciences typically contain measurement errors and do not have predefined metrics. Structural equation modeling (SEM) is widely used for the analysis of such data, where the scales of the manifest and latent variables are often subjective. This article studies how the model, parameter estimates, their standard errors…
Descriptors: Structural Equation Models, Computation, Social Science Research, Error of Measurement
Peer reviewed
Ke-Hai Yuan; Ling Ling; Zhiyong Zhang – Structural Equation Modeling: A Multidisciplinary Journal, 2024
Data in social and behavioral sciences typically contain measurement errors and do not have predefined metrics. Structural equation modeling (SEM) is widely used for the analysis of such data, where the scales of the manifest and latent variables are often subjective. This article studies how the model, parameter estimates, their standard errors…
Descriptors: Structural Equation Models, Computation, Social Science Research, Error of Measurement
Peer reviewed
Little, Todd D.; Bontempo, Daniel; Rioux, Charlie; Tracy, Allison – International Journal of Research & Method in Education, 2022
Multilevel modelling (MLM) is the most frequently used approach for evaluating interventions with clustered data. MLM, however, has some limitations that are associated with numerous obstacles to model estimation and valid inferences. Longitudinal multiple-group (LMG) modelling is a longstanding approach for testing intervention effects using…
Descriptors: Longitudinal Studies, Hierarchical Linear Modeling, Alternative Assessment, Intervention
Peer reviewed
Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023
This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…
Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation
Peer reviewed
Sanford R. Student; Derek C. Briggs; Laurie Davis – Educational Measurement: Issues and Practice, 2025
Vertical scales are frequently developed using common item nonequivalent group linking. In this design, one can use upper-grade, lower-grade, or mixed-grade common items to estimate the linking constants that underlie the absolute measurement of growth. Using the Rasch model and a dataset from Curriculum Associates' i-Ready Diagnostic in math in…
Descriptors: Elementary School Mathematics, Elementary School Students, Middle School Mathematics, Middle School Students
Peer reviewed
Dimitrov, Dimiter M. – Educational and Psychological Measurement, 2020
This study presents new models for item response functions (IRFs) in the framework of the D-scoring method (DSM) that is gaining attention in the field of educational and psychological measurement and large-scale assessments. In a previous work on DSM, the IRFs of binary items were estimated using a logistic regression model (LRM). However, the LRM…
Descriptors: Item Response Theory, Scoring, True Scores, Scaling
Peer reviewed
Litwok, Daniel; Peck, Laura R. – American Journal of Evaluation, 2019
In experimental evaluations of policy interventions, the so-called Bloom adjustment is commonly used to estimate the impact of the treatment on the treated. It does so by rescaling the estimated impact of the intention to treat--that is, the overall treatment-control group difference in outcomes for the entire experimental sample--by the…
Descriptors: Computation, Outcomes of Treatment, Program Evaluation, Scaling
Peer reviewed
Zieger, Laura Raffaella; Jerrim, J.; Anders, J.; Shure, N. – Assessment in Education: Principles, Policy & Practice, 2022
The OECD's Programme for International Student Assessment (PISA) has become one of the key studies for evidence-based education policymaking across the globe. PISA has, however, received a lot of methodological criticism, including criticism of how the test scores are created. The aim of this paper is to investigate the so-called 'conditioning model', where…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Reardon, Sean F.; Ho, Andrew D.; Kalogrides, Demetra – Stanford Center for Education Policy Analysis, 2019
Linking score scales across different tests is considered speculative and fraught, even at the aggregate level (Feuer et al., 1999; Thissen, 2007). We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that…
Descriptors: Test Validity, Evaluation Methods, School Districts, Scores
Peer reviewed
Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2018
Educational assessment data are often collected from a set of test centers across various geographic regions, and therefore the data samples contain clusters. Such cluster-based data may result in clustering effects in variance estimation. However, in many grouped jackknife variance estimation applications, jackknife groups are often formed by a…
Descriptors: Item Response Theory, Scaling, Equated Scores, Cluster Grouping
Peer reviewed
Willse, John T. – Measurement and Evaluation in Counseling and Development, 2017
This article provides a brief introduction to the Rasch model. Motivation for using Rasch analyses is provided. Important Rasch model concepts and key aspects of result interpretation are introduced, with major points reinforced using a simulation demonstration. Concrete guidelines are provided regarding sample size and the evaluation of items.
Descriptors: Item Response Theory, Test Results, Test Interpretation, Simulation
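Illustrative note (not taken from the article): the record above mentions the Rasch model and a simulation demonstration. As a rough sketch only, the standard dichotomous Rasch model gives the probability of a correct response as a logistic function of person ability minus item difficulty. The function names, the seed, and the example values below are hypothetical and chosen purely for illustration; they do not reproduce the article's simulation.

```python
import math
import random


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    theta: person ability (logits); b: item difficulty (logits).
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def simulate_responses(thetas, difficulties, seed=0):
    """Draw a persons-by-items matrix of 0/1 responses.

    The seed is an arbitrary illustrative choice so runs are repeatable.
    """
    rng = random.Random(seed)
    return [
        [1 if rng.random() < rasch_probability(t, b) else 0 for b in difficulties]
        for t in thetas
    ]


if __name__ == "__main__":
    # Hypothetical abilities and difficulties, in logits.
    abilities = [-1.0, 0.0, 1.0]
    items = [-0.5, 0.0, 0.5]
    for row in simulate_responses(abilities, items):
        print(row)
```

When ability equals difficulty the model gives a probability of exactly 0.5, which is why item difficulties and person abilities can be read on the same logit scale.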
Peer reviewed
Cheek, Kim A. – Research in Science Education, 2017
Ideas about temporal (and spatial) scale impact students' understanding across science disciplines. Learners have difficulty comprehending the long time periods associated with natural processes because they have no referent for the magnitudes involved. When people have a good "feel" for quantity, they estimate cardinal number magnitude…
Descriptors: Foreign Countries, Scientific Concepts, Science Education, Spatial Ability
Peer reviewed
Hidalgo, Ma Dolores; Benítez, Isabel; Padilla, Jose-Luis; Gómez-Benito, Juana – Sociological Methods & Research, 2017
The growing use of scales in survey questionnaires makes it necessary to address how polytomous differential item functioning (DIF) affects observed scale score comparisons. The aim of this study is to investigate the impact of DIF on the Type I error and effect size of the independent-samples t-test on the observed total scale scores. A…
Descriptors: Test Items, Test Bias, Item Response Theory, Surveys