Teck Kiang Tan – Practical Assessment, Research & Evaluation, 2024
The procedures for carrying out factorial invariance testing to validate a construct are well developed and ensure the reliability of a construct used across groups for comparison and analysis, yet they are mainly restricted to the frequentist approach. This motivates an update to incorporate the growing Bayesian approach for carrying out the Bayesian…
Descriptors: Bayesian Statistics, Factor Analysis, Programming Languages, Reliability
Oscar Clivio; Avi Feller; Chris Holmes – Grantee Submission, 2024
Reweighting a distribution to minimize a distance to a target distribution is a powerful and flexible strategy for estimating a wide range of causal effects, but can be challenging in practice because optimal weights typically depend on knowledge of the underlying data generating process. In this paper, we focus on design-based weights, which do…
Descriptors: Evaluation Methods, Causal Models, Error of Measurement, Guidelines
Wallin, Gabriel; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2023
This study explores the usefulness of covariates on equating test scores from nonequivalent test groups. The covariates are captured by an estimated propensity score, which is used as a proxy for latent ability to balance the test groups. The objective is to assess the sensitivity of the equated scores to various misspecifications in the…
Descriptors: Models, Error of Measurement, Robustness (Statistics), Equated Scores
Demarest, Leila; Langer, Arnim – Sociological Methods & Research, 2022
While conflict event data sets are increasingly used in contemporary conflict research, important concerns persist regarding the quality of the collected data. Such concerns are not necessarily new. Yet, because the methodological debate and evidence on potential errors remain scattered across different subdisciplines of social sciences, there is…
Descriptors: Guidelines, Research Methodology, Conflict, Social Science Research
Nathaniel Josephs; Dennis M. Feehan; Forrest W. Crawford – Sociological Methods & Research, 2024
The network scale-up method (NSUM) is a survey-based method for estimating the number of individuals in a hidden or hard-to-reach subgroup of a general population. In NSUM surveys, sampled individuals report how many others they know in the subpopulation of interest (e.g. "How many sex workers do you know?") and how many others they know…
Descriptors: Sample Size, Surveys, Population Groups, Epidemiology
Wang, Shaojie; Zhang, Minqiang; Lee, Won-Chan; Huang, Feifei; Li, Zonglong; Li, Yixing; Yu, Sufang – Journal of Educational Measurement, 2022
Traditional IRT characteristic curve linking methods ignore parameter estimation errors, which may undermine the accuracy of estimated linking constants. Two new linking methods are proposed that take into account parameter estimation errors. The item- (IWCC) and test-information-weighted characteristic curve (TWCC) methods employ weighting…
Descriptors: Item Response Theory, Error of Measurement, Accuracy, Monte Carlo Methods
Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022
Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…
Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations
van Zundert, Camiel H. J.; Miocevic, Milica – Research Synthesis Methods, 2020
Synthesizing findings about the indirect (mediated) effect plays an important role in determining the mechanism through which variables affect one another. This simulation study compared six methods for synthesizing indirect effects: correlation-based MASEM, parameter-based MASEM, marginal likelihood synthesis, an adjustment to marginal likelihood…
Descriptors: Correlation, Comparative Analysis, Meta Analysis, Bayesian Statistics
Ippel, Lianne; Magis, David – Educational and Psychological Measurement, 2020
In the dichotomous item response theory (IRT) framework, the asymptotic standard error (ASE) is the most common statistic to evaluate the precision of various ability estimators. Easy-to-use ASE formulas are readily available; however, the accuracy of some of these formulas was recently questioned and new ASE formulas were derived from a general…
Descriptors: Item Response Theory, Error of Measurement, Accuracy, Standards
Clauser, Brian E.; Kane, Michael; Clauser, Jerome C. – Journal of Educational Measurement, 2020
An Angoff standard setting study generally yields judgments on a number of items by a number of judges (who may or may not be nested in panels). Variability associated with judges (and possibly panels) contributes error to the resulting cut score. The variability associated with items plays a more complicated role. To the extent that the mean item…
Descriptors: Cutting Scores, Generalization, Decision Making, Standard Setting
Chang, Heesun – Language Assessment Quarterly, 2022
Drawing on the framework of invariant measurement from Rasch measurement theory, the purpose of this study is to psychometrically evaluate the 20 language and teaching skill domains of the International Teaching Assistant (ITA) Test using the many-facet Rasch model and to empirically explore performance differences between females and males in…
Descriptors: Teaching Assistants, Grammar, Second Language Learning, Second Language Instruction
Johnson, Donald M.; Shoulders, Catherine W. – Journal of Agricultural Education, 2017
As members of a profession committed to the dissemination of rigorous research pertaining to agricultural education, authors publishing in the Journal of Agricultural Education (JAE) must seek methods to evaluate and, when necessary, improve their research methods. The purpose of this study was to describe how authors of manuscripts published in…
Descriptors: Statistical Analysis, Agricultural Education, Effect Size, Risk
Li, Ming; Harring, Jeffrey R. – Educational and Psychological Measurement, 2017
Researchers continue to be interested in efficient, accurate methods of estimating coefficients of covariates in mixture modeling. Including covariates related to the latent class analysis not only may improve the ability of the mixture model to clearly differentiate between subjects but also makes interpretation of latent group membership more…
Descriptors: Simulation, Comparative Analysis, Monte Carlo Methods, Guidelines
Willse, John T. – Measurement and Evaluation in Counseling and Development, 2017
This article provides a brief introduction to the Rasch model. Motivation for using Rasch analyses is provided. Important Rasch model concepts and key aspects of result interpretation are introduced, with major points reinforced using a simulation demonstration. Concrete guidelines are provided regarding sample size and the evaluation of items.
Descriptors: Item Response Theory, Test Results, Test Interpretation, Simulation
Ryoo, Ji Hoon; Tai, Robert H.; Skeeles-Worley, Angela D. – Research in Science Education, 2020
In longitudinal studies, measurement invariance is required to conduct substantive comparisons over time or across groups. In this study, we examined measurement invariance on a recently developed instrument capturing student preferences for seven instructional strategies related to science learning and career interest. We have labeled these seven…
Descriptors: Guidelines, Longitudinal Studies, Comparative Analysis, Error of Measurement