Showing all 15 results
Peer reviewed
Yanxuan Qu; Sandip Sinharay – ETS Research Report Series, 2024
The goal of this paper is to find better ways to estimate the internal consistency reliability of scores on tests with a specific type of design that are often encountered in practice: tests with constructed-response items clustered into sections that are not parallel or tau-equivalent, and one of the sections has only one item. To estimate the…
Descriptors: Test Reliability, Essay Tests, Construct Validity, Error of Measurement
Peer reviewed
Hung-Yu Huang – Educational and Psychological Measurement, 2025
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…
Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability
Peer reviewed
Xijuan Zhang; Hao Wu – Structural Equation Modeling: A Multidisciplinary Journal, 2024
A full structural equation model (SEM) typically consists of both a measurement model (describing relationships between latent variables and observed scale items) and a structural model (describing relationships among latent variables). However, often researchers are primarily interested in testing hypotheses related to the structural model while…
Descriptors: Structural Equation Models, Goodness of Fit, Robustness (Statistics), Factor Structure
Peer reviewed
Duane Knudson – Measurement in Physical Education and Exercise Science, 2025
Small sample sizes contribute to several problems in research and knowledge advancement. This conceptual replication study confirmed and extended the inflation of type II errors and confidence intervals in correlation analyses of small sample sizes common in kinesiology/exercise science. Current population data (N = 18, 230, & 464) on four…
Descriptors: Kinesiology, Exercise, Biomechanics, Movement Education
Peer reviewed
Daniel McNeish; Melissa G. Wolf – Structural Equation Modeling: A Multidisciplinary Journal, 2024
Despite the popularity of traditional fit index cutoffs like RMSEA ≤ 0.06 and CFI ≥ 0.95, several studies have noted issues with overgeneralizing traditional cutoffs. Computational methods have been proposed to avoid overgeneralization by deriving cutoffs specifically tailored to the characteristics…
Descriptors: Structural Equation Models, Cutting Scores, Generalizability Theory, Error of Measurement
Peer reviewed
Bang Quan Zheng; Peter M. Bentler – Structural Equation Modeling: A Multidisciplinary Journal, 2025
This paper aims to advocate for a balanced approach to model fit evaluation in structural equation modeling (SEM). The ongoing debate surrounding chi-square test statistics and fit indices has been characterized by ambiguity and controversy. Despite the acknowledged limitations of relying solely on the chi-square test, its careful application can…
Descriptors: Monte Carlo Methods, Structural Equation Models, Goodness of Fit, Robustness (Statistics)
Peer reviewed
Dandan Tang; Steven M. Boker; Xin Tong – Structural Equation Modeling: A Multidisciplinary Journal, 2025
The replication crisis in social and behavioral sciences has raised concerns about the reliability and validity of empirical studies. While research in the literature has explored contributing factors to this crisis, the issues related to analytical tools have received less attention. This study focuses on a widely used analytical tool -…
Descriptors: Test Validity, Factor Analysis, Replication (Evaluation), Social Science Research
Peer reviewed
Anders Holm; Anders Hjorth-Trolle; Robert Andersen – Sociological Methods & Research, 2025
Lagged dependent variables (LDVs) are often used as predictors in ordinary least squares (OLS) models in the social sciences. Although several estimators are commonly employed, little is known about their relative merits in the presence of classical measurement error and different longitudinal processes. We assess the performance of four commonly…
Descriptors: Elementary Education, Scores, Error of Measurement, Predictor Variables
Peer reviewed
Zhang, Zhiyong; Yuan, Ke-Hai – Educational and Psychological Measurement, 2016
Cronbach's coefficient alpha is a widely used reliability measure in social, behavioral, and education sciences. It is reported in nearly every study that involves measuring a construct through multiple items. With non-tau-equivalent items, McDonald's omega has been used as a popular alternative to alpha in the literature. Traditional estimation…
Descriptors: Computation, Statistical Analysis, Robustness (Statistics), Error of Measurement
Peer reviewed
Zhang, Zhiyong; Yuan, Ke-Hai – Grantee Submission, 2016
Cronbach's coefficient alpha is a widely used reliability measure in social, behavioral, and education sciences. It is reported in nearly every study that involves measuring a construct through multiple items. With non-tau-equivalent items, McDonald's omega has been used as a popular alternative to alpha in the literature. Traditional estimation…
Descriptors: Computation, Error of Measurement, Robustness (Statistics), Statistical Analysis
Luke W. Miratrix; Jasjeet S. Sekhon; Alexander G. Theodoridis; Luis F. Campos – Grantee Submission, 2018
The popularity of online surveys has increased the prominence of using weights that capture units' probabilities of inclusion for claims of representativeness. Yet, much uncertainty remains regarding how these weights should be employed in analysis of survey experiments: Should they be used or ignored? If they are used, which estimators are…
Descriptors: Online Surveys, Weighted Scores, Data Interpretation, Robustness (Statistics)
Peer reviewed
Zumrawi, Abdel Azim; Bates, Simon P.; Schroeder, Marianne – Educational Research and Evaluation, 2014
This paper addresses the determination of statistically desirable response rates in students' surveys, with emphasis on assessing the effect of underlying variability in the student evaluation of teaching (SET). We discuss factors affecting the determination of adequate response rates and highlight challenges caused by non-response and lack of…
Descriptors: Inferences, Test Reliability, Response Rates (Questionnaires), Student Evaluation of Teacher Performance
Peer reviewed
Gardner, John – Oxford Review of Education, 2013
Evidence from recent research suggests that in the UK the public perception of errors in national examinations is that they are simply mistakes; events that are preventable. This perception predominates over the more sophisticated technical view that errors arise from many sources and create an inevitable variability in assessment outcomes. The…
Descriptors: Educational Assessment, Public Opinion, Error of Measurement, Foreign Countries
Setzer, J. Carl; He, Yi – GED Testing Service, 2009
Reliability Analysis for the Internationally Administered 2002 Series GED (General Educational Development) Tests. Reliability refers to the consistency, or stability, of test scores when the authors administer the measurement procedure repeatedly to groups of examinees (American Educational Research Association [AERA], American Psychological…
Descriptors: Educational Research, Error of Measurement, Scores, Test Reliability
Froman, Terry – Research Services, Miami-Dade County Public Schools, 2007
Because 3rd Grade Florida Comprehensive Assessment Test (FCAT) scores have a direct impact on promotion, the results for that grade level are released early by the State. When the FCAT results for 3rd Grade were released in May 2007, many people were troubled. Over 80% of the elementary schools in the Miami-Dade School District showed a decrease…
Descriptors: Reading Achievement, Scoring, Grade 3, Academic Achievement