ERIC - Search Results

Showing all 15 results

Estimating Reliability for Tests with One Constructed-Response Item in a Section. Research Report. ETS RR-24-07

Peer reviewed

Yanxuan Qu; Sandip Sinharay – ETS Research Report Series, 2024

The goal of this paper is to find better ways to estimate the internal consistency reliability of scores on tests with a specific type of design that are often encountered in practice: tests with constructed-response items clustered into sections that are not parallel or tau-equivalent, and one of the sections has only one item. To estimate the…

Descriptors: Test Reliability, Essay Tests, Construct Validity, Error of Measurement

Exploring the Influence of Response Styles on Continuous Scale Assessments: Insights from a Novel Modeling Approach

Peer reviewed

Hung-Yu Huang – Educational and Psychological Measurement, 2025

The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…

Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability

Investigating Structural Model Fit Evaluation

Peer reviewed

Xijuan Zhang; Hao Wu – Structural Equation Modeling: A Multidisciplinary Journal, 2024

A full structural equation model (SEM) typically consists of both a measurement model (describing relationships between latent variables and observed scale items) and a structural model (describing relationships among latent variables). However, often researchers are primarily interested in testing hypotheses related to the structural model while…

Descriptors: Structural Equation Models, Goodness of Fit, Robustness (Statistics), Factor Structure

Confirming Increased Statistical Errors in Testing Correlations from Small Sample Sizes

Peer reviewed

Duane Knudson – Measurement in Physical Education and Exercise Science, 2025

Small sample sizes contribute to several problems in research and knowledge advancement. This conceptual replication study confirmed and extended the inflation of type II errors and confidence intervals in correlation analyses of small sample sizes common in kinesiology/exercise science. Current population data (N = 18, 230, & 464) on four…

Descriptors: Kinesiology, Exercise, Biomechanics, Movement Education

Direct Discrepancy Dynamic Fit Index Cutoffs for Arbitrary Covariance Structure Models

Peer reviewed

Daniel McNeish; Melissa G. Wolf – Structural Equation Modeling: A Multidisciplinary Journal, 2024

Despite the popularity of traditional fit index cutoffs like RMSEA ≤ 0.06 and CFI ≥ 0.95, several studies have noted issues with overgeneralizing traditional cutoffs. Computational methods have been proposed to avoid overgeneralization by deriving cutoffs specifically tailored to the characteristics…

Descriptors: Structural Equation Models, Cutting Scores, Generalizability Theory, Error of Measurement

Enhancing Model Fit Evaluation in SEM: Practical Tips for Optimizing Chi-Square Tests

Peer reviewed

Bang Quan Zheng; Peter M. Bentler – Structural Equation Modeling: A Multidisciplinary Journal, 2025

This paper aims to advocate for a balanced approach to model fit evaluation in structural equation modeling (SEM). The ongoing debate surrounding chi-square test statistics and fit indices has been characterized by ambiguity and controversy. Despite the acknowledged limitations of relying solely on the chi-square test, its careful application can…

Descriptors: Monte Carlo Methods, Structural Equation Models, Goodness of Fit, Robustness (Statistics)

Are the Signs of Factor Loadings Arbitrary in Confirmatory Factor Analysis? Problems and Solutions

Peer reviewed

Dandan Tang; Steven M. Boker; Xin Tong – Structural Equation Modeling: A Multidisciplinary Journal, 2025

The replication crisis in social and behavioral sciences has raised concerns about the reliability and validity of empirical studies. While research in the literature has explored contributing factors to this crisis, the issues related to analytical tools have received less attention. This study focuses on a widely used analytical tool -…

Descriptors: Test Validity, Factor Analysis, Replication (Evaluation), Social Science Research

Lagged Dependent Variable Predictors, Classical Measurement Error, and Path Dependency: The Conditions under Which Various Estimators Are Appropriate

Peer reviewed

Anders Holm; Anders Hjorth-Trolle; Robert Andersen – Sociological Methods & Research, 2025

Lagged dependent variables (LDVs) are often used as predictors in ordinary least squares (OLS) models in the social sciences. Although several estimators are commonly employed, little is known about their relative merits in the presence of classical measurement error and different longitudinal processes. We assess the performance of four commonly…

Descriptors: Elementary Education, Scores, Error of Measurement, Predictor Variables

Robust Coefficients Alpha and Omega and Confidence Intervals with Outlying Observations and Missing Data: Methods and Software

Peer reviewed

Zhang, Zhiyong; Yuan, Ke-Hai – Educational and Psychological Measurement, 2016

Cronbach's coefficient alpha is a widely used reliability measure in social, behavioral, and education sciences. It is reported in nearly every study that involves measuring a construct through multiple items. With non-tau-equivalent items, McDonald's omega has been used as a popular alternative to alpha in the literature. Traditional estimation…

Descriptors: Computation, Statistical Analysis, Robustness (Statistics), Error of Measurement

Robust Coefficients Alpha and Omega and Confidence Intervals with Outlying Observations and Missing Data: Methods and Software

Peer reviewed

Zhang, Zhiyong; Yuan, Ke-Hai – Grantee Submission, 2016

Descriptors: Computation, Error of Measurement, Robustness (Statistics), Statistical Analysis

Worth Weighting? How to Think about and Use Weights in Survey Experiments

Peer reviewed

Luke W. Miratrix; Jasjeet S. Sekhon; Alexander G. Theodoridis; Luis F. Campos – Grantee Submission, 2018

The popularity of online surveys has increased the prominence of using weights that capture units' probabilities of inclusion for claims of representativeness. Yet, much uncertainty remains regarding how these weights should be employed in analysis of survey experiments: Should they be used or ignored? If they are used, which estimators are…

Descriptors: Online Surveys, Weighted Scores, Data Interpretation, Robustness (Statistics)

What Response Rates Are Needed to Make Reliable Inferences from Student Evaluations of Teaching?

Peer reviewed

Zumrawi, Abdel Azim; Bates, Simon P.; Schroeder, Marianne – Educational Research and Evaluation, 2014

This paper addresses the determination of statistically desirable response rates in students' surveys, with emphasis on assessing the effect of underlying variability in the student evaluation of teaching (SET). We discuss factors affecting the determination of adequate response rates and highlight challenges caused by non-response and lack of…

Descriptors: Inferences, Test Reliability, Response Rates (Questionnaires), Student Evaluation of Teacher Performance

The Public Understanding of Error in Educational Assessment

Peer reviewed

Gardner, John – Oxford Review of Education, 2013

Evidence from recent research suggests that in the UK the public perception of errors in national examinations is that they are simply mistakes; events that are preventable. This perception predominates over the more sophisticated technical view that errors arise from many sources and create an inevitable variability in assessment outcomes. The…

Descriptors: Educational Assessment, Public Opinion, Error of Measurement, Foreign Countries

Reliability Analysis for the Internationally Administered 2002 Series GED Tests. GED Testing Service® Research Studies, 2009-3

Setzer, J. Carl; He, Yi – GED Testing Service, 2009

Reliability refers to the consistency, or stability, of test scores when the authors administer the measurement procedure repeatedly to groups of examinees (American Educational Research Association [AERA], American Psychological…

Descriptors: Educational Research, Error of Measurement, Scores, Test Reliability

Is There Something Else Wrong with the FCAT? A Closer Look at Reading Gains in Middle School. Research Brief. Volume 0701

Froman, Terry – Research Services, Miami-Dade County Public Schools, 2007

Because 3rd Grade Florida Comprehensive Assessment Test (FCAT) scores have a direct impact on promotion, the results for that grade level are released early by the State. When the FCAT results for 3rd Grade were released in May 2007, many people were troubled. Over 80% of the elementary schools in the Miami-Dade School District showed a decrease…

Descriptors: Reading Achievement, Scoring, Grade 3, Academic Achievement
