Publication Date
In 2025 | 8 |
Since 2024 | 13 |
Since 2021 (last 5 years) | 15 |
Since 2016 (last 10 years) | 24 |
Since 2006 (last 20 years) | 51 |
Descriptor
Robustness (Statistics) | 59 |
Test Reliability | 59 |
Test Validity | 28 |
Error of Measurement | 15 |
Evaluation Methods | 11 |
Foreign Countries | 11 |
Item Analysis | 11 |
Measures (Individuals) | 10 |
Evaluation Research | 9 |
Goodness of Fit | 9 |
Evaluation Problems | 8 |
Author
Yuan, Ke-Hai | 2 |
Zhang, Zhiyong | 2 |
Abela, John | 1 |
Alejandro García-Mas | 1 |
Alessandri, Guido | 1 |
Alexander G. Theodoridis | 1 |
Algina, James | 1 |
Anders Hjorth-Trolle | 1 |
Anders Holm | 1 |
Arkoudis, Sophia | 1 |
Bai, Yu | 1 |
Publication Type
Journal Articles | 52 |
Reports - Research | 33 |
Reports - Evaluative | 20 |
Reports - Descriptive | 4 |
Information Analyses | 2 |
Dissertations/Theses -… | 1 |
Numerical/Quantitative Data | 1 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 16 |
Elementary Secondary Education | 9 |
Postsecondary Education | 9 |
Secondary Education | 2 |
Elementary Education | 1 |
Grade 10 | 1 |
Grade 3 | 1 |
Grade 4 | 1 |
Grade 7 | 1 |
High Schools | 1 |
Audience
Researchers | 2 |
Location
Tennessee | 2 |
Australia | 1 |
California | 1 |
Canada | 1 |
China | 1 |
District of Columbia | 1 |
Ethiopia | 1 |
Finland (Helsinki) | 1 |
Florida | 1 |
Italy | 1 |
Japan | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 1 |
Courtney Bell; Jessalynn James; Eric S. Taylor; James Wyckoff – Journal of Policy Analysis and Management, 2025
We study the returns to experience in teaching, estimated using supervisor ratings from classroom observations. We describe the assumptions required to interpret changes in observation ratings over time as the causal effect of experience on performance. We compare two difference-in-differences strategies: the two-way fixed effects estimator common…
Descriptors: Lesson Observation Criteria, Teaching Experience, Teacher Evaluation, Supervisors
Yanxuan Qu; Sandip Sinharay – ETS Research Report Series, 2024
The goal of this paper is to find better ways to estimate the internal consistency reliability of scores on tests with a specific type of design that are often encountered in practice: tests with constructed-response items clustered into sections that are not parallel or tau-equivalent, and one of the sections has only one item. To estimate the…
Descriptors: Test Reliability, Essay Tests, Construct Validity, Error of Measurement
Hung-Yu Huang – Educational and Psychological Measurement, 2025
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…
Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability
Xijuan Zhang; Hao Wu – Structural Equation Modeling: A Multidisciplinary Journal, 2024
A full structural equation model (SEM) typically consists of both a measurement model (describing relationships between latent variables and observed scale items) and a structural model (describing relationships among latent variables). However, often researchers are primarily interested in testing hypotheses related to the structural model while…
Descriptors: Structural Equation Models, Goodness of Fit, Robustness (Statistics), Factor Structure
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…
Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory
Duane Knudson – Measurement in Physical Education and Exercise Science, 2025
Small sample sizes contribute to several problems in research and knowledge advancement. This conceptual replication study confirmed and extended the inflation of type II errors and confidence intervals in correlation analyses of small sample sizes common in kinesiology/exercise science. Current population data (N = 18, 230, & 464) on four…
Descriptors: Kinesiology, Exercise, Biomechanics, Movement Education
Daniel McNeish; Melissa G. Wolf – Structural Equation Modeling: A Multidisciplinary Journal, 2024
Despite the popularity of traditional fit index cutoffs like RMSEA ≤ 0.06 and CFI ≥ 0.95, several studies have noted issues with overgeneralizing traditional cutoffs. Computational methods have been proposed to avoid overgeneralization by deriving cutoffs specifically tailored to the characteristics…
Descriptors: Structural Equation Models, Cutting Scores, Generalizability Theory, Error of Measurement
Mikkel Helding Vembye; James Eric Pustejovsky; Therese Deocampo Pigott – Research Synthesis Methods, 2024
Sample size and statistical power are important factors to consider when planning a research synthesis. Power analysis methods have been developed for fixed effect or random effects models, but until recently these methods were limited to simple data structures with a single, independent effect per study. Recent work has provided power…
Descriptors: Sample Size, Robustness (Statistics), Effect Size, Social Science Research
Bang Quan Zheng; Peter M. Bentler – Structural Equation Modeling: A Multidisciplinary Journal, 2025
This paper aims to advocate for a balanced approach to model fit evaluation in structural equation modeling (SEM). The ongoing debate surrounding chi-square test statistics and fit indices has been characterized by ambiguity and controversy. Despite the acknowledged limitations of relying solely on the chi-square test, its careful application can…
Descriptors: Monte Carlo Methods, Structural Equation Models, Goodness of Fit, Robustness (Statistics)
Dandan Tang; Steven M. Boker; Xin Tong – Structural Equation Modeling: A Multidisciplinary Journal, 2025
The replication crisis in social and behavioral sciences has raised concerns about the reliability and validity of empirical studies. While research in the literature has explored contributing factors to this crisis, the issues related to analytical tools have received less attention. This study focuses on a widely used analytical tool -…
Descriptors: Test Validity, Factor Analysis, Replication (Evaluation), Social Science Research
Anders Holm; Anders Hjorth-Trolle; Robert Andersen – Sociological Methods & Research, 2025
Lagged dependent variables (LDVs) are often used as predictors in ordinary least squares (OLS) models in the social sciences. Although several estimators are commonly employed, little is known about their relative merits in the presence of classical measurement error and different longitudinal processes. We assess the performance of four commonly…
Descriptors: Elementary Education, Scores, Error of Measurement, Predictor Variables
Brent J. Goertzen; Kaley Klaus – Research & Practice in Assessment, 2023
When evaluating student learning, educators often employ scoring rubrics, for which quality can be determined through evaluating validity and reliability. This article discusses the norming process utilized in a graduate organizational leadership program for a capstone scoring rubric. Concepts of validity and reliability are discussed, as is the…
Descriptors: Graduate Students, Graduate Study, Graduate School Faculty, Scoring Rubrics
Zachary J. Roman; Patrick Schmidt; Jason M. Miller; Holger Brandt – Structural Equation Modeling: A Multidisciplinary Journal, 2024
Careless and insufficient effort responding (C/IER) is a situation where participants respond to survey instruments without considering the item content. This phenomenon adds noise to data, leading to erroneous inference. There are multiple approaches to identifying and accounting for C/IER in survey settings; of these approaches the best performing…
Descriptors: Structural Equation Models, Bayesian Statistics, Response Style (Tests), Robustness (Statistics)
Ruben Trigueros; Alejandro García-Mas – British Journal of Educational Psychology, 2025
Introduction: In recent years, the incorporation of novelty as a psychological need and the study of the frustration of needs have become a recurring theme in the research on psychological needs in the educational environment. Currently, there are two scales available to assess the frustration of basic psychological needs (FBN) in the context of…
Descriptors: Psychological Patterns, Well Being, Resilience (Psychology), Self Determination
Wang, Xinghua; Wang, Zhuo; Wang, Qiyun; Chen, Wenli; Pi, Zhongling – Journal of Computer Assisted Learning, 2021
Digital competence is critical for university students to adapt to and benefit from digitally enhanced learning. Prior studies on its measurement mostly focused on educators and relied on factor analyses. However, there is a lack of valid and convenient tools to measure university students' digital competence. This study aimed to develop a digital…
Descriptors: Electronic Learning, Technological Literacy, College Students, Measures (Individuals)