ERIC - Search Results

Showing all 14 results

Minimal-Effect Testing, Equivalence Testing, and the Conventional Null Hypothesis Testing for the Analysis of Bi-Factor Models

Peer reviewed

Shunji Wang; Katerina M. Marcoulides; Jiashan Tang; Ke-Hai Yuan – Structural Equation Modeling: A Multidisciplinary Journal, 2024

A necessary step in applying bi-factor models is to evaluate the need for domain factors with a general factor in place. The conventional null hypothesis testing (NHT) was commonly used for such a purpose. However, the conventional NHT meets challenges when the domain loadings are weak or the sample size is insufficient. This article proposes…

Descriptors: Hypothesis Testing, Error of Measurement, Comparative Analysis, Monte Carlo Methods

Towards Representation Learning for Weighting Problems in Design-Based Causal Inference

Peer reviewed

Oscar Clivio; Avi Feller; Chris Holmes – Grantee Submission, 2024

Reweighting a distribution to minimize a distance to a target distribution is a powerful and flexible strategy for estimating a wide range of causal effects, but can be challenging in practice because optimal weights typically depend on knowledge of the underlying data generating process. In this paper, we focus on design-based weights, which do…

Descriptors: Evaluation Methods, Causal Models, Error of Measurement, Guidelines

Synthetic Controls with Staggered Adoption

Peer reviewed

Ben-Michael, Eli; Feller, Avi; Rothstein, Jesse – Grantee Submission, 2022

Staggered adoption of policies by different units at different times creates promising opportunities for observational causal inference. Estimation remains challenging, however, and common regression methods can give misleading results. A promising alternative is the synthetic control method (SCM), which finds a weighted average of control units…

Descriptors: Causal Models, Statistical Inference, Computation, Evaluation Methods

The BASIE (BAyeSian Interpretation of Estimates) Framework for Interpreting Findings from Impact Evaluations: A Practical Guide for Education Researchers. Toolkit. NCEE 2022-005

Peer reviewed

Deke, John; Finucane, Mariel; Thal, Daniel – National Center for Education Evaluation and Regional Assistance, 2022

BASIE is a framework for interpreting impact estimates from evaluations. It is an alternative to null hypothesis significance testing. This guide walks researchers through the key steps of applying BASIE, including selecting prior evidence, reporting impact estimates, interpreting impact estimates, and conducting sensitivity analyses. The guide…

Descriptors: Bayesian Statistics, Educational Research, Data Interpretation, Hypothesis Testing

On the Treatment of Missing Data in Background Questionnaires in Educational Large-Scale Assessments: An Evaluation of Different Procedures

Peer reviewed

Grund, Simon; Lüdtke, Oliver; Robitzsch, Alexander – Journal of Educational and Behavioral Statistics, 2021

Large-scale assessments (LSAs) use Mislevy's "plausible value" (PV) approach to relate student proficiency to noncognitive variables administered in a background questionnaire. This method requires background variables to be completely observed, a requirement that is seldom fulfilled. In this article, we evaluate and compare the…

Descriptors: Data Analysis, Error of Measurement, Research Problems, Statistical Inference

Estimating Causal Effects of Education Interventions Using a Two-Rating Regression Discontinuity Design: Lessons from a Simulation Study and an Application

Peer reviewed

Porter, Kristin E.; Reardon, Sean F.; Unlu, Fatih; Bloom, Howard S.; Cimpian, Joseph R. – Journal of Research on Educational Effectiveness, 2017

A valuable extension of the single-rating regression discontinuity design (RDD) is a multiple-rating RDD (MRRDD). To date, four main methods have been used to estimate average treatment effects at the multiple treatment frontiers of an MRRDD: the "surface" method, the "frontier" method, the "binding-score" method, and…

Descriptors: Regression (Statistics), Intervention, Quasiexperimental Design, Simulation

Evaluating Bias of Sequential Mixed-Mode Designs against Benchmark Surveys

Peer reviewed

Klausch, Thomas; Schouten, Barry; Hox, Joop J. – Sociological Methods & Research, 2017

This study evaluated three types of bias--total, measurement, and selection bias (SB)--in three sequential mixed-mode designs of the Dutch Crime Victimization Survey: telephone, mail, and web, where nonrespondents were followed up face-to-face (F2F). In the absence of true scores, all biases were estimated as mode effects against two different…

Descriptors: Evaluation Methods, Statistical Bias, Sequential Approach, Benchmarking

Taking the Missing Propensity into Account When Estimating Competence Scores: Evaluation of Item Response Theory Models for Nonignorable Omissions

Peer reviewed

Köhler, Carmen; Pohl, Steffi; Carstensen, Claus H. – Educational and Psychological Measurement, 2015

When competence tests are administered, subjects frequently omit items. These missing responses pose a threat to correctly estimating the proficiency level. Newer model-based approaches aim to take nonignorable missing data processes into account by incorporating a latent missing propensity into the measurement model. Two assumptions are typically…

Descriptors: Competence, Tests, Evaluation Methods, Adults

What Works Clearinghouse Procedures and Standards Handbook, Version 3.0

Peer reviewed

What Works Clearinghouse, 2014

This "What Works Clearinghouse Procedures and Standards Handbook (Version 3.0)" provides a detailed description of the standards and procedures of the What Works Clearinghouse (WWC). The remaining chapters of this Handbook are organized to take the reader through the basic steps that the WWC uses to develop a review protocol, identify…

Descriptors: Educational Research, Guides, Intervention, Classification

Commentary: Are Three Waves of Data Sufficient for Assessing Mediation?

Peer reviewed

Reichardt, Charles S. – Multivariate Behavioral Research, 2011

Maxwell, Cole, and Mitchell (2011) demonstrated that simple structural equation models, when used with cross-sectional data, generally produce biased estimates of mediated effects. I extend those results by showing how simple structural equation models can produce biased estimates of mediated effects when used even with longitudinal data. Even…

Descriptors: Structural Equation Models, Statistical Data, Longitudinal Studies, Error of Measurement

A Proposed New "What if Reliability" Analysis for Assessing the Statistical Significance of Bivariate Relationships

Peer reviewed

Onwuegbuzie, Anthony J.; Roberts, J. Kyle; Daniel, Larry G. – Measurement and Evaluation in Counseling and Development, 2005

In this article, the authors (a) illustrate how displaying disattenuated correlation coefficients alongside their unadjusted counterparts will allow researchers to assess the impact of unreliability on bivariate relationships and (b) demonstrate how a proposed new "what if reliability" analysis can complement null hypothesis significance…

Descriptors: Correlation, Statistical Significance, Reliability, Error of Measurement

Evaluation of the Magnitude of Differential Item Functioning in Polytomous Items. Program Statistics Research Technical Report No. 94-2.

Zwick, Rebecca; Thayer, Dorothy T. – 1994

Several recent studies have investigated the application of statistical inference procedures to the analysis of differential item functioning (DIF) in test items that are scored on an ordinal scale. Mantel's extension of the Mantel-Haenszel test is a possible hypothesis-testing method for this purpose. The development of descriptive statistics for…

Descriptors: Error of Measurement, Evaluation Methods, Hypothesis Testing, Item Bias

In Search of Golden Rules: Comment on Hypothesis-Testing Approaches to Setting Cutoff Values for Fit Indexes and Dangers in Overgeneralizing Hu and Bentler's (1999) Findings

Peer reviewed

Marsh, Herbert W.; Hau, Kit-Tai; Wen, Zhonglin – Structural Equation Modeling, 2004

Goodness-of-fit (GOF) indexes provide "rules of thumb"--recommended cutoff values for assessing fit in structural equation modeling. Hu and Bentler (1999) proposed a more rigorous approach to evaluating decision rules based on GOF indexes and, on this basis, proposed new and more stringent cutoff values for many indexes. This article discusses…

Descriptors: Statistical Significance, Structural Equation Models, Evaluation Methods, Evaluation Research

Applying Generalizability Theory To Evaluate Treatment Effect in Single-Subject Research.

Lefebvre, Daniel J.; Suen, Hoi K. – 1990

An empirical investigation of methodological issues associated with evaluating treatment effect in single-subject research (SSR) designs is presented. This investigation: (1) conducted a generalizability (G) study to identify the sources of systematic and random measurement error (SRME); (2) used an analytic approach based on G theory to integrate…

Descriptors: Classroom Observation Techniques, Disabilities, Educational Research, Error of Measurement
