Peer reviewed
ERIC Number: ED677734
Record Type: Non-Journal
Publication Date: 2025-Oct-11
Pages: N/A
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: N/A
Available Date: 0000-00-00
The Energy Distance between Covariate Distributions and Assessments of Generalization
Wendy Chan; Jimin Oh; Katherine J. Strickland
Society for Research on Educational Effectiveness
Background: The generalizability of a study refers to the extent to which the results and inferences from a sample apply to individuals in a larger target population of inference (Shadish et al., 2002). The strongest tool for facilitating generalizations is random or probability sampling, which is rare in educational studies (Olsen et al., 2013). Statistical research on generalization has focused on improving generalizations using propensity scores, which model the sample selection process as a function of observable covariates (Stuart et al., 2011; Tipton, 2013; O'Muircheartaigh & Hedges, 2014). When designing experimental studies for generalization, researchers often evaluate how "like" the target population the recruited sample is on covariates that are assumed to moderate the treatment impact (Tipton, 2014). Four statistics have been proposed to assess the similarity between a sample and a population: (1) the standardized mean difference (SMD; Stuart et al., 2011) in propensity scores, (2) the average SMD in covariates between the sample and population (Tipton et al., 2017), (3) the overlap in propensity score distributions (Chan, 2022), and (4) the generalizability index (B-index; Tipton, 2014). When these statistics suggest "strong generalizability" between the sample and population, bias-reduced estimation of the population average treatment effect (PATE) is possible. An important question is how differences between the study sample and population, whether in the types of covariates or in their distributions, affect the distribution of the generalizability statistics and, consequently, assessments of generalization. The purpose of the current study is to examine the relationship between the distributional distances of the sample and population covariates and the generalizability statistics.
We focus on two types of distance, the energy distance (Székely & Rizzo, 2004) and the Hellinger distance (González-Castro et al., 2013), and on how their values affect assessments of generalization.
Generalizability Statistics: The SMD in propensity scores/logits is computed by taking the difference in average estimated propensity scores between the sample and population and standardizing this difference by the standard deviation of the population propensity scores. Similarly, the average SMD in covariates is calculated by computing the standardized difference for each individual covariate and then averaging these differences. In general, values close to zero are associated with strong similarity between the sample and population. Overlap is defined as the proportion of units in the population whose estimated propensity scores fall within the range of the sample propensity scores. Finally, the generalizability index (B-index) uses the Bhattacharyya (1943, 1946) distance between the sample and population propensity scores to quantify the similarity. Both overlap and the B-index take values between zero and one, where values close to one imply that the sample is "highly generalizable," or similar, to the population based on the covariates.
Measures of Distance: Research on measures of distributional distance has been extensive (Székely & Rizzo, 2013). The current study focuses on two distance measures, the energy distance and the Hellinger distance, and assesses their relationship to the generalizability statistics. Let X be a vector of p observable covariates for schools in a study sample (of size n) and population (of size N), where X ∈ ℝ^p. Let G and H be two finite-mean distribution functions on ℝ^p and let Z, Z′ ~ iid G and V, V′ ~ iid H. The energy distance between G and H is given by: [equation omitted] (1) The function ‖·‖₂ is the Euclidean norm.
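Because Equation (1) is omitted from this record, the following is only a minimal sketch based on the standard Székely and Rizzo form of the energy distance, 2E‖Z − V‖₂ − E‖Z − Z′‖₂ − E‖V − V′‖₂, estimated by replacing each expectation with an average over all pairs of rows. This unnormalized form is not bounded by one, so the authors' version may include a normalization that is not shown here. The function names, the random seed, and the toy data are assumptions introduced for illustration.

```python
import numpy as np

def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all (row of a, row of b) pairs."""
    diffs = a[:, None, :] - b[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1)).mean()

def energy_distance(sample, population):
    """Plug-in estimate of 2E|Z-V| - E|Z-Z'| - E|V-V'| (unnormalized)."""
    sample = np.asarray(sample, dtype=float)
    population = np.asarray(population, dtype=float)
    cross = mean_pairwise_distance(sample, population)
    within_sample = mean_pairwise_distance(sample, sample)
    within_population = mean_pairwise_distance(population, population)
    return 2.0 * cross - within_sample - within_population

# Toy data (assumed): 500 population units, 5 standard-normal covariates.
rng = np.random.default_rng(0)
population = rng.normal(size=(500, 5))
sample = population[rng.choice(500, size=15, replace=False)]
shifted = sample + 1.0  # a sample whose covariate means are all shifted by one

print("random subsample:", energy_distance(sample, population))
print("shifted sample:  ", energy_distance(shifted, population))
```

In this sketch the shifted sample should produce a clearly larger distance than the random subsample, which is the kind of sensitivity to distributional differences the abstract describes.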
The Hellinger distance is a type of f-divergence and is similar to the Bhattacharyya distance used to compute the B-index (Hellinger, 1909; Bhattacharyya, 1946). Formally, given two probability distributions Z, V, the Hellinger distance is given by: [equation omitted] (2) Both distance measures take values between 0 and 1, where values close to one are associated with greater distributional differences. In our context, these measures quantify the difference between the covariate distributions in the sample and population. We focus on these two measures because they are sensitive to differences in distributions and because the energy distance remains stable under perturbations of the support of the distributions measured (Huling & Mak, 2024). Additionally, both distances are easy and efficient to compute. We center our study on them to understand how different values of the distance measures relate to various assessments of generalization, as quantified by the four generalizability statistics.
Simulation Study and Results: We conducted a simulation study to examine the relationship between the energy and Hellinger distances and the generalizability statistics. We fixed the population size at N = 500 and selected samples of size n = 0.03N = 15, parameter choices informed by several empirical generalization studies. We varied the number of covariates (5, 10, 20), all normally distributed, and examined two scenarios in which the covariates were either independent or correlated. For the correlated case, the average correlation was either weak (ρ = 0.05) or moderate (ρ = 0.40). Because the results were similar across conditions, we report a subset: Figures 1-4 show the values of the average energy and Hellinger distances as a function of the four generalizability statistics for the 5-covariate case, and Figures 5-8 show the same for the 20-covariate case, both under the independent covariates framework.
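The simulation design above can be sketched roughly as follows. The record does not state the sample selection mechanism, how the Hellinger distance was estimated in several dimensions, or the exact correlation structure, so this sketch assumes a simple random sample, a common pairwise correlation, and a per-covariate histogram estimate of the standard Hellinger distance, sqrt(0.5 · Σ(√p − √q)²). All function names and the number of bins are hypothetical.

```python
import numpy as np

def simulate_population(n_units, n_covariates, rho, rng):
    """Draw standard-normal covariates with a common pairwise correlation rho."""
    cov = np.full((n_covariates, n_covariates), float(rho))
    np.fill_diagonal(cov, 1.0)
    return rng.multivariate_normal(np.zeros(n_covariates), cov, size=n_units)

def hellinger_histogram(a, b, bins=10):
    """Hellinger distance between two 1-D samples using shared histogram bins."""
    edges = np.histogram_bin_edges(np.concatenate([a, b]), bins=bins)
    p = np.histogram(a, bins=edges)[0] / len(a)
    q = np.histogram(b, bins=edges)[0] / len(b)
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

# Design values taken from the abstract: N = 500, n = 15, 5 covariates, rho = 0.40.
N, n, p, rho = 500, 15, 5, 0.40
rng = np.random.default_rng(1)
population = simulate_population(N, p, rho, rng)

# Assumption: a simple random sample; the study's selection model is not given.
sample = population[rng.choice(N, size=n, replace=False)]

# Assumption: average the per-covariate distances to get one summary value.
per_covariate = [hellinger_histogram(sample[:, j], population[:, j]) for j in range(p)]
print("average Hellinger distance:", np.mean(per_covariate))
```

With only 15 sampled units, a histogram estimate like this one is coarse, so the choice of bins can noticeably change the value.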
For five covariates, increases in the average energy and Hellinger distances are associated with weaker assessments of generalization (SMDs > 0.25). For both overlap and the B-index, stronger assessments of generalization (values above 0.80, given by the dashed lines) are associated with average values of less than 0.50 on both the energy and Hellinger distances. Of the two measures, the average energy distances appear to be smaller overall when assessments of generalization are strong. Similarly, distances less than 0.50 are associated with smaller average SMDs in both propensity scores and covariates. These trends are also seen in the 20-covariate case. Together, the results suggest that a cutoff of 0.50 on the distance measures may separate weak from strong assessments of generalization.
Conclusion: Our preliminary simulation results suggest that energy and Hellinger distances greater than 0.50 are associated with weak assessments of generalization. Our study will continue investigating the conditions under which certain values of both distances lead to specific assessments of generalization.
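The three propensity-score-based statistics behind these thresholds can be sketched from the verbal definitions in the abstract. This is an illustration only: the propensity scores below are hypothetical draws rather than estimates from a fitted selection model, and the B-index is approximated here as a binned Bhattacharyya coefficient, Σ√(p·q), which is an assumption about how the index is computed.

```python
import numpy as np

def smd(sample_ps, population_ps):
    """Difference in mean propensity scores, scaled by the population SD."""
    return (np.mean(sample_ps) - np.mean(population_ps)) / np.std(population_ps, ddof=1)

def overlap(sample_ps, population_ps):
    """Share of population scores that fall within the range of the sample scores."""
    low, high = np.min(sample_ps), np.max(sample_ps)
    return np.mean((population_ps >= low) & (population_ps <= high))

def b_index(sample_ps, population_ps, bins=10):
    """Binned Bhattacharyya coefficient between the two score distributions."""
    edges = np.histogram_bin_edges(np.concatenate([sample_ps, population_ps]), bins=bins)
    p = np.histogram(sample_ps, bins=edges)[0] / len(sample_ps)
    q = np.histogram(population_ps, bins=edges)[0] / len(population_ps)
    return np.sum(np.sqrt(p * q))

# Hypothetical propensity scores (assumed, not estimated from data).
rng = np.random.default_rng(2)
population_ps = rng.beta(2.0, 5.0, size=500)
sample_ps = rng.beta(2.5, 5.0, size=15)

# Rules of thumb quoted in the abstract: |SMD| <= 0.25, overlap and B-index >= 0.80.
print("SMD:    ", smd(sample_ps, population_ps))
print("overlap:", overlap(sample_ps, population_ps))
print("B-index:", b_index(sample_ps, population_ps))
```

The fourth statistic, the average SMD in covariates, would apply the same standardization to each covariate column before averaging.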
Society for Research on Educational Effectiveness. 2040 Sheridan Road, Evanston, IL 60208. Tel: 202-495-0920; e-mail: contact@sree.org; Web site: https://www.sree.org/
Publication Type: Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: Society for Research on Educational Effectiveness (SREE)
Grant or Contract Numbers: N/A
Author Affiliations: N/A