ERIC - Search Results

Showing all 14 results

Propensity Score Methods for Causal Inference and Generalization

Peer reviewed

Wendy Chan – Asia Pacific Education Review, 2024

As evidence from evaluation and experimental studies continues to influence decision and policymaking, applied researchers and practitioners require tools to derive valid and credible inferences. Over the past several decades, research in causal inference has progressed with the development and application of propensity scores. Since their…

Descriptors: Probability, Scores, Causal Models, Statistical Inference

A Comparison of Three Popular Methods for Handling Missing Data: Complete-Case Analysis, Inverse Probability Weighting, and Multiple Imputation

Peer reviewed

Roderick J. Little; James R. Carpenter; Katherine J. Lee – Sociological Methods & Research, 2024

Missing data are a pervasive problem in data analysis. Three common methods for addressing the problem are (a) complete-case analysis, where only units that are complete on the variables in an analysis are included; (b) weighting, where the complete cases are weighted by the inverse of an estimate of the probability of being complete; and (c)…

Descriptors: Foreign Countries, Probability, Robustness (Statistics), Responses

The BASIE (BAyeSian Interpretation of Estimates) Framework for Interpreting Findings from Impact Evaluations: A Practical Guide for Education Researchers. Toolkit. NCEE 2022-005

Peer reviewed

Deke, John; Finucane, Mariel; Thal, Daniel – National Center for Education Evaluation and Regional Assistance, 2022

BASIE is a framework for interpreting impact estimates from evaluations. It is an alternative to null hypothesis significance testing. This guide walks researchers through the key steps of applying BASIE, including selecting prior evidence, reporting impact estimates, interpreting impact estimates, and conducting sensitivity analyses. The guide…

Descriptors: Bayesian Statistics, Educational Research, Data Interpretation, Hypothesis Testing

Quasi-Experimental Designs for Causal Inference

Peer reviewed

Kim, Yongnam; Steiner, Peter – Educational Psychologist, 2016

When randomized experiments are infeasible, quasi-experimental designs can be exploited to evaluate causal treatment effects. The strongest quasi-experimental designs for causal inference are regression discontinuity designs, instrumental variable designs, matching and propensity score designs, and comparative interrupted time series designs. This…

Descriptors: Quasiexperimental Design, Causal Models, Statistical Inference, Randomized Controlled Trials

Rethinking Teacher Evaluation: A Conversation about Statistical Inferences and Value-Added Models

Peer reviewed

Callister Everson, Kimberlee; Feinauer, Erika; Sudweeks, Richard R. – Harvard Educational Review, 2013

In this article, the authors provide a methodological critique of the current standard of value-added modeling forwarded in educational policy contexts as a means of measuring teacher effectiveness. Conventional value-added estimates of teacher quality are attempts to determine to what degree a teacher would theoretically contribute, on average,…

Descriptors: Teacher Evaluation, Teacher Effectiveness, Evaluation Methods, Accountability

Taking the Missing Propensity into Account When Estimating Competence Scores: Evaluation of Item Response Theory Models for Nonignorable Omissions

Peer reviewed

Köhler, Carmen; Pohl, Steffi; Carstensen, Claus H. – Educational and Psychological Measurement, 2015

When competence tests are administered, subjects frequently omit items. These missing responses pose a threat to correctly estimating the proficiency level. Newer model-based approaches aim to take nonignorable missing data processes into account by incorporating a latent missing propensity into the measurement model. Two assumptions are typically…

Descriptors: Competence, Tests, Evaluation Methods, Adults

Killeen's (2005) "p[subscript rep]" Coefficient: Logical and Mathematical Problems

Peer reviewed

Maraun, Michael; Gabriel, Stephanie – Psychological Methods, 2010

In his article, "An Alternative to Null-Hypothesis Significance Tests," Killeen (2005) urged the discipline to abandon the practice of "p[subscript obs]"-based null hypothesis testing and to quantify the signal-to-noise characteristics of experimental outcomes with replication probabilities. He described the coefficient that he…

Descriptors: Hypothesis Testing, Statistical Inference, Probability, Statistical Significance

"p[subscript rep]" Replicates: Comment Prompted by Iverson, Wagenmakers, and Lee (2010); Lecoutre, Lecoutre, and Poitevineau (2010); and Maraun and Gabriel (2010)

Peer reviewed

Killeen, Peter R. – Psychological Methods, 2010

Lecoutre, Lecoutre, and Poitevineau (2010) have provided sophisticated grounding for "p[subscript rep]." Computing it precisely appears, fortunately, no more difficult than doing so approximately. Their analysis will help move predictive inference into the mainstream. Iverson, Wagenmakers, and Lee (2010) have also validated…

Descriptors: Replication (Evaluation), Measurement Techniques, Research Design, Research Methodology

Killeen's Probability of Replication and Predictive Probabilities: How to Compute, Use, and Interpret Them

Peer reviewed

Lecoutre, Bruno; Lecoutre, Marie-Paule; Poitevineau, Jacques – Psychological Methods, 2010

P. R. Killeen's (2005a) probability of replication ("p[subscript rep]") of an experimental result is the fiducial Bayesian predictive probability of finding a same-sign effect in a replication of an experiment. "p[subscript rep]" is now routinely reported in "Psychological Science" and has also begun to appear in…

Descriptors: Research Methodology, Guidelines, Probability, Computation

A Model-Averaging Approach to Replication: The Case of "p[subscript rep]"

Peer reviewed

Iverson, Geoffrey J.; Wagenmakers, Eric-Jan; Lee, Michael D. – Psychological Methods, 2010

The purpose of the recently proposed "p[subscript rep]" statistic is to estimate the probability of concurrence, that is, the probability that a replicate experiment yields an effect of the same sign (Killeen, 2005a). The influential journal "Psychological Science" endorses "p[subscript rep]" and recommends its use…

Descriptors: Effect Size, Evaluation Methods, Probability, Experiments

Regarding "p[subscript rep]": Comment Prompted by Iverson, Wagenmakers, and Lee (2010); Lecoutre, Lecoutre, and Poitevineau (2010); and Maraun and Gabriel (2010)

Peer reviewed

Serlin, Ronald C. – Psychological Methods, 2010

The sense that replicability is an important aspect of empirical science led Killeen (2005a) to define "p[subscript rep]," the probability that a replication will result in an outcome in the same direction as that found in a current experiment. Since then, several authors have praised and criticized "p[subscript rep]," culminating…

Descriptors: Epistemology, Effect Size, Replication (Evaluation), Measurement Techniques

Replication, "p[subscript rep]," and Confidence Intervals: Comment Prompted by Iverson, Wagenmakers, and Lee (2010); Lecoutre, Lecoutre, and Poitevineau (2010); and Maraun and Gabriel (2010)

Peer reviewed

Cumming, Geoff – Psychological Methods, 2010

This comment offers three descriptions of "p[subscript rep]" that start with a frequentist account of confidence intervals, draw on R. A. Fisher's fiducial argument, and do not make Bayesian assumptions. Links are described among "p[subscript rep]," "p" values, and the probability a confidence interval will capture…

Descriptors: Replication (Evaluation), Measurement Techniques, Research Methodology, Validity

Balkanization and Unification of Probabilistic Inferences

Yu, Chong-Ho – Online Submission, 2005

Many research-related classes in social sciences present probability as a unified approach based upon mathematical axioms, but neglect the diversity of various probability theories and their associated philosophical assumptions. Although currently the dominant statistical and probabilistic approach is the Fisherian tradition, the use of Fisherian…

Descriptors: Probability, Inferences, Social Sciences, Statistical Significance

Specifying and Refining a Measurement Model for a Simulation-Based Assessment. CSE Report 619.

Levy, Roy; Mislevy, Robert J. – US Department of Education, 2004

The challenges of modeling students' performance in simulation-based assessments include accounting for multiple aspects of knowledge and skill that arise in different situations and the conditional dependencies among multiple aspects of performance in a complex assessment. This paper describes a Bayesian approach to modeling and estimating…

Descriptors: Probability, Markov Processes, Monte Carlo Methods, Bayesian Statistics
