ERIC - Search Results

Showing all 10 results

Methodological Reflections on PISA's Creative Thinking Assessment

Peer reviewed

Rutkowski, Leslie; Rutkowski, David – Journal of Creative Behavior, 2025

The Programme for International Student Assessment (PISA) introduced creative thinking as an innovative domain in 2022. This paper examines the unique methodological issues in international assessments and the implications of measuring creative thinking within PISA's framework, including stratified sampling, rotated form designs, and a distinct…

Descriptors: Creativity, Creative Thinking, Measurement, Sampling

Ready for Causal Research: A National Evaluability Assessment of Career and Technical Education Programs (Final Report)

Peer reviewed

Hughes, Katherine L.; Miller, Trey; Reese, Kelly – Grantee Submission, 2021

This report from the Career and Technical Education (CTE) Research Network Lead team provides final results from an evaluability assessment of CTE programs that feasibly could be evaluated using a rigorous experimental design. Evaluability assessments (also called feasibility studies) are used in education and other fields, such as international…

Descriptors: Program Evaluation, Vocational Education, Evaluation Methods, Educational Research

Estimating Causal Effects of Education Interventions Using a Two-Rating Regression Discontinuity Design: Lessons from a Simulation Study and an Application

Peer reviewed

Porter, Kristin E.; Reardon, Sean F.; Unlu, Fatih; Bloom, Howard S.; Cimpian, Joseph R. – Journal of Research on Educational Effectiveness, 2017

A valuable extension of the single-rating regression discontinuity design (RDD) is a multiple-rating RDD (MRRDD). To date, four main methods have been used to estimate average treatment effects at the multiple treatment frontiers of an MRRDD: the "surface" method, the "frontier" method, the "binding-score" method, and…

Descriptors: Regression (Statistics), Intervention, Quasiexperimental Design, Simulation

"p[subscript rep]" Replicates: Comment Prompted by Iverson, Wagenmakers, and Lee (2010); Lecoutre, Lecoutre, and Poitevineau (2010); and Maraun and Gabriel (2010)

Peer reviewed

Killeen, Peter R. – Psychological Methods, 2010

Lecoutre, Lecoutre, and Poitevineau (2010) have provided sophisticated grounding for "p[subscript rep]." Computing it precisely appears, fortunately, no more difficult than doing so approximately. Their analysis will help move predictive inference into the mainstream. Iverson, Wagenmakers, and Lee (2010) have also validated…

Descriptors: Replication (Evaluation), Measurement Techniques, Research Design, Research Methodology

The Case for Use of Simple Difference Scores to Test the Significance of Differences in Mean Rates of Change in Controlled Repeated Measurements Designs

Peer reviewed

Overall, John E.; Tonidandel, Scott – Multivariate Behavioral Research, 2010

A previous Monte Carlo study examined the relative powers of several simple and more complex procedures for testing the significance of difference in mean rates of change in a controlled, longitudinal, treatment evaluation study. Results revealed that the relative powers depended on the correlation structure of the simulated repeated measurements.…

Descriptors: Monte Carlo Methods, Statistical Significance, Correlation, Depression (Psychology)

Killeen's Probability of Replication and Predictive Probabilities: How to Compute, Use, and Interpret Them

Peer reviewed

Lecoutre, Bruno; Lecoutre, Marie-Paule; Poitevineau, Jacques – Psychological Methods, 2010

P. R. Killeen's (2005a) probability of replication ("p[subscript rep]") of an experimental result is the fiducial Bayesian predictive probability of finding a same-sign effect in a replication of an experiment. "p[subscript rep]" is now routinely reported in "Psychological Science" and has also begun to appear in…

Descriptors: Research Methodology, Guidelines, Probability, Computation

Regarding "p[subscript rep]": Comment Prompted by Iverson, Wagenmakers, and Lee (2010); Lecoutre, Lecoutre, and Poitevineau (2010); and Maraun and Gabriel (2010)

Peer reviewed

Serlin, Ronald C. – Psychological Methods, 2010

The sense that replicability is an important aspect of empirical science led Killeen (2005a) to define "p[subscript rep]," the probability that a replication will result in an outcome in the same direction as that found in a current experiment. Since then, several authors have praised and criticized "p[subscript rep]," culminating…

Descriptors: Epistemology, Effect Size, Replication (Evaluation), Measurement Techniques

Significance Testing: Necessary but Insufficient.

Peer reviewed

Suen, Hoi K. – Topics in Early Childhood Special Education, 1992

This commentary on EC 603 695 argues that significance testing is a necessary but insufficient condition for positivistic research, that judgment-based assessment and single-subject research are not substitutes for significance testing, and that sampling fluctuation should be considered as one of numerous epistemological concerns in any…

Descriptors: Evaluation Methods, Evaluative Thinking, Research Design, Research Methodology

A Call for Statistical Reform in EAQ

Peer reviewed

Byrd, Jimmy K. – Educational Administration Quarterly, 2007

Purpose: The purpose of this study was to review research published by Educational Administration Quarterly (EAQ) during the past 10 years to determine if confidence intervals and effect sizes were being reported as recommended by the American Psychological Association (APA) Publication Manual. Research Design: The author examined 49 volumes of…

Descriptors: Research Design, Intervals, Statistical Inference, Effect Size

Large-Group Fantasies versus Single-Subject Science.

Peer reviewed

Da Prato, Robert A. – Topics in Early Childhood Special Education, 1992

This paper argues that judgment-based assessment of data from multiply replicated single-subject or small-N studies should replace normative-based (p < 0.05) assessment of large-N research in the clinical sciences, and asserts that inferential statistics should be abandoned as a method of evaluating clinical research data. (Author/JDD)

Descriptors: Evaluation Methods, Evaluative Thinking, Norms, Research Design
