Showing all 13 results
Peer reviewed
Daniel Koretz – Journal of Educational and Behavioral Statistics, 2024
A critically important balance in educational measurement between practical concerns and matters of technique has atrophied in recent decades, and as a result, some important issues in the field have not been adequately addressed. I start with the work of E. F. Lindquist, who exemplified the balance that is now wanting. Lindquist was arguably the…
Descriptors: Educational Assessment, Evaluation Methods, Achievement Tests, Educational History
Peer reviewed
Leslie Rutkowski; David Rutkowski – Journal of Creative Behavior, 2025
The Programme for International Student Assessment (PISA) introduced creative thinking as an innovative domain in 2022. This paper examines the unique methodological issues in international assessments and the implications of measuring creative thinking within PISA's framework, including stratified sampling, rotated form designs, and a distinct…
Descriptors: Creativity, Creative Thinking, Measurement, Sampling
Gongjun Xu; Tony Sit; Lan Wang; Chiung-Yu Huang – Grantee Submission, 2017
Biased sampling occurs frequently in economics, epidemiology, and medical studies, either by design or because of the data-collection mechanism. Failing to take into account the sampling bias usually leads to incorrect inference. We propose a unified estimation procedure and a computationally fast resampling method to make statistical inference for…
Descriptors: Sampling, Statistical Inference, Computation, Generalization
Peer reviewed
Lu, Hongjing; Chen, Dawn; Holyoak, Keith J. – Psychological Review, 2012
How can humans acquire relational representations that enable analogical inference and other forms of high-level reasoning? Using comparative relations as a model domain, we explore the possibility that bottom-up learning mechanisms applied to objects coded as feature vectors can yield representations of relations sufficient to solve analogy…
Descriptors: Inferences, Thinking Skills, Comparative Analysis, Models
Peer reviewed
Maraun, Michael; Gabriel, Stephanie – Psychological Methods, 2010
In his article, "An Alternative to Null-Hypothesis Significance Tests," Killeen (2005) urged the discipline to abandon the practice of "p[subscript obs]"-based null hypothesis testing and to quantify the signal-to-noise characteristics of experimental outcomes with replication probabilities. He described the coefficient that he…
Descriptors: Hypothesis Testing, Statistical Inference, Probability, Statistical Significance
Peer reviewed
Serlin, Ronald C. – Psychological Methods, 2010
The sense that replicability is an important aspect of empirical science led Killeen (2005a) to define "p[subscript rep]," the probability that a replication will result in an outcome in the same direction as that found in a current experiment. Since then, several authors have praised and criticized "p[subscript rep]," culminating…
Descriptors: Epistemology, Effect Size, Replication (Evaluation), Measurement Techniques
Peer reviewed
Campbell, Donald T. – Evaluation and Program Planning, 1996
Regression artifacts are a source of mistaken causal inference in analyses of time-series data and longitudinal studies. These artifacts are illustrated, and it is noted that their magnitude is computable (and distinguishable from genuine effects) if the autocorrelation patterns for various lags are known. (SLD)
Descriptors: Causal Models, Evaluation Methods, Longitudinal Studies, Regression (Statistics)
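The point that a regression artifact is computable from the autocorrelation can be illustrated with a minimal sketch. It assumes a stationary, standardized, normally distributed series, for which the expected score at lag k for a unit observed at z is rho_k * z, so the expected change with no true effect is (rho_k - 1) * z. The AR(1) autocorrelation pattern, the selection threshold, and all function names are illustrative assumptions and are not taken from the article.

```python
import random


def expected_regression_artifact(z_selected, rho_k):
    """Expected change at lag k with no true effect: (rho_k - 1) * z."""
    return (rho_k - 1.0) * z_selected


def simulate_mean_change(rho, lag, threshold, n=50000, seed=0):
    """Simulate standardized AR(1) units and select those scoring high at time 0.

    Returns the mean time-0 score of the selected units and their mean
    change after `lag` steps. No treatment is applied, so any change is
    a pure regression artifact.
    """
    rng = random.Random(seed)
    innovation_sd = (1.0 - rho ** 2) ** 0.5  # keeps the variance at 1
    selected_scores = []
    changes = []
    for _ in range(n):
        z0 = rng.gauss(0.0, 1.0)
        z = z0
        for _ in range(lag):
            z = rho * z + innovation_sd * rng.gauss(0.0, 1.0)
        if z0 >= threshold:
            selected_scores.append(z0)
            changes.append(z - z0)
    mean_z0 = sum(selected_scores) / len(selected_scores)
    mean_change = sum(changes) / len(changes)
    return mean_z0, mean_change


if __name__ == "__main__":
    rho, lag = 0.7, 2
    mean_z0, observed = simulate_mean_change(rho, lag, threshold=1.0)
    # For AR(1), the lag-k autocorrelation is rho ** k.
    predicted = expected_regression_artifact(mean_z0, rho ** lag)
    print(f"observed mean change:  {observed:.3f}")
    print(f"predicted by rho_k:    {predicted:.3f}")
```

Under these assumptions, an observed change close to the predicted value is consistent with an artifact alone; a genuine effect would show up as a departure from it.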
Yu, Chong-Ho – Online Submission, 2005
Many research-related classes in social sciences present probability as a unified approach based upon mathematical axioms, but neglect the diversity of various probability theories and their associated philosophical assumptions. Although currently the dominant statistical and probabilistic approach is the Fisherian tradition, the use of Fisherian…
Descriptors: Probability, Inferences, Social Sciences, Statistical Significance
Peer reviewed
Savoy, Jacques – Information Processing & Management, 1997
Discussion of evaluation methodology in information retrieval focuses on the average precision over a set of fixed recall values in an effort to evaluate the retrieval effectiveness of a search algorithm. Highlights include a review of traditional evaluation methodology with examples and a statistical inference methodology called bootstrap.…
Descriptors: Algorithms, Evaluation Methods, Information Retrieval, Mathematical Formulas
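As a rough illustration of how a bootstrap can be applied to retrieval evaluation, the sketch below computes a percentile bootstrap interval for the mean per-query difference in average precision between two systems. The abstract does not describe the article's exact procedure, so this is a generic percentile bootstrap offered as an assumption; the function name and the per-query scores are hypothetical.

```python
import random


def bootstrap_ci_mean_diff(scores_a, scores_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean paired difference (a - b).

    scores_a and scores_b hold one average-precision value per query,
    in the same query order.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same queries")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    resampled_means = []
    for _ in range(n_resamples):
        # Resample queries with replacement and recompute the mean difference.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        resampled_means.append(sum(sample) / n)
    resampled_means.sort()
    lower = resampled_means[int((alpha / 2) * n_resamples)]
    upper = resampled_means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return sum(diffs) / n, lower, upper


if __name__ == "__main__":
    # Hypothetical per-query average precision for two search algorithms.
    system_a = [0.42, 0.55, 0.31, 0.60, 0.48, 0.39, 0.52, 0.45]
    system_b = [0.40, 0.50, 0.30, 0.52, 0.41, 0.35, 0.50, 0.44]
    mean_diff, lower, upper = bootstrap_ci_mean_diff(system_a, system_b)
    print(f"mean difference: {mean_diff:.3f}, interval: [{lower:.3f}, {upper:.3f}]")
```

An interval that excludes zero suggests the difference between the two systems is unlikely to be due to the particular queries sampled, subject to the usual caveats about small query sets.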
Peer reviewed
McCaffrey, Daniel F.; Ridgeway, Greg; Morral, Andrew R. – Psychological Methods, 2004
Causal effect modeling with naturalistic rather than experimental data is challenging. In observational studies, participants in different treatment conditions may also differ on pretreatment characteristics that influence outcomes. Propensity score methods can theoretically eliminate these confounds for all observed covariates, but accurate…
Descriptors: Substance Abuse, Causal Models, Adolescents, Statistical Analysis
Blumberg, Carol Joyce – 1989
A subset of Statistical Process Control (SPC) methodology known as Control Charting is introduced. SPC methodology is a collection of graphical and inferential statistics techniques used to study the progress of phenomena over time. The types of control charts covered are the X-bar (mean), R (Range), X (individual observations), MR (moving…
Descriptors: Charts, Data Analysis, Educational Research, Evaluation Methods
Levy, Roy; Mislevy, Robert J. – US Department of Education, 2004
The challenges of modeling students' performance in simulation-based assessments include accounting for multiple aspects of knowledge and skill that arise in different situations and the conditional dependencies among multiple aspects of performance in a complex assessment. This paper describes a Bayesian approach to modeling and estimating…
Descriptors: Probability, Markov Processes, Monte Carlo Methods, Bayesian Statistics
Peer reviewed
Ottenbacher, Kenneth J. – Journal of Special Education, 1990
The agreement between visual analysis and the results of the split-middle method of trend estimation was examined using a set of 24 stimulus graphs and 30 raters. Results revealed poor agreement between the two methods, and low sensitivity, specificity, and predictive ability for visual analysis in relation to statistical inferences. (JDD)
Descriptors: Elementary Secondary Education, Estimation (Mathematics), Evaluation Methods, Graphs