NotesFAQContact Us
Collection
Advanced
Search Tips
Audience
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing all 15 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Wendy Chan; Jimin Oh; Chen Li; Jiexuan Huang; Yeran Tong – Society for Research on Educational Effectiveness, 2023
Background: The generalizability of a study's results continues to be at the forefront of concerns in evaluation research in education (Tipton & Olsen, 2018). Over the past decade, statisticians have developed methods, mainly based on propensity scores, to improve generalizations in the absence of random sampling (Stuart et al., 2011; Tipton,…
Descriptors: Generalizability Theory, Probability, Scores, Sampling
Peer reviewed Peer reviewed
Direct linkDirect link
van der Linden, Wim J. – Journal of Educational and Behavioral Statistics, 2022
The current literature on test equating generally defines it as the process necessary to obtain score comparability between different test forms. The definition is in contrast with Lord's foundational paper which viewed equating as the process required to obtain comparability of measurement scale between forms. The distinction between the notions…
Descriptors: Equated Scores, Test Items, Scores, Probability
Peer reviewed Peer reviewed
Direct linkDirect link
Chan, Wendy – American Journal of Evaluation, 2022
Over the past ten years, propensity score methods have made an important contribution to improving generalizations from studies that do not select samples randomly from a population of inference. However, these methods require assumptions and recent work has considered the role of bounding approaches that provide a range of treatment impact…
Descriptors: Probability, Scores, Scoring, Generalization
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Donoghue, John R.; McClellan, Catherine A.; Hess, Melinda R. – ETS Research Report Series, 2022
When constructed-response items are administered for a second time, it is necessary to evaluate whether the current Time B administration's raters have drifted from the scoring of the original administration at Time A. To study this, Time A papers are sampled and rescored by Time B scorers. Commonly the scores are compared using the proportion of…
Descriptors: Item Response Theory, Test Construction, Scoring, Testing
Kelli A. Bird; Benjamin L. Castleman; Zachary Mabel; Yifeng Song – Annenberg Institute for School Reform at Brown University, 2021
Colleges have increasingly turned to predictive analytics to target at-risk students for additional support. Most of the predictive analytic applications in higher education are proprietary, with private companies offering little transparency about their underlying models. We address this lack of transparency by systematically comparing two…
Descriptors: At Risk Students, Higher Education, Predictive Measurement, Models
Peer reviewed Peer reviewed
Direct linkDirect link
Chan, Wendy – Journal of Educational and Behavioral Statistics, 2018
Policymakers have grown increasingly interested in how experimental results may generalize to a larger population. However, recently developed propensity score-based methods are limited by small sample sizes, where the experimental study is generalized to a population that is at least 20 times larger. This is particularly problematic for methods…
Descriptors: Computation, Generalization, Probability, Sample Size
Peer reviewed Peer reviewed
Direct linkDirect link
Chan, Wendy – Journal of Research on Educational Effectiveness, 2017
Recent methods to improve generalizations from nonrandom samples typically invoke assumptions such as the strong ignorability of sample selection, which is challenging to meet in practice. Although researchers acknowledge the difficulty in meeting this assumption, point estimates are still provided and used without considering alternative…
Descriptors: Generalization, Inferences, Probability, Educational Research
Imbens, Guido W.; Rubin, Donald B. – Cambridge University Press, 2015
Most questions in social and biomedical sciences are causal in nature: what would happen to individuals, or to groups, if part of their environment were changed? In this groundbreaking text, two world-renowned experts present statistical methods for studying such questions. This book starts with the notion of potential outcomes, each corresponding…
Descriptors: Causal Models, Statistical Inference, Statistics, Social Sciences
Itang'ata, Mukaria J. J. – ProQuest LLC, 2013
Often researchers face situations where comparative studies between two or more programs are necessary to make causal inferences for informed policy decision-making. Experimental designs employing randomization provide the strongest evidence for causal inferences. However, many pragmatic and ethical challenges may preclude the use of randomized…
Descriptors: Comparative Analysis, Probability, Statistical Bias, Monte Carlo Methods
Peer reviewed Peer reviewed
Direct linkDirect link
Gugiu, Mihaiela R.; Gugiu, Paul C.; Baldus, Robert – Journal of MultiDisciplinary Evaluation, 2012
Background: Educational researchers have long espoused the virtues of writing with regard to student cognitive skills. However, research on the reliability of the grades assigned to written papers reveals a high degree of contradiction, with some researchers concluding that the grades assigned are very reliable whereas others suggesting that they…
Descriptors: Grades (Scholastic), Grading, Scoring Rubrics, Research Design
Conley, Quincy – ProQuest LLC, 2013
Statistics is taught at every level of education, yet teachers often have to assume their students have no knowledge of statistics and start from scratch each time they set out to teach statistics. The motivation for this experimental study comes from interest in exploring educational applications of augmented reality (AR) delivered via mobile…
Descriptors: Statistics, Mathematics Instruction, Simulated Environment, Computer Simulation
Peer reviewed Peer reviewed
Direct linkDirect link
Reutzel, D. Ray; Petscher, Yaacov; Spichtig, Alexandra N. – Journal of Educational Research, 2012
The authors' purpose was to explore the effects of a supplementary, guided, silent reading intervention with 80 struggling third-grade readers who were retained at grade level as a result of poor performance on the reading portion of a criterion referenced state assessment. The students were distributed in 11 elementary schools in a large, urban…
Descriptors: Academic Achievement, Achievement Tests, Control Groups, Reading Fluency
Peer reviewed Peer reviewed
Elster, Richard S.; Dunnette, Marvin D. – Educational and Psychological Measurement, 1971
Descriptors: Hypothesis Testing, Measurement Techniques, Probability, Sampling
Wilde, Elizabeth Ty; Hollister, Robinson – Institute for Research on Poverty, 2002
In this study we test the performance of some nonexperimental estimators of impacts applied to an educational intervention--reduction in class size--where achievement test scores were the outcome. We compare the nonexperimental estimates of the impacts to "true impact" estimates provided by a random-assignment design used to assess the…
Descriptors: Computation, Outcome Measures, Achievement Tests, Scores
Lord, Frederic M. – 1971
Some stochastic approximation procedures are considered in relation to the problem of choosing a sequence of test questions to accurately estimate a given examinee's standing on a psychological dimension. Illustrations are given evaluating certain procedures in a specific context. (Author/CK)
Descriptors: Academic Ability, Adaptive Testing, Computer Programs, Difficulty Level