Showing 1 to 15 of 92 results
Peer reviewed
Jean-Paul Fox – Journal of Educational and Behavioral Statistics, 2025
Popular item response theory (IRT) models are considered complex, mainly due to the inclusion of a random factor variable (latent variable). The random factor variable represents the incidental parameter problem since the number of parameters increases when including data of new persons. Therefore, IRT models require a specific estimation method…
Descriptors: Sample Size, Item Response Theory, Accuracy, Bayesian Statistics
Peer reviewed
Wendy Chan – Asia Pacific Education Review, 2024
As evidence from evaluation and experimental studies continues to influence decision and policymaking, applied researchers and practitioners require tools to derive valid and credible inferences. Over the past several decades, research in causal inference has progressed with the development and application of propensity scores. Since their…
Descriptors: Probability, Scores, Causal Models, Statistical Inference
Peer reviewed
Daniel McNeish; Patrick D. Manapat – Structural Equation Modeling: A Multidisciplinary Journal, 2024
A recent review found that 11% of published factor models are hierarchical models with second-order factors. However, dedicated recommendations for evaluating hierarchical model fit have yet to emerge. Traditional benchmarks like RMSEA < 0.06 or CFI > 0.95 are often consulted, but they were never intended to generalize to hierarchical models.…
Descriptors: Factor Analysis, Goodness of Fit, Hierarchical Linear Modeling, Benchmarking
Peer reviewed
Tan, Teck Kiang – Practical Assessment, Research & Evaluation, 2023
Researchers often have hypotheses concerning the state of affairs in the population from which they sampled their data to compare group means. The classical frequentist approach provides one way of carrying out hypothesis testing using ANOVA to state the null hypothesis that there is no difference in the means and proceed with multiple comparisons…
Descriptors: Comparative Analysis, Hypothesis Testing, Statistical Analysis, Guidelines
Peer reviewed
Van Lissa, Caspar J.; van Erp, Sara; Clapper, Eli-Boaz – Research Synthesis Methods, 2023
When meta-analyzing heterogeneous bodies of literature, meta-regression can be used to account for potentially relevant between-studies differences. A key challenge is that the number of candidate moderators is often high relative to the number of studies. This introduces risks of overfitting, spurious results, and model non-convergence. To…
Descriptors: Bayesian Statistics, Regression (Statistics), Maximum Likelihood Statistics, Meta Analysis
Peer reviewed
Wendy Chan; Jimin Oh; Katherine Wilson – Society for Research on Educational Effectiveness, 2022
Background: Over the past decade, research on the development and assessment of tools to improve the generalizability of experimental findings has grown extensively (Tipton & Olsen, 2018). However, many experimental studies in education are based on small samples, which may include 30-70 schools while inference populations to which…
Descriptors: Educational Research, Research Problems, Sample Size, Research Methodology
Peer reviewed
Jaciw, Andrew P.; Unlu, Fatih; Nguyen, Thanh – American Journal of Evaluation, 2022
There is a burgeoning body of evidence on the average impacts of educational programs. Yet, for many local decision makers, because impacts can vary across sites, the question of whether a certain program will work in their particular district or school remains. This article addresses the question of the generalizability of large-scale average…
Descriptors: Program Effectiveness, Generalization, Outcome Measures, Institutional Characteristics
Peer reviewed
Garman, Andrew N.; Erwin, Taylor S.; Garman, Tyler R.; Kim, Dae Hyun – Journal of Competency-Based Education, 2021
Background: Competency models provide useful frameworks for organizing learning and assessment programs, but their construction is both time intensive and subject to perceptual biases. Some aspects of model development may be particularly well-suited to automation, specifically natural language processing (NLP), which could also help make them…
Descriptors: Natural Language Processing, Automation, Guidelines, Leadership Effectiveness
Peer reviewed
Kevin Hirschi; Okim Kang – Language Teaching Research Quarterly, 2023
This paper extends the use of Generalizability Theory to the measurement of extemporaneous L2 speech through the lens of speech perception. Using six datasets of previous studies, it reports on "G studies"--a method of breaking down measurement variance--and "D studies"--a predictive study of the impact on reliability when…
Descriptors: Evaluators, Generalization, Evaluation Methods, Speech Communication
Peer reviewed
Relaford-Doyle, Josephine; Núñez, Rafael – International Journal of Research in Undergraduate Mathematics Education, 2021
This paper describes a study that used a novel method to investigate conceptual difficulties with mathematical induction among two groups of undergraduate students: students who had received university-level instruction in formal mathematical induction, and students who had not been exposed to formal mathematical induction at the university level.…
Descriptors: Concept Formation, Mathematical Concepts, Difficulty Level, Undergraduate Students
Peer reviewed
Rosenberg, Joshua M.; Krist, Christina – Journal of Science Education and Technology, 2021
Assessing students' participation in science practices presents several challenges, especially when aiming to differentiate meaningful (vs. rote) forms of participation. In this study, we sought to use machine learning (ML) for a novel purpose in science assessment: developing a construct map for students' "consideration of generality,"…
Descriptors: Artificial Intelligence, Educational Technology, Technology Uses in Education, Models
Peer reviewed
McClure, Erica B.; Burt, Jonathan L. – Beyond Behavior, 2023
Functional communication training (FCT) is a strategy to address problem behavior for students with various disabilities that is supported by a broad evidence base. Despite this support, multiple factors continue to dissuade educators from utilizing FCT in their classrooms. This article outlines the process of developing and implementing FCT plans…
Descriptors: Behavior Problems, Students with Disabilities, Intervention, Evidence Based Practice
Peer reviewed
Russell, Michael; Szendey, Olivia; Li, Zhushan – Educational Assessment, 2022
Recent research provides evidence that an intersectional approach to defining reference and focal groups results in a higher percentage of comparisons flagged for potential DIF. The study presented here examined the generalizability of this pattern across methods for examining DIF. While the level of DIF detection differed among the four methods…
Descriptors: Comparative Analysis, Item Analysis, Test Items, Test Construction
Peer reviewed
Yesiltas, Gonca; Paek, Insu – Educational and Psychological Measurement, 2020
A log-linear model (LLM) is a well-known statistical method to examine the relationship among categorical variables. This study investigated the performance of LLM in detecting differential item functioning (DIF) for polytomously scored items via simulations where various sample sizes, ability mean differences (impact), and DIF types were…
Descriptors: Simulation, Sample Size, Item Analysis, Scores
Peer reviewed
Khamboonruang, Apichat – rEFLections, 2022
Although much research has compared the functioning between analytic and holistic rating scales, little research has compared the functioning of binary rating scales with other types of rating scales. This quantitative study set out to preliminarily and comparatively validate binary and analytic rating scales intended for use in formative…
Descriptors: Writing Evaluation, Evaluation Methods, Second Language Learning, Second Language Instruction