Showing all 13 results
Kelvin Terrell Pompey – ProQuest LLC, 2021
Many methods are used to measure interrater reliability for studies where each target receives ratings by a different set of judges. The purpose of this study is to explore the use of hierarchical modeling for estimating interrater reliability using the intraclass correlation coefficient. This study provides a description of how the ICC can be…
Descriptors: Interrater Reliability, Evaluation Methods, Test Reliability, Correlation
Xinran Li; Peng Ding – Grantee Submission, 2018
Frequentists' inference often delivers point estimators associated with confidence intervals or sets for parameters of interest. Constructing the confidence intervals or sets requires understanding the sampling distributions of the point estimators, which, in many but not all cases, are related to asymptotic Normal distributions ensured by central…
Descriptors: Correlation, Intervals, Sampling, Evaluation Methods
Peng Ding; Avi Feller; Luke Miratrix – Grantee Submission, 2016
Applied researchers are increasingly interested in whether and how treatment effects vary in randomized evaluations, especially variation not explained by observed covariates. We propose a model-free approach for testing for the presence of such unexplained variation. To use this randomization-based approach, we must address the fact that the…
Descriptors: Randomized Controlled Trials, Statistical Inference, Evaluation Methods, Testing
Gongjun Xu; Tony Sit; Lan Wang; Chiung-Yu Huang – Grantee Submission, 2017
Biased sampling occurs frequently in economics, epidemiology, and medical studies, either by design or because of the data-collection mechanism. Failing to account for the sampling bias usually leads to incorrect inference. We propose a unified estimation procedure and a computationally fast resampling method to make statistical inference for…
Descriptors: Sampling, Statistical Inference, Computation, Generalization
Kim, YoungKoung; DeCarlo, Lawrence T. – College Board, 2016
Because of concerns about test security, different test forms are typically used across different testing occasions. As a result, equating is necessary in order to get scores from the different test forms that can be used interchangeably. In order to assure the quality of equating, multiple equating methods are often examined. Various equity…
Descriptors: Equated Scores, Evaluation Methods, Sampling, Statistical Inference
Peer reviewed
Padilla, Miguel A.; Divers, Jasmin; Newton, Matthew – Applied Psychological Measurement, 2012
Three different bootstrap methods for estimating confidence intervals (CIs) for coefficient alpha were investigated. In addition, the bootstrap methods were compared with the most promising coefficient alpha CI estimation methods reported in the literature. The CI methods were assessed through a Monte Carlo simulation utilizing conditions…
Descriptors: Intervals, Monte Carlo Methods, Computation, Sampling
Peer reviewed
Gu, Fei; Skorupski, William P.; Hoyle, Larry; Kingston, Neal M. – Applied Psychological Measurement, 2011
Ramsay-curve item response theory (RC-IRT) is a nonparametric procedure that estimates the latent trait using splines, and no distributional assumption about the latent trait is required. For item parameters of the two-parameter logistic (2-PL), three-parameter logistic (3-PL), and polytomous IRT models, RC-IRT can provide more accurate estimates…
Descriptors: Intervals, Item Response Theory, Models, Evaluation Methods
Peer reviewed
Lu, Hongjing; Chen, Dawn; Holyoak, Keith J. – Psychological Review, 2012
How can humans acquire relational representations that enable analogical inference and other forms of high-level reasoning? Using comparative relations as a model domain, we explore the possibility that bottom-up learning mechanisms applied to objects coded as feature vectors can yield representations of relations sufficient to solve analogy…
Descriptors: Inferences, Thinking Skills, Comparative Analysis, Models
Peer reviewed
Zientek, Linda Reichwein; Ozel, Z. Ebrar Yetkiner; Ozel, Serkan; Allen, Jeff – Career and Technical Education Research, 2012
Confidence intervals (CIs) and effect sizes are essential to encourage meta-analytic thinking and to accumulate research findings. CIs provide a range of plausible values for population parameters with a degree of confidence that the parameter is in that particular interval. CIs also give information about how precise the estimates are. Comparison…
Descriptors: Vocational Education, Effect Size, Intervals, Self Esteem
Peer reviewed
Suen, Hoi K. – Topics in Early Childhood Special Education, 1992
This commentary on EC 603 695 argues that significance testing is a necessary but insufficient condition for positivistic research, that judgment-based assessment and single-subject research are not substitutes for significance testing, and that sampling fluctuation should be considered as one of numerous epistemological concerns in any…
Descriptors: Evaluation Methods, Evaluative Thinking, Research Design, Research Methodology
Peer reviewed
Da Prato, Robert A. – Topics in Early Childhood Special Education, 1992
This paper argues that judgment-based assessment of data from multiply replicated single-subject or small-N studies should replace normative-based (p < 0.05) assessment of large-N research in the clinical sciences, and asserts that inferential statistics should be abandoned as a method of evaluating clinical research data. (Author/JDD)
Descriptors: Evaluation Methods, Evaluative Thinking, Norms, Research Design
Peer reviewed
Reading, Chris – Statistics Education Research Journal, 2004
Variation is a key concept in the study of statistics and its understanding is a crucial aspect of most statistically related tasks. This study aimed to extend and apply a hierarchy for describing students' understanding of variation that was developed in a sampling context to the context of a natural event in which variation occurs. Students aged…
Descriptors: Weather, Classification, Secondary School Students, Student Evaluation
Lefebvre, Daniel J.; Suen, Hoi K. – 1990
An empirical investigation of methodological issues associated with evaluating treatment effect in single-subject research (SSR) designs is presented. This investigation: (1) conducted a generalizability (G) study to identify the sources of systematic and random measurement error (SRME); (2) used an analytic approach based on G theory to integrate…
Descriptors: Classroom Observation Techniques, Disabilities, Educational Research, Error of Measurement