Showing 31 to 45 of 3,988 results
Peer reviewed
Cappaert, Kevin J.; Wen, Yao; Chang, Yu-Feng – Measurement: Interdisciplinary Research and Perspectives, 2018
Events such as curriculum changes or practice effects can lead to item parameter drift (IPD) in computer adaptive testing (CAT). The current investigation introduced a point- and weight-adjusted D² method for IPD detection for use in a CAT environment when items are suspected of drifting across test administrations. Type I error and…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Identification
Peer reviewed
Full text available as PDF on ERIC
Frank Wang – Numeracy, 2021
In late November 2020, there was a flurry of media coverage of two companies' claims of 95% efficacy rates of newly developed COVID-19 vaccines, but information about the confidence interval was not reported. This paper presents a way of teaching the concept of hypothesis testing and the construction of confidence intervals using numbers announced…
Descriptors: COVID-19, Pandemics, Immunization Programs, Hypothesis Testing
Peer reviewed
Puhan, Gautam; Kim, Sooyeon – Journal of Educational Measurement, 2022
As a result of the COVID-19 pandemic, at-home testing has become a popular delivery mode in many testing programs. When programs offer at-home testing to expand their service, the score comparability between test takers testing remotely and those testing in a test center is critical. This article summarizes statistical procedures that could be…
Descriptors: Scores, Scoring, Comparative Analysis, Testing
Peer reviewed
Nikolakopoulos, Stavros – Research Synthesis Methods, 2020
In narrative synthesis of evidence, it can be the case that the only quantitative measure available concerning the efficacy of an intervention is the direction of the effect, that is, whether it is positive or negative. In such situations, the sign test has been proposed in the literature and in recent Cochrane guidelines as a way to test whether…
Descriptors: Synthesis, Evidence, Statistical Analysis, Nonparametric Statistics
Peer reviewed
Full text available as PDF on ERIC
Metsämuuronen, Jari – International Journal of Educational Methodology, 2020
Pearson product-moment correlation coefficient between item g and test score X, known as item-test or item-total correlation ("Rit"), and item-rest correlation ("Rir") are two of the most used classical estimators for item discrimination power (IDP). Both "Rit" and "Rir" underestimate IDP caused by the…
Descriptors: Correlation, Test Items, Scores, Difficulty Level
Peer reviewed
Brauer, Jonathan R.; Day, Jacob C.; Hammond, Brittany M. – Sociological Methods & Research, 2021
This article presents two alternative methods to null hypothesis significance testing (NHST) for improving inferences from underpowered research designs. Post hoc design analysis (PHDA) assesses whether an NHST analysis generating null findings might otherwise have had sufficient power to detect effects of plausible magnitudes. Bayesian analysis…
Descriptors: Hypothesis Testing, Statistical Analysis, Bayesian Statistics, Statistical Significance
Peer reviewed
Wang, Xi; Liu, Yang; Robin, Frederic; Guo, Hongwen – International Journal of Testing, 2019
In an on-demand testing program, some items are repeatedly used across test administrations. This poses a risk to test security. In this study, we considered a scenario wherein a test was divided into two subsets: one consisting of secure items and the other consisting of possibly compromised items. In a simulation study of multistage adaptive…
Descriptors: Identification, Methods, Test Items, Cheating
Peer reviewed
Trafimow, David – International Journal of Social Research Methodology, 2019
Although the null hypothesis significance testing procedure is problematic, many still favor the use of "p"-values as indicating the state of evidence against the model used to generate the "p"-value. From this perspective, "p"-values benefit science; or would benefit science if used correctly. In contrast, the novel…
Descriptors: Hypothesis Testing, Models, Taxonomy, Probability
Peer reviewed
Lozano, José H.; Revuelta, Javier – Applied Measurement in Education, 2021
The present study proposes a Bayesian approach for estimating and testing the operation-specific learning model, a variant of the linear logistic test model that allows for the measurement of the learning that occurs during a test as a result of the repeated use of the operations involved in the items. The advantages of using a Bayesian framework…
Descriptors: Bayesian Statistics, Computation, Learning, Testing
Peer reviewed
Brydges, Christopher R.; Gaeta, Laura – Journal of Speech, Language, and Hearing Research, 2019
Purpose: Null hypothesis significance testing is commonly used in audiology research to determine the presence of an effect. Knowledge of study outcomes, including nonsignificant findings, is important for evidence-based practice. Nonsignificant "p" values obtained from null hypothesis significance testing cannot differentiate between…
Descriptors: Bayesian Statistics, Audiology, Hypothesis Testing, Statistical Significance
Sinharay, Sandip; Johnson, Matthew S. – Grantee Submission, 2019
According to Wollack and Schoenig (2018), score differencing is one of six types of statistical methods used to detect test fraud. In this paper, we suggested the use of Bayes factors (e.g., Kass & Raftery, 1995) for score differencing. A simulation study shows that the suggested approach performs slightly better than an existing frequentist…
Descriptors: Cheating, Deception, Statistical Analysis, Bayesian Statistics
Peer reviewed
Lazenby, Katherine; Tenney, Kristin; Marcroft, Tina A.; Komperda, Regis – Chemistry Education Research and Practice, 2023
Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, "etc.") of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the…
Descriptors: Chemistry, Periodicals, Journal Articles, Science Education
Peer reviewed
Full text available as PDF on ERIC
Jane E. Miller – Numeracy, 2023
Students often believe that statistical significance is the only determinant of whether a quantitative result is "important." In this paper, I review traditional null hypothesis statistical testing to identify what questions inferential statistics can and cannot answer, including statistical significance, effect size and direction,…
Descriptors: Statistical Significance, Holistic Approach, Statistical Inference, Effect Size
Peer reviewed
Kristin Porter; Luke Miratrix; Kristen Hunter – Society for Research on Educational Effectiveness, 2021
Background: Researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple testing procedures (MTPs)…
Descriptors: Statistical Analysis, Hypothesis Testing, Computer Software, Randomized Controlled Trials
Peer reviewed
Adam Sales – Society for Research on Educational Effectiveness, 2021
Education researchers frequently have to choose between statistical models for their data, and in many cases the candidate models or parameters can be listed in a sequence, m=1,...,M, from less preferable choices to more. For instance, in choosing a bandwidth for regression discontinuity designs, researchers would favor the largest possible…
Descriptors: Educational Research, Statistical Analysis, Research Design, Decision Making