Ozsoy, Seyma Nur; Kilmen, Sevilay – International Journal of Assessment Tools in Education, 2023
In this study, Kernel test equating methods were compared under NEAT and NEC designs. In NEAT design, Kernel post-stratification and chain equating methods taking into account optimal and large bandwidths were compared. In the NEC design, gender and/or computer/tablet use was considered as a covariate, and Kernel test equating methods were…
Descriptors: Equated Scores, Testing, Test Items, Statistical Analysis
Tan, Teck Kiang – Practical Assessment, Research & Evaluation, 2023
Researchers often have hypotheses concerning the state of affairs in the population from which they sampled their data to compare group means. The classical frequentist approach provides one way of carrying out hypothesis testing using ANOVA to state the null hypothesis that there is no difference in the means and proceed with multiple comparisons…
Descriptors: Comparative Analysis, Hypothesis Testing, Statistical Analysis, Guidelines
Practices in Instrument Use and Development in "Chemistry Education Research and Practice" 2010-2021
Lazenby, Katherine; Tenney, Kristin; Marcroft, Tina A.; Komperda, Regis – Chemistry Education Research and Practice, 2023
Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, etc.) of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the…
Descriptors: Chemistry, Periodicals, Journal Articles, Science Education
Mavridis, A.; Tsiatsos, T. – Journal of Computer Assisted Learning, 2017
The aim of this study is to assess the impact of a 3D educational computer game on students' test anxiety and exam performance when used in evaluative situations as compared to the traditional method of examination. The participants of the study were students in tertiary education who were examined using game-based assessment and traditional…
Descriptors: Computer Games, Teaching Methods, Test Anxiety, Statistical Analysis
Deke, John; Finucane, Mariel; Thal, Daniel – National Center for Education Evaluation and Regional Assistance, 2022
BASIE is a framework for interpreting impact estimates from evaluations. It is an alternative to null hypothesis significance testing. This guide walks researchers through the key steps of applying BASIE, including selecting prior evidence, reporting impact estimates, interpreting impact estimates, and conducting sensitivity analyses. The guide…
Descriptors: Bayesian Statistics, Educational Research, Data Interpretation, Hypothesis Testing
Porter, Kristin E. – Society for Research on Educational Effectiveness, 2016
In recent years, there has been increasing focus on the issue of multiple hypotheses testing in education evaluation studies. In these studies, researchers are typically interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time or across multiple treatment groups. When…
Descriptors: Hypothesis Testing, Intervention, Error Patterns, Evaluation Methods
Haberman, Shelby J.; Lee, Yi-Hsuan – ETS Research Report Series, 2017
In investigations of unusual testing behavior, a common question is whether a specific pattern of responses occurs unusually often within a group of examinees. In many current tests, modern communication techniques can permit quite large numbers of examinees to share keys, or common response patterns, to the entire test. To address this issue,…
Descriptors: Student Evaluation, Testing, Item Response Theory, Maximum Likelihood Statistics
Yang, Shitao; Black, Ken – Teaching Statistics: An International Journal for Teachers, 2019
Employing a Wald confidence interval to test hypotheses about population proportions could lead to an increase in Type I or Type II errors unless the hypothesized value, p0, is used in computing its standard error rather than the sample proportion. Whereas the Wald confidence interval to estimate a population proportion uses the sample…
Descriptors: Error Patterns, Evaluation Methods, Error of Measurement, Measurement Techniques
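The distinction drawn in the abstract above can be illustrated with a small sketch. It is not taken from the article: the function name `z_statistic`, its parameters, and the sample figures (13 successes out of 100, p0 = 0.20) are hypothetical and chosen only so that the two standard errors lead to different decisions at the two-sided 5% level.

```python
import math


def z_statistic(successes, n, p0, use_null_se=True):
    """z statistic for H0: p = p0 (hypothetical helper for illustration).

    use_null_se=True  -> standard error computed from the hypothesized p0
    use_null_se=False -> standard error computed from the sample proportion,
                         as in the Wald confidence interval
    """
    p_hat = successes / n
    p_for_se = p0 if use_null_se else p_hat
    se = math.sqrt(p_for_se * (1 - p_for_se) / n)
    return (p_hat - p0) / se


CRITICAL = 1.96  # two-sided critical value at the 5% level

# Illustrative data only: 13 successes in 100 trials, testing p0 = 0.20.
z_null = z_statistic(13, 100, 0.20, use_null_se=True)
z_wald = z_statistic(13, 100, 0.20, use_null_se=False)

print(f"SE from p0:    z = {z_null:.3f}, reject = {abs(z_null) > CRITICAL}")
print(f"SE from p-hat: z = {z_wald:.3f}, reject = {abs(z_wald) > CRITICAL}")
```

With these made-up numbers the statistic based on p0 stays inside the critical region boundary while the one based on the sample proportion crosses it, which is the kind of disagreement the abstract warns about.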
Dowling, Carey Bernini – International Journal for the Scholarship of Teaching and Learning, 2017
This study set out to replicate and extend research on students' reading compliance and examine the impact of daily quizzing methodology on students' reading compliance and retention. 98 students in two sections of Abnormal Psychology participated (mean age = 21.5, SD = 3.35; 72.4% Caucasian). Using a multiple baseline quasi-experimental design…
Descriptors: Undergraduate Students, Psychopathology, Evaluation Methods, Testing
Porter, Kristin E. – Journal of Research on Educational Effectiveness, 2018
Researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple testing procedures (MTPs) are statistical…
Descriptors: Statistical Analysis, Program Effectiveness, Intervention, Hypothesis Testing
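As a rough illustration of what a multiple testing procedure does, the sketch below adjusts a set of p-values with the Bonferroni and Holm step-down corrections. These two procedures are chosen here as common examples; the abstract above does not say which MTPs the article covers, and the function names and p-values are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Bonferroni: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def holm_adjust(p_values):
    """Holm step-down: scale the k-th smallest p-value by (m - k + 1),
    then enforce that adjusted values never decrease along the sorted order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Made-up p-values for four outcomes of a single intervention.
raw = [0.01, 0.04, 0.03, 0.20]
alpha = 0.05

bonf = bonferroni_adjust(raw)
holm = holm_adjust(raw)

print("unadjusted rejections:", sum(p < alpha for p in raw))
print("Bonferroni rejections:", sum(p < alpha for p in bonf))
print("Holm rejections:      ", sum(p < alpha for p in holm))
```

In this example three of the four unadjusted tests fall below 0.05, but only one survives either correction, which shows how unadjusted testing across several outcomes can produce spurious findings.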
Kiley, Margaret; Holbrook, Allyson; Lovat, Terence; Fairbairn, Hedy; Starfield, Sue; Paltridge, Brian – Australian Universities' Review, 2018
While there has been considerable research on doctoral examination there is little that examines the various roles of the oral component and what issues one might consider if introducing or revising that aspect of the thesis examination process. This matter is of particular importance in Australia where it is not usual to have an oral component as…
Descriptors: Foreign Countries, Doctoral Dissertations, Evaluation Methods, Verbal Tests
Golovachyova, Viktoriya N.; Menlibekova, Gulbakhyt Zh.; Abayeva, Nella F.; Ten, Tatyana L.; Kogaya, Galina D. – International Journal of Environmental and Science Education, 2016
Using computer-based monitoring systems that rely on tests could be the most effective way of knowledge evaluation. The problem of objective knowledge assessment by means of testing takes on a new dimension in the context of new paradigms in education. The analysis of the existing test methods enabled us to conclude that tests with selected…
Descriptors: Expertise, Computer Assisted Testing, Student Evaluation, Knowledge Level
Luce, Christine; Kirnan, Jean P. – Journal of the Scholarship of Teaching and Learning, 2016
Contradictory results have been reported regarding the accuracy of various methods used to assess student learning in higher education. The current study examined student learning outcomes across a multi-section and multi-instructor psychology research course with both indirect and direct assessments in a sample of 67 undergraduate students. The…
Descriptors: Undergraduate Students, Psychology, Methods Courses, Student Evaluation
Hicks, Tyler; Rodríguez-Campos, Liliana; Choi, Jeong Hoon – American Journal of Evaluation, 2018
To begin statistical analysis, Bayesians quantify their confidence in modeling hypotheses with priors. A prior describes the probability of a certain modeling hypothesis apart from the data. Bayesians should be able to defend their choice of prior to a skeptical audience. Collaboration between evaluators and stakeholders could make their choices…
Descriptors: Bayesian Statistics, Evaluation Methods, Statistical Analysis, Hypothesis Testing
Rajagopal, Prabha; Ravana, Sri Devi – Information Research: An International Electronic Journal, 2017
Introduction: The use of averaged topic-level scores can result in the loss of valuable data and can cause misinterpretation of the effectiveness of system performance. This study aims to use the scores of each document to evaluate document retrieval systems in a pairwise system evaluation. Method: The chosen evaluation metrics are document-level…
Descriptors: Information Retrieval, Documentation, Scores, Information Systems