V. N. Vimal Rao; Jeffrey K. Bye; Sashank Varma – Cognitive Research: Principles and Implications, 2024
The 0.05 boundary within Null Hypothesis Statistical Testing (NHST) "has made a lot of people very angry and been widely regarded as a bad move" (to quote Douglas Adams). Here, we move past meta-scientific arguments and ask an empirical question: What is the psychological standing of the 0.05 boundary for statistical significance? We…
Descriptors: Psychological Patterns, Statistical Analysis, Testing, Statistical Significance
Ozsoy, Seyma Nur; Kilmen, Sevilay – International Journal of Assessment Tools in Education, 2023
In this study, Kernel test equating methods were compared under NEAT and NEC designs. In NEAT design, Kernel post-stratification and chain equating methods taking into account optimal and large bandwidths were compared. In the NEC design, gender and/or computer/tablet use was considered as a covariate, and Kernel test equating methods were…
Descriptors: Equated Scores, Testing, Test Items, Statistical Analysis
The Use of Theory of Linear Mixed-Effects Models to Detect Fraudulent Erasures at an Aggregate Level
Peng, Luyao; Sinharay, Sandip – Educational and Psychological Measurement, 2022
Wollack et al. (2015) suggested the erasure detection index (EDI) for detecting fraudulent erasures for individual examinees. Wollack and Eckerly (2017) and Sinharay (2018) extended the index of Wollack et al. (2015) to suggest three EDIs for detecting fraudulent erasures at the aggregate or group level. This article follows up on the research of…
Descriptors: Cheating, Identification, Statistical Analysis, Testing
Rebeckah K. Fussell; Emily M. Stump; N. G. Holmes – Physical Review Physics Education Research, 2024
Physics education researchers are interested in using the tools of machine learning and natural language processing to make quantitative claims from natural language and text data, such as open-ended responses to survey questions. The aspiration is that this form of machine coding may be more efficient and consistent than human coding, allowing…
Descriptors: Physics, Educational Researchers, Artificial Intelligence, Natural Language Processing
Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024
The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, poststratification kernel equating, and circle-arc equating were studied. A college admissions test with four different…
Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests
Puhan, Gautam; Kim, Sooyeon – Journal of Educational Measurement, 2022
As a result of the COVID-19 pandemic, at-home testing has become a popular delivery mode in many testing programs. When programs offer at-home testing to expand their service, the score comparability between test takers testing remotely and those testing in a test center is critical. This article summarizes statistical procedures that could be…
Descriptors: Scores, Scoring, Comparative Analysis, Testing
Lozano, José H.; Revuelta, Javier – Applied Measurement in Education, 2021
The present study proposes a Bayesian approach for estimating and testing the operation-specific learning model, a variant of the linear logistic test model that allows for the measurement of the learning that occurs during a test as a result of the repeated use of the operations involved in the items. The advantages of using a Bayesian framework…
Descriptors: Bayesian Statistics, Computation, Learning, Testing
Evers, Arne; McCormick, Carina M.; Hawley, Leslie R.; Muñiz, José; Balboni, Giulia; Bartram, Dave; Boben, Dusica; Egeland, Jens; El-Hassan, Karma; Fernández-Hermida, José R.; Fine, Saul; Frans, Örjan; Gintiliené, Grazina; Hagemeister, Carmen; Halama, Peter; Iliescu, Dragos; Jaworowska, Aleksandra; Jiménez, Paul; Manthouli, Marina; Matesic, Krunoslav; Michaelsen, Lars; Mogaji, Andrew; Morley-Kirk, James; Rózsa, Sándor; Rowlands, Lorraine; Schittekatte, Mark; Sümer, H. Canan; Suwartono, Tono; Urbánek, Tomáš; Wechsler, Solange; Zelenevska, Tamara; Zanev, Svetoslav; Zhang, Jianxin – International Journal of Testing, 2017
On behalf of the International Test Commission and the European Federation of Psychologists' Associations, a worldwide survey on the opinions of professional psychologists on testing practices was carried out. The main objective of this study was to collect data for a better understanding of the state of psychological testing worldwide. These data…
Descriptors: Testing, Attitudes, Surveys, Psychologists
Practices in Instrument Use and Development in "Chemistry Education Research and Practice" 2010-2021
Lazenby, Katherine; Tenney, Kristin; Marcroft, Tina A.; Komperda, Regis – Chemistry Education Research and Practice, 2023
Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, etc.) of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the…
Descriptors: Chemistry, Periodicals, Journal Articles, Science Education
Mavridis, A.; Tsiatsos, T. – Journal of Computer Assisted Learning, 2017
The aim of this study is to assess the impact of a 3D educational computer game on students' test anxiety and exam performance when used in evaluative situations as compared to the traditional method of examination. The participants of the study were students in tertiary education who were examined using game-based assessment and traditional…
Descriptors: Computer Games, Teaching Methods, Test Anxiety, Statistical Analysis
Keller, Bryan – Journal of Educational and Behavioral Statistics, 2020
Widespread availability of rich educational databases facilitates the use of conditioning strategies to estimate causal effects with nonexperimental data. With dozens, hundreds, or more potential predictors, variable selection can be useful for practical reasons related to communicating results and for statistical reasons related to improving the…
Descriptors: Nonparametric Statistics, Computation, Testing, Causal Models
Luke G. Eglington; Philip I. Pavlik – Grantee Submission, 2020
Decades of research have shown that spacing practice trials over time can improve later memory, but there are few concrete recommendations concerning how to optimally space practice. We show that existing recommendations are inherently suboptimal due to their insensitivity to time costs and individual- and item-level differences. We introduce an…
Descriptors: Scheduling, Drills (Practice), Memory, Testing
Luke G. Eglington; Philip I. Pavlik Jr. – npj Science of Learning, 2020
Decades of research have shown that spacing practice trials over time can improve later memory, but there are few concrete recommendations concerning how to optimally space practice. We show that existing recommendations are inherently suboptimal due to their insensitivity to time costs and individual- and item-level differences. We introduce an…
Descriptors: Scheduling, Drills (Practice), Memory, Testing
Porter, Kristin E. – Society for Research on Educational Effectiveness, 2016
In recent years, there has been increasing focus on the issue of multiple hypotheses testing in education evaluation studies. In these studies, researchers are typically interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time or across multiple treatment groups. When…
Descriptors: Hypothesis Testing, Intervention, Error Patterns, Evaluation Methods
Haberman, Shelby J.; Lee, Yi-Hsuan – ETS Research Report Series, 2017
In investigations of unusual testing behavior, a common question is whether a specific pattern of responses occurs unusually often within a group of examinees. In many current tests, modern communication techniques can permit quite large numbers of examinees to share keys, or common response patterns, to the entire test. To address this issue,…
Descriptors: Student Evaluation, Testing, Item Response Theory, Maximum Likelihood Statistics