Publication Date
| In 2026 | 0 |
| Since 2025 | 15 |
| Since 2022 (last 5 years) | 53 |
| Since 2017 (last 10 years) | 141 |
| Since 2007 (last 20 years) | 216 |
Descriptor
| Science Tests | 265 |
| Test Validity | 265 |
| Test Reliability | 146 |
| Foreign Countries | 109 |
| Test Construction | 107 |
| Test Items | 96 |
| Science Instruction | 66 |
| Scientific Concepts | 65 |
| Multiple Choice Tests | 56 |
| Scores | 46 |
| Physics | 41 |
Author
| Bao, Lei | 4 |
| Conoyer, Sarah J. | 4 |
| Koenig, Kathleen | 4 |
| Ford, Jeremy W. | 3 |
| Han, Jing | 3 |
| Hosp, John L. | 3 |
| Sachin Nedungadi | 3 |
| Xiao, Yang | 3 |
| Balta, Nuri | 2 |
| Barniol, Pablo | 2 |
| Berberoglu, Giray | 2 |
Audience
| Researchers | 10 |
| Practitioners | 8 |
| Teachers | 4 |
| Policymakers | 1 |
Location
| Turkey | 20 |
| Indonesia | 17 |
| Germany | 9 |
| United States | 6 |
| Nebraska | 5 |
| Canada | 4 |
| Japan | 4 |
| Singapore | 4 |
| Switzerland | 4 |
| Australia | 3 |
| China | 3 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 3 |
| Elementary and Secondary… | 1 |
Meryem Konu Kadirhanogullari; Esra Özay Köse – Science Insights Education Frontiers, 2025
This study aims to develop a valid and reliable achievement test in accordance with the content framework of the 9th-grade Biology Course Curriculum published within the scope of the Turkish Century Maarif Model on the subject of "Organic Matter". The screening method was used for this purpose. The sample of the study consists of 258…
Descriptors: Science Tests, Test Construction, Grade 9, Biology
Ntumi, Simon; Agbenyo, Sheilla; Bulala, Tapela – Shanlax International Journal of Education, 2023
There is no need or point to testing of knowledge, attributes, traits, behaviours or abilities of an individual if information obtained from the test is inaccurate. However, by and large, it seems the estimation of psychometric properties of test items in classrooms has been completely ignored otherwise dying slowly in most testing environments. In…
Descriptors: Psychometrics, Accuracy, Test Validity, Factor Analysis
Putica, Katarina B. – Research in Science Education, 2023
Previous studies noted the scantiness of diagnostic instruments for the assessment of students' understanding of fundamental biochemistry concepts. Consequently, within this study, a four-tier test for the examination of secondary school students' conceptual understanding of amino acids, proteins, and enzymes has been developed. Items in the test…
Descriptors: Test Construction, Test Validity, Secondary School Students, Science Tests
Raudlah Melinda Sidik; Ana Ratna Wulan; K. Kusnadi – Journal of Biological Education Indonesia (Jurnal Pendidikan Biologi Indonesia), 2025
The research developed and validated EKSAI (Epistemic Knowledge Science Assessment Instrument), an assessment tool for epistemic knowledge in science education. The background is that 21st-century challenges demand a transformation in science education, with a focus on understanding how scientific knowledge is developed and evaluated, which is…
Descriptors: Science Tests, Knowledge Level, Biology, Test Validity
Yuriko K. Sosa Paredes; Björn Andersson – Educational Assessment, Evaluation and Accountability, 2025
In international large-scale assessments, student performance comparisons across educational systems are frequently done to assess the state and development in different domains. These results often have a large impact on educational policy and on the perceptions of an educational system's performance. Early assessments, such as the First and…
Descriptors: Test Interpretation, International Assessment, Science Tests, Scores
Jun-ichiro Yasuda; Michael M. Hull; Naohiro Mae; Kentaro Kojima – Physical Review Physics Education Research, 2025
Although conceptual assessment tests are commonly administered at the beginning and end of a semester, this pre-post approach has inherent limitations. Specifically, education researchers and instructors have limited ability to observe the progression of students' conceptual understanding throughout the course. Furthermore, instructors are limited…
Descriptors: Computer Assisted Testing, Adaptive Testing, Science Tests, Scientific Concepts
David G. Schreurs; Jaclyn M. Trate; Shalini Srinivasan; Melonie A. Teichert; Cynthia J. Luxford; Jamie L. Schneider; Kristen L. Murphy – Chemistry Education Research and Practice, 2024
With the already widespread nature of multiple-choice assessments and the increasing popularity of answer-until-correct, it is important to have methods available for exploring the validity of these types of assessments as they are developed. This work analyzes a 20-question multiple choice assessment covering introductory undergraduate chemistry…
Descriptors: Multiple Choice Tests, Test Validity, Introductory Courses, Science Tests
E. B. Merki; S. I. Hofer; A. Vaterlaus; A. Lichtenberger – Physical Review Physics Education Research, 2025
When describing motion in physics, the selection of a frame of reference is crucial. The graph of a moving object can look quite different based on the frame of reference. In recent years, various tests have been developed to assess the interpretation of kinematic graphs, but none of these tests have specifically addressed differences in reference…
Descriptors: Graphs, Motion, Physics, Secondary School Students
Karoline A. Sachse; Sebastian Weirich; Nicole Mahler; Camilla Rjosk – International Journal of Testing, 2024
In order to ensure content validity by covering a broad range of content domains, the testing times of some educational large-scale assessments last up to a total of two hours or more. Performance decline over the course of taking the test has been extensively documented in the literature. It can occur due to increases in the numbers of: (a)…
Descriptors: Test Wiseness, Test Score Decline, Testing Problems, Foreign Countries
Conoyer, Sarah J.; Therrien, William J.; White, Kristen K. – Assessment for Effective Intervention, 2022
Meta-analysis was used to examine curriculum-based measurement in the content areas of social studies and science. Nineteen studies between the years of 1998 and 2020 were reviewed to determine overall mean correlation for criterion validity and examine alternate-form reliability and slope coefficients. An overall mean correlation of 0.59 was…
Descriptors: Curriculum Based Assessment, Test Validity, Test Reliability, Science Tests
Grace C. Tetschner; Sachin Nedungadi – Chemistry Education Research and Practice, 2025
Many undergraduate chemistry students hold alternate conceptions related to resonance--an important and fundamental topic of organic chemistry. To help address these alternate conceptions, an organic chemistry instructor could administer the resonance concept inventory (RCI), which is a multiple-choice assessment that was designed to identify…
Descriptors: Scientific Concepts, Concept Formation, Item Response Theory, Scores
Zhai, Xiaoming; Krajcik, Joseph; Pellegrino, James W. – Journal of Science Education and Technology, 2021
This study provides a solid validity inferential network to guide the development, interpretation, and use of machine learning-based next-generation science assessments (NGSAs). Given that machine learning (ML) has been broadly implemented in the automatic scoring of constructed responses, essays, simulations, educational games, and…
Descriptors: Artificial Intelligence, Science Tests, Test Validity, State Standards
Student, Sanford R.; Gong, Brian – Educational Measurement: Issues and Practice, 2022
We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from…
Descriptors: Science Tests, Test Validity, Test Items, Test Construction
Yalinkilic, Funda; Gul, Seyda – Science Insights Education Frontiers, 2023
The aim of this study is to develop a valid and reliable achievement test on the subject of 'Basic Compounds in the Structure of Living Things'. During the preparation of the draft form of the test, a 32 item-question pool was created by the researchers in the light of the relevant literature. Then, these questions were presented to expert opinion…
Descriptors: Test Construction, Science Achievement, Science Tests, Test Validity
Ruying Li; Gaofeng Li – International Journal of Science and Mathematics Education, 2025
Systems thinking (ST) is an essential competence for future life and biology learning. Appropriate assessment is critical for collecting sufficient information to develop ST in biology education. This research offers an ST framework based on a comprehensive understanding of biological systems, encompassing four skills across three complexity…
Descriptors: Test Construction, Test Validity, Science Tests, Cognitive Tests

