Hui Jin; Cynthia Lima; Limin Wang – Educational Measurement: Issues and Practice, 2025
Although AI transformer models have demonstrated notable capability in automated scoring, it is difficult to examine how and why these models fall short in scoring some responses. This study investigated how transformer models' language processing and quantification processes can be leveraged to enhance the accuracy of automated scoring. Automated…
Descriptors: Automation, Scoring, Artificial Intelligence, Accuracy
Neset Demirci – Turkish Online Journal of Educational Technology - TOJET, 2025
In this study, the performance of artificial intelligence chatbots--OpenAI's ChatGPT, Google Gemini, and Microsoft's Copilot--was evaluated and compared based on their responses to questions from the Turkish Higher Education Entrance Physics Examination over the past three years. Analysis of the chatbots' responses to TYT Physics questions showed…
Descriptors: Artificial Intelligence, College Entrance Examinations, Physics, Science Tests
Daibao Guo; Katherine Landau Wright; Lianne Josbacher; Eun Hye Son – Elementary School Journal, 2025
Limited research has explored the use of visual displays (ViDis) in science tests, making it challenging to know how these tests align with classroom instruction and what skills students need to be successful on these tests. Therefore, the current study aims to describe the use of ViDis in upper elementary grade standardized science tests. We…
Descriptors: Standardized Tests, Science Tests, Elementary Education, Science Education
Gerd Kortemeyer; Marina Babayeva; Giulia Polverini; Ralf Widenhorn; Bor Gregorcic – Physical Review Physics Education Research, 2025
We investigate the multilingual and multimodal performance of a large language model-based artificial intelligence (AI) system, GPT-4o, using a diverse set of physics concept inventories spanning multiple languages and subject categories. The inventories, sourced from the PhysPort website, cover classical physics topics such as mechanics,…
Descriptors: Artificial Intelligence, Physics, Science Tests, Scientific Concepts
Su, Kun; Henson, Robert A. – Journal of Educational and Behavioral Statistics, 2023
This article provides a process to carefully evaluate the suitability of a content domain for which diagnostic classification models (DCMs) could be applicable and then optimized steps for constructing a test blueprint for applying DCMs and a real-life example illustrating this process. The content domains were carefully evaluated using a set of…
Descriptors: Classification, Models, Science Tests, Physics
Ntumi, Simon; Agbenyo, Sheilla; Bulala, Tapela – Shanlax International Journal of Education, 2023
There is no need or point to testing of knowledge, attributes, traits, behaviours or abilities of an individual if information obtained from the test is inaccurate. However, by and large, it seems the estimation of psychometric properties of test items in classrooms has been completely ignored otherwise dying slowly in most testing environments. In…
Descriptors: Psychometrics, Accuracy, Test Validity, Factor Analysis
Nedungadi, Sachin; Rinco Michels, Olga; Kreke, Patricia J.; Raker, Jeffrey R.; Murphy, Kristen L. – Journal of Chemical Education, 2022
Practice examinations developed at the ACS Examinations Institute ask students to self-report mental effort when answering items. This self-reported mental effort together with performance can be represented in the form of a cognitive efficiency graph for each student giving information on the utilization of cognitive resources and content…
Descriptors: Cognitive Processes, Science Tests, Test Items, Difficulty Level
Rekha; Shakeela K. – Journal on School Educational Technology, 2025
The main objective of the present study was to construct and standardize an achievement test in science for the secondary school science students in grade 8. An achievement test having 120 test items was prepared by the facilitator based on the four main learning objectives of teaching science that are knowledge, understanding, application, and…
Descriptors: Test Construction, Standardized Tests, Secondary School Students, Science Achievement
Jun-ichiro Yasuda; Michael M. Hull; Naohiro Mae; Kentaro Kojima – Physical Review Physics Education Research, 2025
Although conceptual assessment tests are commonly administered at the beginning and end of a semester, this pre-post approach has inherent limitations. Specifically, education researchers and instructors have limited ability to observe the progression of students' conceptual understanding throughout the course. Furthermore, instructors are limited…
Descriptors: Computer Assisted Testing, Adaptive Testing, Science Tests, Scientific Concepts
David G. Schreurs; Jaclyn M. Trate; Shalini Srinivasan; Melonie A. Teichert; Cynthia J. Luxford; Jamie L. Schneider; Kristen L. Murphy – Chemistry Education Research and Practice, 2024
With the already widespread nature of multiple-choice assessments and the increasing popularity of answer-until-correct, it is important to have methods available for exploring the validity of these types of assessments as they are developed. This work analyzes a 20-question multiple choice assessment covering introductory undergraduate chemistry…
Descriptors: Multiple Choice Tests, Test Validity, Introductory Courses, Science Tests
Pentecost, Thomas C.; Raker, Jeffrey R.; Murphy, Kristen L. – Practical Assessment, Research & Evaluation, 2023
Using multiple versions of an assessment has the potential to introduce item environment effects. These types of effects result in version dependent item characteristics (i.e., difficulty and discrimination). Methods to detect such effects and resulting implications are important for all levels of assessment where multiple forms of an assessment…
Descriptors: Item Response Theory, Test Items, Test Format, Science Tests
E. B. Merki; S. I. Hofer; A. Vaterlaus; A. Lichtenberger – Physical Review Physics Education Research, 2025
When describing motion in physics, the selection of a frame of reference is crucial. The graph of a moving object can look quite different based on the frame of reference. In recent years, various tests have been developed to assess the interpretation of kinematic graphs, but none of these tests have specifically addressed differences in reference…
Descriptors: Graphs, Motion, Physics, Secondary School Students
Vy Le; Jayson M. Nissen; Xiuxiu Tang; Yuxiao Zhang; Amirreza Mehrabi; Jason W. Morphew; Hua Hua Chang; Ben Van Dusen – Physical Review Physics Education Research, 2025
In physics education research, instructors and researchers often use research-based assessments (RBAs) to assess students' skills and knowledge. In this paper, we support the development of a mechanics cognitive diagnostic to test and implement effective and equitable pedagogies for physics instruction. Adaptive assessments using cognitive…
Descriptors: Physics, Science Education, Scientific Concepts, Diagnostic Tests
Yasuda, Jun-ichiro; Hull, Michael M.; Mae, Naohiro – Physical Review Physics Education Research, 2023
We aim to graphically analyze the depth of conceptual understanding behind the Force Concept Inventory (FCI) responses of students, focusing on three questions (questions 1, 15, and 28). In our study, we created and implemented subquestions to clarify and quantify the students' reasoning steps in reaching their responses to the original FCI…
Descriptors: Scientific Concepts, Concept Formation, Misconceptions, Visual Aids
Grace C. Tetschner; Sachin Nedungadi – Chemistry Education Research and Practice, 2025
Many undergraduate chemistry students hold alternate conceptions related to resonance--an important and fundamental topic of organic chemistry. To help address these alternate conceptions, an organic chemistry instructor could administer the resonance concept inventory (RCI), which is a multiple-choice assessment that was designed to identify…
Descriptors: Scientific Concepts, Concept Formation, Item Response Theory, Scores

