NotesFAQContact Us
Collection
Advanced
Search Tips
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 1 to 15 of 248 results Save | Export
Sherwin E. Balbuena – Online Submission, 2024
This study introduces a new chi-square test statistic for testing the equality of response frequencies among distracters in multiple-choice tests. The formula uses the information from the number of correct answers and wrong answers, which becomes the basis of calculating the expected values of response frequencies per distracter. The method was…
Descriptors: Multiple Choice Tests, Statistics, Test Validity, Testing
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Ali Orhan; Inan Tekin; Sedat Sen – International Journal of Assessment Tools in Education, 2025
In this study, it was aimed to translate and adapt the Computational Thinking Multidimensional Test (CTMT) developed by Kang et al. (2023) into Turkish and to investigate its psychometric qualities with Turkish university students. Following the translation procedures of the CTMT with 12 multiple-choice questions developed based on real-life…
Descriptors: Cognitive Tests, Thinking Skills, Computation, Test Validity
Peer reviewed Peer reviewed
Direct linkDirect link
Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods
Peer reviewed Peer reviewed
Direct linkDirect link
Hojung Kim; Changkyung Song; Jiyoung Kim; Hyeyun Jeong; Jisoo Park – Language Testing in Asia, 2024
This study presents a modified version of the Korean Elicited Imitation (EI) test, designed to resemble natural spoken language, and validates its reliability as a measure of proficiency. The study assesses the correlation between average test scores and Test of Proficiency in Korean (TOPIK) levels, examining score distributions among beginner,…
Descriptors: Korean, Test Validity, Test Reliability, Imitation
Peer reviewed Peer reviewed
Direct linkDirect link
Jerin Kim; Kent McIntosh – Journal of Positive Behavior Interventions, 2025
We aimed to identify empirically valid cut scores on the positive behavioral interventions and supports (PBIS) Tiered Fidelity Inventory (TFI) through an expert panel process known as bookmarking. The TFI is a measurement tool to evaluate the fidelity of implementation of PBIS. In the bookmark method, experts reviewed all TFI items and item scores…
Descriptors: Positive Behavior Supports, Cutting Scores, Fidelity, Program Evaluation
Peer reviewed Peer reviewed
Direct linkDirect link
E.?B. Merki; S.?I. Hofer; A. Vaterlaus; A. Lichtenberger – Physical Review Physics Education Research, 2025
When describing motion in physics, the selection of a frame of reference is crucial. The graph of a moving object can look quite different based on the frame of reference. In recent years, various tests have been developed to assess the interpretation of kinematic graphs, but none of these tests have specifically addressed differences in reference…
Descriptors: Graphs, Motion, Physics, Secondary School Students
Peer reviewed Peer reviewed
Direct linkDirect link
Aditya Shah; Ajay Devmane; Mehul Ranka; Prathamesh Churi – Education and Information Technologies, 2024
Online learning has grown due to the advancement of technology and flexibility. Online examinations measure students' knowledge and skills. Traditional question papers include inconsistent difficulty levels, arbitrary question allocations, and poor grading. The suggested model calibrates question paper difficulty based on student performance to…
Descriptors: Computer Assisted Testing, Difficulty Level, Grading, Test Construction
Peer reviewed Peer reviewed
Direct linkDirect link
Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2022
The possible dependency of criterion validity on item formulation in a multicomponent measuring instrument is examined. The discussion is concerned with evaluation of the differences in criterion validity between two or more groups (populations/subpopulations) that have been administered instruments with items having differently formulated item…
Descriptors: Test Items, Measures (Individuals), Test Validity, Difficulty Level
Peer reviewed Peer reviewed
Direct linkDirect link
Martin Steinbach; Carolin Eitemüller; Marc Rodemer; Maik Walpuski – International Journal of Science Education, 2025
The intricate relationship between representational competence and content knowledge in organic chemistry has been widely debated, and the ways in which representations contribute to task difficulty, particularly in assessment, remain unclear. This paper presents a multiple-choice test instrument for assessing individuals' knowledge of fundamental…
Descriptors: Organic Chemistry, Difficulty Level, Multiple Choice Tests, Fundamental Concepts
Peer reviewed Peer reviewed
Direct linkDirect link
Apichat Khamboonruang – Language Testing in Asia, 2025
Chulalongkorn University Language Institute (CULI) test was developed as a local standardised test of English for professional and international communication. To ensure that the CULI test fulfils its intended purposes, this study employed Kane's argument-based validation and Rasch measurement approaches to construct the validity argument for the…
Descriptors: Universities, Second Language Learning, Second Language Instruction, Language Tests
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Suwita Suwita; Sulistyo Saputro; Sajidan Sajidan; Sutarno Sutarno – Journal of Baltic Science Education, 2024
The current study uses the Rasch Model to measure lower-secondary school students' critical thinking skills on photosynthesis topics. Critical thinking skills are considered essential in science education, but few valid and practical measurement instruments remain. The current study fills the gap by adapting the instrument from the Watson-Glaser…
Descriptors: Secondary School Students, Critical Thinking, Thinking Skills, Botany
Peer reviewed Peer reviewed
Direct linkDirect link
Ruying Li; Gaofeng Li – International Journal of Science and Mathematics Education, 2025
Systems thinking (ST) is an essential competence for future life and biology learning. Appropriate assessment is critical for collecting sufficient information to develop ST in biology education. This research offers an ST framework based on a comprehensive understanding of biological systems, encompassing four skills across three complexity…
Descriptors: Test Construction, Test Validity, Science Tests, Cognitive Tests
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Y. Yokhebed; Rexy Maulana Dwi Karmadi; Luvia Ranggi Nastiti – Journal of Biological Education Indonesia (Jurnal Pendidikan Biologi Indonesia), 2025
Although self-assessment in critical thinking is thought to help students recognise their strengths and weaknesses, the reliability and validity of the assessment tool is still questionable, so a more objective evaluation is needed. Objective of this investigation is to assess the self-assessment tools in evaluating students' critical thinking…
Descriptors: Self Evaluation (Individuals), Critical Thinking, Science and Society, Test Validity
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Wai Kei Chan; Li Zhang; Emily Pey-Tee Oon – International Journal of Assessment Tools in Education, 2023
We report the validity of a test instrument that assesses the arithmetic ability of primary students by (a) describing the theoretical model of arithmetic ability assessment using Wilson's (2004) four building blocks of constructing measures and (b) providing empirical evidence for the validation study. The instrument consists of 21…
Descriptors: Foreign Countries, Elementary School Students, Arithmetic, Grade 3
Peer reviewed Peer reviewed
Direct linkDirect link
Lyniesha Ward; Fridah Rotich; Jeffrey R. Raker; Regis Komperda; Sachin Nedungadi; Maia Popova – Chemistry Education Research and Practice, 2025
This paper describes the design and evaluation of the Organic chemistry Representational Competence Assessment (ORCA). Grounded in Kozma and Russell's representational competence framework, the ORCA measures the learner's ability to "interpret," "translate," and "use" six commonly used representations of molecular…
Descriptors: Organic Chemistry, Science Tests, Test Construction, Student Evaluation
Previous Page | Next Page »
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  ...  |  17