Publication Date
| In 2026 | 0 |
| Since 2025 | 220 |
| Since 2022 (last 5 years) | 1089 |
| Since 2017 (last 10 years) | 2599 |
| Since 2007 (last 20 years) | 4960 |
Audience
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Trace, Jonathan – Language Testing, 2020
Originally designed to measure reading and passage comprehension in L1 readers, cloze tests continue to be used for L2 assessment purposes. However, there remain disputes about whether or not cloze items can measure beyond local comprehension information, as well as whether or not they are purely a test of reading alone, or if performance can be…
Descriptors: Cloze Procedure, Second Language Learning, Reading Comprehension, Native Language
Kara, Hakan; Cetin, Sevda – International Journal of Assessment Tools in Education, 2020
In this study, the efficiency of various random sampling methods to reduce the number of items rated by judges in an Angoff standard-setting study was examined and the methods were compared with each other. First, the full-length test was formed by combining Placement Test 2012 and 2013 mathematics subsets. After that, simple random sampling…
Descriptors: Cutting Scores, Standard Setting (Scoring), Sampling, Error of Measurement
Parry, James R. – Online Submission, 2020
This paper presents research and provides a method to ensure that parallel assessments that are generated from a large test-item database maintain equitable difficulty and content coverage each time the assessment is presented. To maintain fairness and validity, it is important that all instances of an assessment that is intended to test the…
Descriptors: Culture Fair Tests, Difficulty Level, Test Items, Test Validity
Gross, Rachel A. – ProQuest LLC, 2020
The present study was motivated by the theory-method mismatch between heterotypic continuity (aspects of development that manifest differently across the lifespan thus cannot be measured the same way over time) and longitudinal measurement equivalence (the statistical assumption that the developmental phenomenon studied is measured on the same…
Descriptors: Robustness (Statistics), Structural Equation Models, Longitudinal Studies, Error of Measurement
Kim, Hyun-Kyung; Kim, Haesun A. – International Journal of Science and Mathematics Education, 2022
The study aims to analyze student responses to chemistry constructed response items to obtain detailed information on science NAEA (National Assessment of Educational Achievement) in South Korea and to draw suggestions for enhancing curriculum, teaching, and learning. For this purpose, we analyzed 7444 answers that could be generalized as 1.29% of…
Descriptors: Foreign Countries, Science Achievement, Science Tests, Teaching Methods
Saepuzaman, Duden; Istiyono, Edi; Haryanto – Pegem Journal of Education and Instruction, 2022
HOTS is one part of the skills that need to be developed in the 21st century. This study aims to determine the characteristics of the Fundamental Physics Higher-order Thinking Skill (FundPhysHOTS) test for prospective physics teachers using Item Response Theory (IRT) analysis. This study uses a quantitative approach. 254 prospective physics…
Descriptors: Thinking Skills, Physics, Science Process Skills, Cognitive Tests
Tsuda, Emi; Ward, Phillip; Sazama, Debra; He, Yaohui; Lehwald, Harry; Ko, Bomna; Santiago, José A.; Xie, Xiuye – Physical Educator, 2022
The purpose of this study was to create a valid and reliable volleyball common content knowledge (VB-CCK) test in secondary physical education contexts in the United States. Two physical education teacher educators served as content experts and developed test items for the VB-CCK test. We then established content validity with a group of…
Descriptors: Team Sports, Knowledge Level, Test Validity, Test Reliability
Koch, Marco; Spinath, Frank M.; Greiff, Samuel; Becker, Nicolas – Journal of Intelligence, 2022
Figural matrices tasks are one of the most prominent item formats used in intelligence tests, and their relevance for the assessment of cognitive abilities is unquestionable. However, despite endeavors of the open science movement to make scientific research accessible on all levels, there is a lack of royalty-free figural matrices tests. The Open…
Descriptors: Intelligence, Intelligence Tests, Computer Assisted Testing, Test Items
Salami, Sedigheh; Bandeira, Paulo Felipe Ribeiro; Gomes, Cristiano Mauro Assis; Dehkordi, Parvaneh Shamsipour – Journal of Motor Learning and Development, 2022
Aim: To examine the latent structure of the "Test of Gross Motor Development--Third Edition" (TGMD-3) with a bifactor modeling approach. In addition, the study examines the dimensionality and model-based reliability of general and specific contributions of the test's subscales and measurement invariance of the TGMD-3. Methods: A…
Descriptors: Children, Norm Referenced Tests, Motor Development, Psychomotor Skills
Krzic, Maja; Brown, Sandra – Natural Sciences Education, 2022
The transition of our large (approximately 300-student) introductory soil science course to the online setting created several challenges, including engaging first- and second-year students, providing meaningful hands-on learning activities, and setting up online exams. The objective of this paper is to describe the development and use of…
Descriptors: Introductory Courses, Social Sciences, Online Courses, Educational Change
Foster, Colin – International Journal of Science and Mathematics Education, 2022
Confidence assessment (CA) involves students stating alongside each of their answers a confidence rating (e.g. 0 low to 10 high) to express how certain they are that their answer is correct. Each student's score is calculated as the sum of the confidence ratings on the items that they answered correctly, minus the sum of the confidence ratings on…
Descriptors: Mathematics Tests, Mathematics Education, Secondary School Students, Meta Analysis
Çetin, Münevver; Karaokur Akdag, Seyma – Journal of Education and Learning, 2022
The aim of this study was to develop a scaling instrument for measuring organizational development level in the Turkish higher education context depending on perceptions of the faculty. The sample consisted of academicians of higher education institutions in the 2020-2021 academic year. Data were gathered in two stages. Exploratory Factor Analysis…
Descriptors: Organizational Development, Likert Scales, Measures (Individuals), Test Construction
Russell, Michael; Szendey, Olivia; Li, Zhushan – Educational Assessment, 2022
Recent research provides evidence that an intersectional approach to defining reference and focal groups results in a higher percentage of comparisons flagged for potential DIF. The study presented here examined the generalizability of this pattern across methods for examining DIF. While the level of DIF detection differed among the four methods…
Descriptors: Comparative Analysis, Item Analysis, Test Items, Test Construction
He, Wei – NWEA, 2022
To ensure that student academic growth in a subject area is accurately captured, it is imperative that the underlying scale remains stable over time. As item parameter stability constitutes one of the factors that affects scale stability, NWEA® periodically conducts studies to check for the stability of the item parameter estimates for MAP®…
Descriptors: Achievement Tests, Test Items, Test Reliability, Academic Achievement
Tomkowicz, Joanna; Kim, Dong-In; Wan, Ping – Online Submission, 2022
In this study we evaluated the stability of item parameters and student scores, using the pre-equated (pre-pandemic) parameters from Spring 2019 and post-equated (post-pandemic) parameters from Spring 2021 in two calibration and equating designs related to item parameter treatment: re-estimating all anchor parameters (Design 1) and holding the…
Descriptors: Equated Scores, Test Items, Evaluation Methods, Pandemics

