Sella-Weiss, Oshrat – International Journal of Language & Communication Disorders, 2023
Background: Quantitative measures can increase precision in describing swallowing function, improve interrater and test-retest reliability, and advance clinical decision-making. The Test of Mastication and Swallowing Solids (TOMASS) and the Timed Water Swallow Test (TWST) are functional tests for swallowing that provide quantitative results. Aims:…
Descriptors: Human Body, Motor Reactions, Tests, Test Reliability
Joshua B. Gilbert; James G. Soland; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2025
Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) may affect VA estimates is less studied. We examine the…
Descriptors: Value Added Models, Tests, Testing, Scoring
Yücel Makaraci; Kazim Nas; Kerem Gündüz; Abdullah Uysal; Samuel T. Orange; Juan D. Ruiz-Cárdenas – Measurement in Physical Education and Exercise Science, 2024
The aim was to determine the validity and test-retest reliability of the Sit to Stand App variables (rising time, vertical velocity, and power) for measuring single-leg sit-to-stand (STS) test compared to those derived from ground reaction force data. Twenty-seven female athletes performed the single-leg STS test over three consecutive sessions…
Descriptors: Computer Simulation, Measurement Techniques, Athletics, Physical Fitness
Tugra Karademir Coskun; Ayfer Alper – Digital Education Review, 2024
This study aims to examine the potential differences between teacher evaluations and artificial intelligence (AI) tool-based assessment systems in university examinations. The research has evaluated a wide spectrum of exams including numerical and verbal course exams, exams with different assessment styles (project, test exam, traditional exam),…
Descriptors: Artificial Intelligence, Visual Aids, Video Technology, Tests
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
Fergadiotis, Gerasimos; Casilio, Marianne; Dickey, Michael Walsh; Steel, Stacey; Nicholson, Hannele; Fleegle, Mikala; Swiderski, Alexander; Hula, William D. – Journal of Speech, Language, and Hearing Research, 2023
Purpose: Item response theory (IRT) is a modern psychometric framework with several advantageous properties as compared with classical test theory. IRT has been successfully used to model performance on anomia tests in individuals with aphasia; however, all efforts to date have focused on noun production accuracy. The purpose of this study is to…
Descriptors: Item Response Theory, Psychometrics, Verbs, Naming
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Miranda, Constanza; Goñi, Julian; Pickenpack, Astrid; Sotomayor, Trinidad – International Journal of Technology and Design Education, 2022
K-12 Engineering Education has placed a lot of attention on students' attitudes or predispositions towards science and technology. However, most assessment methods are focused on STEM as a whole or only on technology. In this article, we will discuss the instrument called Technology and Engineering Attitude Scale (TEAS) which focuses on attitudes…
Descriptors: Elementary Secondary Education, Engineering Education, Test Validity, Foreign Countries
Koçak, Duygu – International Journal of Progressive Education, 2020
The aim of this study was to determine the effect of chance success on test equating. For this purpose, artificially generated data sets with sample sizes of 500 and 1000 were equated using linear and equipercentile equating methods. In the simulated data, a total of four cases were created with no…
Descriptors: Test Theory, Equated Scores, Error of Measurement, Sample Size
Liu, Xiaolu; Keating, Xiaofen D. – European Physical Education Review, 2021
Pre-service physical education teachers (PPETs) may be implementing health-related fitness testing (HRFT) in schools in the future. Thus, exploring their attitudes toward HRFT would help us understand physical education (PE) teachers' attitudes toward HRFT. This study investigated PPET attitudes toward HRFT in the USA and the effects of teacher…
Descriptors: Preservice Teachers, Physical Education Teachers, Student Attitudes, Physical Fitness
Alkis Küçükaydin, Mensure; Akkanat, Çigdem – Problems of Education in the 21st Century, 2022
Computational thinking is recognized as a vital skill related to problem-solving in technological and non-technological fields. The existence of different sub-domains related to this skill has been pointed out. Therefore, there is a need for tools that measure these different sub-domains. Because of its structure that includes different skills,…
Descriptors: Elementary School Students, Thinking Skills, Computation, Tests
Atilgan, Hakan; Demir, Elif Kübra; Ogretmen, Tuncay; Basokcu, Tahsin Oguz – International Journal of Progressive Education, 2020
What level of reliability can be expected when open-ended questions are used in large-scale selection tests has become a critical question. One of the aims of the present study is to determine the reliability obtained when experts score test-takers' answers to open-ended short-answer questions used in…
Descriptors: Foreign Countries, Secondary School Students, Test Items, Test Reliability
Duncan, Michael J.; Richardson, Darren; Morris, Rhys; Eyre, Emma; Clarke, Neil D. – Journal of Motor Learning and Development, 2021
The present study examined the test-retest reliability of the Ghent University dribbling test and short dribble test in a pediatric population. Fifty-four boys aged 9-14 years (mean ± SD = 11 ± 2 years) undertook the Ghent University and dribbling tests on two occasions separated by 2 weeks. Intraclass correlation coefficients, coefficient of…
Descriptors: Pretests Posttests, Test Reliability, Team Sports, Tests
Schmitz, Boris; Pfeifer, Carina; Thorwesten, Lothar; Krüger, Michael; Klose, Andreas; Brand, Stefan-Martin – Research Quarterly for Exercise and Sport, 2020
Purpose: This study analyzed the physiological response during Yo-Yo Intermittent Recovery Level 1 (YYIR1) test and re-test by in-field ergospirometry and time-series analyses of respiratory parameters. Methods: Ten moderately trained males (23.4 ± 2.01 years, VO₂peak = 56.81 ± 10.75 mL·kg⁻¹·min⁻¹) completed…
Descriptors: Exercise Physiology, Males, Physical Activities, Test Validity
Jelicic, Mario; Ivancev, Vladimir; Cular, Dražen; Covic, Nedim; Stojanovic, Emilija; Scanlan, Aaron T.; Milanovic, Zoran – Research Quarterly for Exercise and Sport, 2020
Purpose: The purpose of this study was to determine the reliability, validity, and usefulness of 30-15 Intermittent Fitness Test (30-15 IFT) in female basketball players. Methods: Nineteen female basketball players (17.82 ± 1.94 yr, 175.4 ± 7.3 cm, 67.9 ± 7.7 kg) competing in the National Croatian League performed one trial of a…
Descriptors: Physical Fitness, Females, Athletes, Team Sports