Publication Date
| In 2026 | 3 |
| Since 2025 | 666 |
| Since 2022 (last 5 years) | 3167 |
| Since 2017 (last 10 years) | 7408 |
| Since 2007 (last 20 years) | 15046 |
Descriptor
| Test Reliability | 15036 |
| Test Validity | 10272 |
| Reliability | 9759 |
| Foreign Countries | 7141 |
| Test Construction | 4823 |
| Validity | 4191 |
| Measures (Individuals) | 3877 |
| Factor Analysis | 3825 |
| Psychometrics | 3525 |
| Interrater Reliability | 3124 |
| Correlation | 3039 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1327 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 252 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 216 |
| California | 214 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
An, Mihee; Nord, Jayden; Koziol, Natalie A.; Dusing, Stacey C.; Kane, Audrey E.; Lobo, Michele A.; McCoy, Sarah W.; Harbourne, Regina T. – Grantee Submission, 2021
Aim: To describe the development of an intervention-specific fidelity measure and its utilization and to determine whether the newly developed Sitting Together and Reaching to Play (START-Play) intervention was implemented as intended. Also, to quantify differences between START-Play and usual early intervention (uEI) services. Method: A fidelity…
Descriptors: Test Construction, Measures (Individuals), Fidelity, Early Intervention
Edwards, Ashley A.; Joyner, Keanan J.; Schatschneider, Christopher – Educational and Psychological Measurement, 2021
The accuracy of certain internal consistency estimators has been questioned in recent years. The present study tests the accuracy of six reliability estimators (Cronbach's alpha, omega, omega hierarchical, Revelle's omega, and greatest lower bound) in 140 simulated conditions of unidimensional continuous data with uncorrelated errors with varying…
Descriptors: Reliability, Computation, Accuracy, Sample Size
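For readers unfamiliar with the first estimator named in the abstract above, the following is a minimal sketch of the textbook Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). It is not the study's simulation code; the function name `cronbach_alpha` and the toy data are assumptions made purely for illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Textbook Cronbach's alpha (illustrative helper, not from the cited study).

    scores: list of respondents, each a list of k item scores (k >= 2).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose rows (respondents) into columns (items).
    items = list(zip(*scores))
    # Sum of sample variances of the individual items.
    item_var_sum = sum(variance(col) for col in items)
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical data: three respondents answering two items.
toy = [[2, 3], [4, 4], [6, 5]]
print(round(cronbach_alpha(toy), 4))
```

The sketch uses sample variances (n - 1 denominator) throughout and will divide by zero if every respondent has the same total score.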
McLeod, Justin W.H.; McCrimmon, Adam W. – Journal of Psychoeducational Assessment, 2021
The "Raven's 2 Progressive Matrices Clinical Edition" (Raven's 2; Raven, Rust, Chan, & Zhou, 2018), published by NCS Pearson, is an individually administered nonverbal assessment of general cognitive ability developed to measure "educative abilities," defined as the ability to think clearly and solve complex problems in…
Descriptors: Test Reviews, Intelligence Tests, Testing, Test Reliability
Lee, Yi-Hsuan; Haberman, Shelby J. – Journal of Educational Measurement, 2021
For assessments that use different forms in different administrations, equating methods are applied to ensure comparability of scores over time. Ideally, a score scale is well maintained throughout the life of a testing program. In reality, instability of a score scale can result from a variety of causes, some of which are expected while others may be…
Descriptors: Scores, Regression (Statistics), Demography, Data
Maestrales, Sarah; Zhai, Xiaoming; Touitou, Israel; Baker, Quinton; Schneider, Barbara; Krajcik, Joseph – Journal of Science Education and Technology, 2021
In response to the call for promoting three-dimensional science learning (NRC, 2012), researchers argue for developing assessment items that go beyond rote memorization tasks to ones that require deeper understanding and the use of reasoning that can improve science literacy. Such assessment items are usually performance-based constructed…
Descriptors: Artificial Intelligence, Scoring, Evaluation Methods, Chemistry
Lenz, A. Stephen; Ho, Chia-Min; Rocha, Lauren; Aras, Yahyahan – Measurement and Evaluation in Counseling and Development, 2021
This study examined the degree that reliability coefficients for scores on the PTGI generalize across participant and study characteristics. Meta-analytic procedures resulted in observed and predicted mean alpha coefficients ranging from acceptable to excellent and appeared to be largely unrelated to the participant characteristics included in our…
Descriptors: Generalization, Test Reliability, Scores, Measures (Individuals)
Gwet, Kilem L. – Educational and Psychological Measurement, 2021
Cohen's kappa coefficient was originally proposed for two raters only, and it was later extended to an arbitrarily large number of raters to become what is known as Fleiss' generalized kappa. Fleiss' generalized kappa and its large-sample variance are still widely used by researchers and were implemented in several software packages, including, among…
Descriptors: Sample Size, Statistical Analysis, Interrater Reliability, Computation
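As background for the entry above, here is a minimal sketch of the standard point estimate of Fleiss' generalized kappa for a fixed number of raters per subject. It does not reproduce the variance estimators the article discusses; the function name `fleiss_kappa` and the input layout (one row per subject, one count per category) are assumptions for illustration.

```python
def fleiss_kappa(counts):
    """Standard Fleiss' kappa point estimate (illustrative helper).

    counts: list of subjects; each entry lists how many raters assigned
    that subject to each category. Every row must sum to the same
    number of raters n (n >= 2).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number of raters")
    # Per-subject agreement: share of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Overall proportion of assignments falling in each category.
    p_j = [sum(col) / (n_subjects * n_raters) for col in zip(*counts)]
    # Agreement expected by chance.
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical data: two subjects, three raters, two categories.
print(fleiss_kappa([[3, 0], [0, 3]]))
```

The estimate is undefined (division by zero) when all ratings fall into a single category, because chance agreement is then 1.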
Pérez-Castilla, Alejandro; Fernandes, John F. T.; Rojas, F. Javier; García-Ramos, Amador – Measurement in Physical Education and Exercise Science, 2021
This study explored the influence of different take-off thresholds on the reliability and magnitude of countermovement jump (CMJ) performance variables. Twenty-three men were tested on two separate sessions. CMJ performance variables were obtained against three external loads (0.5-30-60 kg) using three take-off thresholds: 10 N (arbitrary value of…
Descriptors: Physical Activities, Performance Tests, Reliability, College Students
Shin, Wonho; Park, Jongwon – International Journal of Science and Mathematics Education, 2021
The objective of this study was to understand behavioral characteristics of creative physicists during their growth period, and we want to use any insight gained to help teachers and parents encourage students' creativity in their everyday life. To do this, the critical incident technique was utilized to extract behavioral traits from six…
Descriptors: Psychological Characteristics, Behavior, Physics, Creativity
Steele, Catriona M.; Peladeau-Pigeon, Melanie; Nagy, Ahmed; Waito, Ashley A. – Journal of Speech, Language, and Hearing Research, 2020
Purpose: The field lacks consensus about preferred metrics for capturing pharyngeal residue on videofluoroscopy. We explored four different methods, namely, the visuoperceptual Eisenhuber scale and three pixel-based methods: (a) residue area divided by vallecular or pyriform sinus spatial housing ("%-Full"), (b) the Normalized Residue…
Descriptors: Human Body, Physiology, Speech Language Pathology, Measurement Techniques
Rohlfing, Ingo – Field Methods, 2020
Empirical researchers using qualitative comparative analysis (QCA) can work with crisp, multivalue, and fuzzy sets. The relative advantages of crisp and multivalue sets have been discussed in the QCA literature. There has been little reflection on the more frequent decision between crisp and fuzzy sets for which there often is no theoretical…
Descriptors: Qualitative Research, Comparative Analysis, Reliability, Classification
Hosseinali Gholami – Mathematics Teaching Research Journal, 2025
Scoring mathematics exam papers accurately is vital for fostering students' engagement and interest in the subject. Incorrect scoring practices can erode motivation and lead to the development of false self-confidence. Therefore, the implementation of appropriate scoring methods is essential for the success of mathematics education. This study…
Descriptors: Interrater Reliability, Mathematics Teachers, Scoring, Mathematics Tests
Amalia Sapriati; Mestika Sekarwinahyu; Maya Puspitasari; Fitria Amastini – Open Education Studies, 2025
This study presents a new instrument for assessing reflective thinking, self-efficacy, and self-regulated learning (SRL) among Universitas Terbuka postgraduate students registered in an online Research Methods course. The study validates the instrument's dependability (Cronbach's alpha = 0.918) and builds a strong factor structure by means of…
Descriptors: Test Construction, Test Validity, Reflection, Thinking Skills
Serkan Bengisu; Özlem Öge-Dasdögen; Rosemary Martino – International Journal of Language & Communication Disorders, 2025
Purpose: The most common cause of death in Turkey is attributed to vascular diseases, including stroke. Dysphagia stands out as one of the prevalent and life-threatening complications that often follow a stroke. Within the Turkish context, the availability of validated bedside screening tests for assessing dysphagia remains limited. The primary…
Descriptors: Foreign Countries, Eating Disorders, Screening Tests, Neurological Impairments
Yongzhong Yang; Haoran Xu – Journal of Creative Behavior, 2025
With the rapid advancement of artificial intelligence (AI), AI creativity has demonstrated significant potential for application across various fields. This study aims to explore the multidimensional characteristics of AI creativity from the audience's perspective and to develop a corresponding measurement scale. Specifically, Study 1 utilized…
Descriptors: Artificial Intelligence, Creativity, Measures (Individuals), Test Construction

