Publication Date
| In 2026 | 3 |
| Since 2025 | 656 |
| Since 2022 (last 5 years) | 3157 |
| Since 2017 (last 10 years) | 7398 |
| Since 2007 (last 20 years) | 15036 |
Descriptor
| Test Reliability | 15028 |
| Test Validity | 10265 |
| Reliability | 9757 |
| Foreign Countries | 7137 |
| Test Construction | 4821 |
| Validity | 4191 |
| Measures (Individuals) | 3876 |
| Factor Analysis | 3822 |
| Psychometrics | 3520 |
| Interrater Reliability | 3124 |
| Correlation | 3039 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1326 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 251 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 216 |
| California | 214 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Maestrales, Sarah; Zhai, Xiaoming; Touitou, Israel; Baker, Quinton; Schneider, Barbara; Krajcik, Joseph – Journal of Science Education and Technology, 2021
In response to the call for promoting three-dimensional science learning (NRC, 2012), researchers argue for developing assessment items that go beyond rote memorization tasks to ones that require deeper understanding and the use of reasoning that can improve science literacy. Such assessment items are usually performance-based constructed…
Descriptors: Artificial Intelligence, Scoring, Evaluation Methods, Chemistry
Lenz, A. Stephen; Ho, Chia-Min; Rocha, Lauren; Aras, Yahyahan – Measurement and Evaluation in Counseling and Development, 2021
This study examined the degree that reliability coefficients for scores on the PTGI generalize across participant and study characteristics. Meta-analytic procedures resulted in observed and predicted mean alpha coefficients ranging from acceptable to excellent and appeared to be largely unrelated to the participant characteristics included in our…
Descriptors: Generalization, Test Reliability, Scores, Measures (Individuals)
Gwet, Kilem L. – Educational and Psychological Measurement, 2021
Cohen's kappa coefficient was originally proposed for two raters only, and it was later extended to an arbitrarily large number of raters to become what is known as Fleiss' generalized kappa. Fleiss' generalized kappa and its large-sample variance are still widely used by researchers and were implemented in several software packages, including, among…
Descriptors: Sample Size, Statistical Analysis, Interrater Reliability, Computation
Pérez-Castilla, Alejandro; Fernandes, John F. T.; Rojas, F. Javier; García-Ramos, Amador – Measurement in Physical Education and Exercise Science, 2021
This study explored the influence of different take-off thresholds on the reliability and magnitude of countermovement jump (CMJ) performance variables. Twenty-three men were tested on two separate sessions. CMJ performance variables were obtained against three external loads (0.5-30-60 kg) using three take-off thresholds: 10 N (arbitrary value of…
Descriptors: Physical Activities, Performance Tests, Reliability, College Students
Shin, Wonho; Park, Jongwon – International Journal of Science and Mathematics Education, 2021
The objective of this study was to understand behavioral characteristics of creative physicists during their growth period, and we want to use any insight gained to help teachers and parents encourage students' creativity in their everyday life. To do this, the critical incident technique was utilized to extract behavioral traits from six…
Descriptors: Psychological Characteristics, Behavior, Physics, Creativity
Steele, Catriona M.; Peladeau-Pigeon, Melanie; Nagy, Ahmed; Waito, Ashley A. – Journal of Speech, Language, and Hearing Research, 2020
Purpose: The field lacks consensus about preferred metrics for capturing pharyngeal residue on videofluoroscopy. We explored four different methods, namely, the visuoperceptual Eisenhuber scale and three pixel-based methods: (a) residue area divided by vallecular or pyriform sinus spatial housing ("%-Full"), (b) the Normalized Residue…
Descriptors: Human Body, Physiology, Speech Language Pathology, Measurement Techniques
Rohlfing, Ingo – Field Methods, 2020
Empirical researchers using qualitative comparative analysis (QCA) can work with crisp, multivalue, and fuzzy sets. The relative advantages of crisp and multivalue sets have been discussed in the QCA literature. There has been little reflection on the more frequent decision between crisp and fuzzy sets for which there often is no theoretical…
Descriptors: Qualitative Research, Comparative Analysis, Reliability, Classification
Hosseinali Gholami – Mathematics Teaching Research Journal, 2025
Scoring mathematics exam papers accurately is vital for fostering students' engagement and interest in the subject. Incorrect scoring practices can erode motivation and lead to the development of false self-confidence. Therefore, the implementation of appropriate scoring methods is essential for the success of mathematics education. This study…
Descriptors: Interrater Reliability, Mathematics Teachers, Scoring, Mathematics Tests
Amalia Sapriati; Mestika Sekarwinahyu; Maya Puspitasari; Fitria Amastini – Open Education Studies, 2025
This study presents a new instrument for assessing reflective thinking, self-efficacy, and self-regulated learning (SRL) among Universitas Terbuka postgraduate students registered in an online Research Methods course. The study validates the instrument's dependability (Cronbach's alpha = 0.918) and builds a strong factor structure by means of…
Descriptors: Test Construction, Test Validity, Reflection, Thinking Skills
Serkan Bengisu; Özlem Öge-Dasdögen; Rosemary Martino – International Journal of Language & Communication Disorders, 2025
Purpose: The most common cause of death in Turkey is attributed to vascular diseases, including stroke. Dysphagia stands out as one of the prevalent and life-threatening complications that often follow a stroke. Within the Turkish context, the availability of validated bedside screening tests for assessing dysphagia remains limited. The primary…
Descriptors: Foreign Countries, Eating Disorders, Screening Tests, Neurological Impairments
Yongzhong Yang; Haoran Xu – Journal of Creative Behavior, 2025
With the rapid advancement of artificial intelligence (AI), AI creativity has demonstrated significant potential for application across various fields. This study aims to explore the multidimensional characteristics of AI creativity from the audience's perspective and to develop a corresponding measurement scale. Specifically, Study 1 utilized…
Descriptors: Artificial Intelligence, Creativity, Measures (Individuals), Test Construction
Eun-Young Park – Journal of Applied Research in Intellectual Disabilities, 2025
Background: Individuals with intellectual disabilities are as vulnerable to depression as their typically developing peers. This study aimed to verify the reliability and validity of the Center for Epidemiologic Studies Depression Scale (CES-D) in individuals with intellectual disabilities and determine whether the scale is appropriate for…
Descriptors: Factor Structure, Factor Analysis, Depression (Psychology), Symptoms (Individual Disorders)
Tuba Gezer; Stella Y. Kim; Othelia EunKyoung Lee – Journal of Computing in Higher Education, 2025
Considering the rise of online education and an increasing number of students with disabilities in higher education, examining the validity of the Self-efficacy Questionnaire for Online Learning (SeQoL) for students with disabilities is warranted. The purpose of this study is to examine the reliability and validity of (SeQoL; Shen et al., 2013)…
Descriptors: College Students, Students with Disabilities, Test Validity, Self Efficacy
Yuane Jia; Amy B. Spagnolo; Nora Barrett; Ann A. Murphy; Peter M. Basto; Pamela Rothpletz-Puglia; Stuart Luther – Educational Technology Research and Development, 2025
The benefits of peer evaluation of teaching effectiveness and quality in higher education are well documented. While instruments exist for the review and evaluation of entire online courses, there is no standardized single-lesson, peer evaluation instrument available for online instruction. This pilot study focused on the validation of a peer…
Descriptors: Higher Education, Peer Evaluation, Lesson Observation Criteria, Test Construction
Jesús Manuel Soriano-Alcantara; Francisco D. Guillén-Gámez; Julio Ruiz-Palmero – Technology, Knowledge and Learning, 2025
Digital competencies are very significant in terms of integrating digital resources into educational processes. This study presents the validity and reliability of an instrument created by Carrera et al. (2011), in order to evaluate the basic digital competence of the three main educational agents of the educational community (teachers, students,…
Descriptors: Foreign Countries, Test Validity, Test Reliability, Digital Literacy

