Publication Date
| In 2026 | 12 |
| Since 2025 | 958 |
| Since 2022 (last 5 years) | 4567 |
| Since 2017 (last 10 years) | 10500 |
| Since 2007 (last 20 years) | 21963 |
Descriptor
| Test Validity | 21786 |
| Validity | 13791 |
| Test Reliability | 10864 |
| Foreign Countries | 9887 |
| Test Construction | 6897 |
| Factor Analysis | 5761 |
| Measures (Individuals) | 5633 |
| Predictive Validity | 5022 |
| Psychometrics | 4820 |
| Reliability | 4635 |
| Correlation | 4376 |
Audience
| Researchers | 1169 |
| Practitioners | 629 |
| Teachers | 336 |
| Administrators | 165 |
| Policymakers | 110 |
| Counselors | 63 |
| Students | 63 |
| Parents | 15 |
| Community | 12 |
| Media Staff | 10 |
| Support Staff | 8 |
Location
| Turkey | 1397 |
| Australia | 705 |
| Canada | 626 |
| China | 528 |
| United States | 439 |
| Indonesia | 389 |
| United Kingdom | 363 |
| Germany | 340 |
| California | 338 |
| Netherlands | 336 |
| Spain | 311 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 7 |
| Meets WWC Standards with or without Reservations | 12 |
| Does not meet standards | 10 |
Shirong Chen; Chao Han – Asia-Pacific Education Researcher, 2025
In rater-mediated assessment, annotating refers to a process in which raters note, comment, and/or mark on scripts while assessing. While annotations are studied as written corrective feedback that enhances learners' performance, their impact on raters' assessment processes and outcomes remains unclear. We conducted a quasi-experiment to explore…
Descriptors: Documentation, Peer Evaluation, Feedback (Response), Cognitive Processes
Jones, Brett D.; Khajavy, Gholam Hassan; Li, Ming; Mohamed, Hanaa Ezzat; Reilly, Peter – SAGE Open, 2023
This study examined whether the five scales of the MUSIC Model of Academic Motivation Inventory produced valid scores when used in university English language courses across four different countries. We surveyed 1,147 students in English language courses in Iran, Mexico, China, and Egypt and analyzed their responses by performing measurement…
Descriptors: Learning Motivation, Student Attitudes, Second Language Learning, Second Language Instruction
Aryadoust, Vahid – Language Testing, 2023
Construct validity and building validity arguments are some of the main challenges facing the language assessment community. The notion of construct validity and validity arguments arose from research in psychological assessment and developed into the gold standard of validation/validity research in language assessment. At a theoretical level,…
Descriptors: Testing Problems, Test Validity, Second Language Learning, Construct Validity
Er, Zübeyde; Dinç Artut, Perihan; Bal, Ayten Pinar – Pegem Journal of Education and Instruction, 2023
The aim of this research is to develop a reliable and valid scale to determine the mathematical thinking skills of gifted students. In addition, with the developed scale, the thinking skills of gifted students were examined in terms of various variables. In this context, the research was carried out on two different study groups. The first stage of…
Descriptors: Measures (Individuals), Rating Scales, Test Construction, Construct Validity
Im, Gwan-Hyeok; Shin, Dongil; Cheng, Liying – Language Testing in Asia, 2019
Purpose and background: The purpose of this paper is to critically review the traditional and contemporary validation frameworks--the content, criterion, and construct validations; the evidence-gathering; the socio-cognitive model; the test usefulness; and an argument-based approach--as well as empirical studies using an argument-based approach to…
Descriptors: Language Tests, Test Validity, Content Validity, Construct Validity
Rutten, Roel – Sociological Methods & Research, 2022
Applying qualitative comparative analysis (QCA) to large Ns relaxes researchers' case-based knowledge. This is problematic because causality in QCA is inferred from a dialogue between empirical, theoretical, and case-based knowledge. The lack of case-based knowledge may be remedied by various robustness tests. However, being a case-based method,…
Descriptors: Comparative Analysis, Correlation, Case Studies, Attribution Theory
Rohmad; Dharin, Abu; Azis, Donny Khoirul – Pegem Journal of Education and Instruction, 2022
This study aimed to find a valid and reliable self-assessment procedure and instrument to measure the spiritual and social attitude domain of Belief and Morality (Aqidah Akhlak) subject in Madrasah Tsanawiyah by implementing Borg & Gall's research and development model. The instruments' content validity was analyzed using the Aiken formula,…
Descriptors: Foreign Countries, Self Evaluation (Individuals), Attitude Measures, Social Attitudes
Öz, Serap; Özdemir, Ali – International Journal of Contemporary Educational Research, 2022
The purpose of this study is to develop a valid and reliable Likert-type scale that can be used to measure the data literacy skills of educators. In the development process of the scale, after reviewing the relevant literature, a pool of 130 items was designed and presented to the experts for their view. After the evaluation of experts, the…
Descriptors: Likert Scales, Test Construction, Construct Validity, Test Reliability
D. Princiotta; K. Caspary – SRI Education, a Division of SRI International, 2022
YouthTruth is a national survey project that harnesses student and stakeholder feedback to help guide decision-making by school leaders and education funders. With a grant from the Fund for Shared Insight, YouthTruth partnered with SRI Education to examine the relationship between key student experience scales and school-level academic and…
Descriptors: Elementary Secondary Education, Student Surveys, Student Experience, Outcomes of Education
Pablo Robles-García; Stuart McLean; Jeffrey Stewart; Ji-young Shin; Claudia Helena Sánchez-Gutiérrez – Language Assessment Quarterly, 2024
Recent literature in the field of L2 vocabulary assessment has advocated for the development of written receptive vocabulary tests such as Vocabulary Levels Tests (VLTs) that use: (a) meaning-recall item formats, (b) a minimum of 40 item counts per 1,000-frequency band to improve level estimates, and (c) lemmas (not word-families) as the lexical…
Descriptors: Spanish, Test Validity, Test Construction, Vocabulary Development
Agustina, Eka Nurmala Sari; Widadah, Soffil; Nisa, Putri Afinanun – Mathematics Teaching Research Journal, 2021
Education currently only prioritizes mastery of scientific aspects and students' intelligence. Math problems are still related to fictitious general knowledge. For this reason, local wisdom-based learning is needed whose learning is packaged using objects, events, and various things that are close to students' lives to raise the local potential of…
Descriptors: Mathematics Instruction, Problem Solving, Indigenous Knowledge, Values Education
Herman, Keith C.; Reinke, Wendy M.; Huang, Francis L.; Thompson, Aaron M.; Doyle-Barker, Levi – School Psychology, 2021
Early adolescence represents a critical developmental period for the identification, prevention, and early intervention of mental health concerns. The Early Identification System--Student Report (EIS-SR) was developed as a user-friendly, accessible, and cost-efficient method for identifying youth at risk for mental health concerns. The present…
Descriptors: Psychometrics, Identification, Screening Tests, Middle School Students
Kittelman, Angus; Mercer, Sterett H.; McIntosh, Kent; Nese, Rhonda N. T. – Grantee Submission, 2021
To identify the most effective strategies for implementing and sustaining Tier 2 and 3 behavior support systems, a measure of general and tier-specific factors hypothesized to predict sustained implementation is needed. To address this need, we conducted two studies examining the construct validity of the "Advanced Level Tier Interventions…
Descriptors: Positive Behavior Supports, Measures (Individuals), Test Construction, Construct Validity
André G. Bateman; Nicholas D. Myers; Deborah Feltz; Karin A. Pfeiffer; Kimberly Kelly; Alan L. Smith; Seungmin Lee; Adam McMahon; Isaac Prilleltensky; Ora Prilleltensky; Ahnalee M. Brincks – Measurement in Physical Education and Exercise Science, 2024
The purpose of this study was to explore the validity of evidence for self-efficacy to regulate physical activity scale (SERPA) measurement using an exploratory latent variable approach. The objectives were to explore the dimensionality, temporal invariance, and external validity of scores produced by the SERPA, a modified version of the barriers…
Descriptors: Adults, Obesity, Physical Activity Level, Self Management
Katie J. Robinson; David R. Lubans; Myrto F. Mavilidi; Francisco B. Ortega; Nicholas Riley – Measurement in Physical Education and Exercise Science, 2024
The purpose of our study was to assess the test-retest reliability and concurrent validity of the 30-sec Sit to Stand test in a sample of adolescents. We recruited 30 male (58%) and 22 female (42%) participants (mean age = 15.77 years ± 0.46). Participants completed the 30-sec Sit to Stand and standing long jump tests on two occasions separated by…
Descriptors: Test Validity, Adolescents, Psychomotor Skills, Test Reliability

