Publication Date
| In 2026 | 0 |
| Since 2025 | 621 |
| Since 2022 (last 5 years) | 3121 |
| Since 2017 (last 10 years) | 7362 |
| Since 2007 (last 20 years) | 15000 |
Descriptor
| Test Reliability | 15006 |
| Test Validity | 10245 |
| Reliability | 9748 |
| Foreign Countries | 7119 |
| Test Construction | 4807 |
| Validity | 4189 |
| Measures (Individuals) | 3872 |
| Factor Analysis | 3820 |
| Psychometrics | 3513 |
| Interrater Reliability | 3117 |
| Correlation | 3037 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1319 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 249 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 216 |
| California | 214 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Zhou, Shuqi; Merzdorf, Hillary E.; Douglas, Kerrie A.; Moore, Tamara J. – Journal of Pre-College Engineering Education Research, 2023
This study aimed to develop a K-12 classroom observation protocol to assess K-12 teachers' implementation of science, technology, engineering, and mathematics (STEM) integration. The intended purpose of the observation protocol is for researchers to examine how K-12 teachers implement the STEM integrated curriculum. Based on research on STEM…
Descriptors: Test Construction, Test Validity, STEM Education, Classroom Observation Techniques
Brogan L. Barr; Virginia V. W. McIntosh; Eileen F. Britt; Jennifer Jordan; Janet D. Carter – Measurement: Interdisciplinary Research and Perspectives, 2024
Even when raters demonstrate agreement in the use of a measure, limited score variability or violation of often-ignored statistical assumptions can result in lower reliability estimates than intuitively expected. This article uses data drawn from two randomized controlled trials of schema therapy and cognitive behavioral therapy for the treatment…
Descriptors: Evaluators, Interrater Reliability, Reliability, Measurement Techniques
McCluskey, Sydne – ProQuest LLC, 2023
Rater comparison analysis is commonly necessary in the social sciences. Conventional approaches to the problem generally focus on calculation of agreement statistics, which provide useful but incomplete information about rater agreement. Importantly, one-number agreement statistics give no indication regarding the nature of disagreements, nor do…
Descriptors: Bayesian Statistics, Structural Equation Models, Interrater Reliability, Beliefs
Luu, Kimberly; Sidhu, Ravi; Chadha, Neil K.; Eva, Kevin W. – Advances in Health Sciences Education, 2023
Clinical supervisors are known to assess trainee performance idiosyncratically, causing concern about the validity of their ratings. The literature on this issue relies heavily on retrospective collection of decisions, resulting in the risk of inaccurate information regarding what actually drives raters' perceptions. Capturing in-the-moment…
Descriptors: Clinical Experience, Practicum Supervision, Student Evaluation, Evaluation Methods
Egmose, Ida; Skou, Mia; Madsen, Eva Back; Stuart, Anne Christine; Krogh, Marianne Thode; Haase, Tina Wahl; Vaever, Mette Skovgaard – European Journal of Developmental Psychology, 2023
Mind-mindedness (MM) refers to the parent's ability to treat the child as an individual with a mind of his or her own. Studies have found representational and interactional MM to predict child development, but more research is needed on the validity of representational MM in parents of infants. Therefore, we examine the reliability and validity of…
Descriptors: Individualism, Mothers, Infants, Foreign Countries
Feldberg, Zachary R. – ProQuest LLC, 2023
Cognitive diagnostic models (CDMs) provide pedagogically relevant information in the form of a student profile of multiple binary categorizations of students into mastery or nonmastery statuses on latent traits called attributes. Federal educational accountability requires accountability measures to designate students into one of at least three…
Descriptors: Accountability, Standards, Cutting Scores, Models
Tavares, Walter; Kinnear, Benjamin; Schumacher, Daniel J.; Forte, Milena – Advances in Health Sciences Education, 2023
In this perspective, the authors critically examine "rater training" as it has been conceptualized and used in medical education. By "rater training," they mean the educational events intended to "improve" rater performance and contributions during assessment events. Historically, rater training programs have focused…
Descriptors: Medical Education, Interrater Reliability, Evaluation Methods, Training
Victoria Reyes; Elizabeth Bogumil; Levin Elias Welch – Sociological Methods & Research, 2024
Transparency is once again a central issue of debate across types of qualitative research. Work on how to conduct qualitative data analysis, on the other hand, walks us through the step-by-step process on how to code and understand the data we've collected. Although there are a few exceptions, less focus is on transparency regarding…
Descriptors: Qualitative Research, Data Analysis, Guides, Databases
Stefanie A. Wind; Yuan Ge – Measurement: Interdisciplinary Research and Perspectives, 2024
Mixed-format assessments made up of multiple-choice (MC) items and constructed response (CR) items that are scored using rater judgments include unique psychometric considerations. When these item types are combined to estimate examinee achievement, information about the psychometric quality of each component can depend on that of the other. For…
Descriptors: Interrater Reliability, Test Bias, Multiple Choice Tests, Responses
Haiko Bruno Zimmermann; Debora Knihs; Raphael Sakugawa; Chris Bishop; Juliano Dal Pupo – Measurement in Physical Education and Exercise Science, 2024
Background: Measures that assess muscle strength and its development, either voluntarily or involuntarily, are important in the clinical and research context. The main aim of this study was to verify the interday reliability and the minimum detectable change (MDC) of the knee extensors muscles torque using evoked contractions and explosive…
Descriptors: Human Body, Physiology, Motor Reactions, Muscular Strength
William C. M. Belzak; Daniel J. Bauer – Journal of Educational and Behavioral Statistics, 2024
Testing for differential item functioning (DIF) has undergone rapid statistical developments recently. Moderated nonlinear factor analysis (MNLFA) allows for simultaneous testing of DIF among multiple categorical and continuous covariates (e.g., sex, age, ethnicity, etc.), and regularization has shown promising results for identifying DIF among…
Descriptors: Test Bias, Algorithms, Factor Analysis, Error of Measurement
Augustin Mutak; Robert Krause; Esther Ulitzsch; Sören Much; Jochen Ranger; Steffi Pohl – Journal of Educational Measurement, 2024
Understanding the intraindividual relation between an individual's speed and ability in testing scenarios is essential to assure a fair assessment. Different approaches exist for estimating this relationship, that either rely on specific study designs or on specific assumptions. This paper aims to add to the toolbox of approaches for estimating…
Descriptors: Testing, Academic Ability, Time on Task, Correlation
L.J.G. Krijnen; K. Greaves-Lord; W. Mandy; K.J.S. Mataw; P. Hartog; S. Begeer – Journal of Autism and Developmental Disorders, 2024
The current study evaluated a brief, informant-based autism interview: the Developmental, Dimensional and Diagnostic Interview -- Adult Version (3Di-Adult). Feasibility, reliability and validity of the Dutch 3Di-Adult was tested amongst autistic participants (n = 62) and a non-autistic comparison group (n = 30) in the Netherlands. The 3Di-Adult…
Descriptors: Autism Spectrum Disorders, Identification, Foreign Countries, Adults
Sidney Newton; Rui Wang – Educational Studies, 2024
Notwithstanding the neuromyth controversy, the malleability of learning style preferences impacts the validity of the measurement instrument and the effectiveness of the associated model of learning. This study investigates the test-retest reliability and underlying dynamics of Kolb's Learning Style Inventory (KLSI). It surveys 245 college-level…
Descriptors: Cognitive Style, Preferences, Reliability, Validity
Bronson Hui; Zhiyi Wu – Studies in Second Language Acquisition, 2024
A slowdown or a speedup in response times across experimental conditions can be taken as evidence of online deployment of knowledge. However, response-time difference measures are rarely evaluated on their reliability, and there is no standard practice to estimate it. In this article, we used three open data sets to explore an approach to…
Descriptors: Reliability, Reaction Time, Psychometrics, Criticism

