Showing 91 to 105 of 26,749 results
Peer reviewed
Mirjam Sophia Glessmer; Rachel Forsyth – Teaching & Learning Inquiry, 2025
Generative AI tools (GenAI) are increasingly used for academic tasks, including qualitative data analysis for the Scholarship of Teaching and Learning (SoTL). In our practice as academic developers, we are frequently asked for advice on whether this use for GenAI is reliable, valid, and ethical. Since this is a new field, we have not been able to…
Descriptors: Artificial Intelligence, Research Methodology, Data Analysis, Scholarship
Peer reviewed
Yangmeng Xu; Stefanie A. Wind – Educational Measurement: Issues and Practice, 2025
Double-scoring constructed-response items is a common but costly practice in mixed-format assessments. This study explored the impacts of Targeted Double-Scoring (TDS) and random double-scoring procedures on the quality of psychometric outcomes, including student achievement estimates, person fit, and student classifications under various…
Descriptors: Academic Achievement, Psychometrics, Scoring, Evaluation Methods
Peer reviewed
Juliana Reyes-Martin; David Simó-Pinatella; Ana Andrés – Journal of Applied Research in Intellectual Disabilities, 2025
Background: Behavioural problems in individuals with intellectual disabilities have a negative impact on them. Limited assessment measures exist in Spain. This study aimed to validate the Behavior Problems Inventory--Short Form (BPI-S) in the Spanish population by examining its psychometric properties and factorial structures. Method: This study…
Descriptors: Foreign Countries, Behavior Problems, Students with Disabilities, Intellectual Disability
Peer reviewed
Alberto Gandolfi – International Journal of Artificial Intelligence in Education, 2025
In this paper, we initially investigate the capabilities of GPT-3.5 and GPT-4 in solving college-level calculus problems, an essential segment of mathematics that remains under-explored so far. Although improving upon earlier versions, GPT-4 attains approximately 65% accuracy for standard problems and decreases to 20% for competition-like…
Descriptors: Artificial Intelligence, Reliability, Problem Solving, Mathematics Skills
Peer reviewed
Abdullah Faruk Kiliç; Meltem Acar Güvendir; Gül Güler; Tugay Kaçak – Measurement: Interdisciplinary Research and Perspectives, 2025
In this study, the extent to which wording effects impact structure and factor loadings, internal consistency and measurement invariance was outlined. The modified form, which includes items that are semantically reversed, explains 21.5% more variance than the original form. Also, reversed items' factor loadings are higher. As a result of CFA, indexes…
Descriptors: Test Items, Factor Structure, Test Reliability, Semantics
Peer reviewed
Alexandra Jackson; Cheryl Bodnar; Elise Barrella; Juan Cruz; Krista Kecskemety – Journal of STEM Education: Innovations and Research, 2025
Recent curricular interventions in engineering education have focused on encouraging students to develop an entrepreneurial mindset (EM) to equip them with the skills needed to generate innovative ideas and address complex global problems upon entering the workforce. Methods to evaluate these interventions have been inconsistent due to the lack of…
Descriptors: Engineering Education, Entrepreneurship, Concept Mapping, Student Evaluation
Peer reviewed
Brogan L. Barr; Virginia V. W. McIntosh; Eileen F. Britt; Jennifer Jordan; Janet D. Carter – Measurement: Interdisciplinary Research and Perspectives, 2024
Even when raters demonstrate agreement in the use of a measure, limited score variability or violation of often-ignored statistical assumptions can result in lower reliability estimates than intuitively expected. This article uses data drawn from two randomized controlled trials of schema therapy and cognitive behavioral therapy for the treatment…
Descriptors: Evaluators, Interrater Reliability, Reliability, Measurement Techniques
Peer reviewed
Pereira, Valerie J.; Tuomainen, Jyrki; Lee, Kathy Y. S.; Tong, Michael C. F.; Sell, Debbie A. – International Journal of Language & Communication Disorders, 2021
Background: The status of the velopharyngeal mechanism can be inferred from perceptual ratings of specified speech parameters. Several studies have proposed the measure of an overall velopharyngeal composite score based on these perceptual ratings and have reported good validity. The Cleft Audit Protocol for Speech--Augmented (CAPS-A) is a…
Descriptors: Congenital Impairments, Speech Tests, Outcome Measures, Test Validity
Peer reviewed
Victoria Reyes; Elizabeth Bogumil; Levin Elias Welch – Sociological Methods & Research, 2024
Transparency is once again a central issue of debate across types of qualitative research. Work on how to conduct qualitative data analysis, on the other hand, walks us through the step-by-step process on how to code and understand the data we've collected. Although there are a few exceptions, less focus is on transparency regarding…
Descriptors: Qualitative Research, Data Analysis, Guides, Databases
Peer reviewed
Stefanie A. Wind; Yuan Ge – Measurement: Interdisciplinary Research and Perspectives, 2024
Mixed-format assessments made up of multiple-choice (MC) items and constructed response (CR) items that are scored using rater judgments include unique psychometric considerations. When these item types are combined to estimate examinee achievement, information about the psychometric quality of each component can depend on that of the other. For…
Descriptors: Interrater Reliability, Test Bias, Multiple Choice Tests, Responses
Peer reviewed
Haiko Bruno Zimmermann; Debora Knihs; Raphael Sakugawa; Chris Bishop; Juliano Dal Pupo – Measurement in Physical Education and Exercise Science, 2024
Background: Measures that assess muscle strength and its development, either voluntarily or involuntarily, are important in the clinical and research context. The main aim of this study was to verify the interday reliability and the minimum detectable change (MDC) of the knee extensors muscles torque using evoked contractions and explosive…
Descriptors: Human Body, Physiology, Motor Reactions, Muscular Strength
Peer reviewed
William C. M. Belzak; Daniel J. Bauer – Journal of Educational and Behavioral Statistics, 2024
Testing for differential item functioning (DIF) has undergone rapid statistical developments recently. Moderated nonlinear factor analysis (MNLFA) allows for simultaneous testing of DIF among multiple categorical and continuous covariates (e.g., sex, age, ethnicity, etc.), and regularization has shown promising results for identifying DIF among…
Descriptors: Test Bias, Algorithms, Factor Analysis, Error of Measurement
Peer reviewed
Augustin Mutak; Robert Krause; Esther Ulitzsch; Sören Much; Jochen Ranger; Steffi Pohl – Journal of Educational Measurement, 2024
Understanding the intraindividual relation between an individual's speed and ability in testing scenarios is essential to assure a fair assessment. Different approaches exist for estimating this relationship, that either rely on specific study designs or on specific assumptions. This paper aims to add to the toolbox of approaches for estimating…
Descriptors: Testing, Academic Ability, Time on Task, Correlation
Peer reviewed
L.J.G. Krijnen; K. Greaves-Lord; W. Mandy; K.J.S. Mataw; P. Hartog; S. Begeer – Journal of Autism and Developmental Disorders, 2024
The current study evaluated a brief, informant-based autism interview: the Developmental, Dimensional and Diagnostic Interview -- Adult Version (3Di-Adult). Feasibility, reliability and validity of the Dutch 3Di-Adult were tested amongst autistic participants (n = 62) and a non-autistic comparison group (n = 30) in the Netherlands. The 3Di-Adult…
Descriptors: Autism Spectrum Disorders, Identification, Foreign Countries, Adults
Peer reviewed
Sidney Newton; Rui Wang – Educational Studies, 2024
Notwithstanding the neuromyth controversy, the malleability of learning style preferences impacts the validity of the measurement instrument and the effectiveness of the associated model of learning. This study investigates the test-retest reliability and underlying dynamics of Kolb's Learning Style Inventory (KLSI). It surveys 245 college-level…
Descriptors: Cognitive Style, Preferences, Reliability, Validity