Publication Date
In 2025 | 27 |
Since 2024 | 95 |
Since 2021 (last 5 years) | 356 |
Since 2016 (last 10 years) | 878 |
Since 2006 (last 20 years) | 2091 |
Descriptor
Interrater Reliability | 3093 |
Foreign Countries | 642 |
Evaluation Methods | 501 |
Test Reliability | 498 |
Test Validity | 406 |
Correlation | 401 |
Scoring | 336 |
Comparative Analysis | 327 |
Scores | 321 |
Validity | 309 |
Student Evaluation | 301 |
Audience
Researchers | 130 |
Practitioners | 42 |
Teachers | 22 |
Administrators | 11 |
Counselors | 3 |
Policymakers | 2 |
Location
Australia | 56 |
Turkey | 52 |
United Kingdom | 46 |
Canada | 45 |
Netherlands | 40 |
California | 37 |
China | 37 |
United States | 30 |
United Kingdom (England) | 24 |
Taiwan | 23 |
Japan | 22 |
What Works Clearinghouse Rating
Meets WWC Standards without Reservations | 3 |
Meets WWC Standards with or without Reservations | 3 |
Does not meet standards | 3 |
Catharine Lory; Emily Gregori – Behavioral Disorders, 2024
Systematic reviews of single-case experimental research (SCER) in special education often use the What Works Clearinghouse (WWC) Standards to assess the methodological rigor of studies within a given literature base. While significant changes were made between the two most recent versions of the WWC standards, no research to date has evaluated the…
Descriptors: Program Effectiveness, Standards, Evidence, Case Studies
Helen L. Long; Gordon Ramsay; Edina R. Bene; Pumpki Lei Su; Hyunjoo Yoo; Cheryl Klaiman; Stormi L. Pulver; Shana Richardson; Moira L. Pileggi; Natalie Brane; D. Kimbrough Oller – Autism: The International Journal of Research and Practice, 2024
This study explores vocal development as an early marker of autism, focusing on canonical babbling rate and onset, typically established by 7 months. Previous reports suggested delayed or reduced canonical babbling in infants later diagnosed with autism, but the story may be complicated. We present a prospective study on 44 infants later diagnosed…
Descriptors: Infants, Infant Behavior, Child Language, Oral Language
Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024
We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners
Ramy Shabara; Khaled ElEbyary; Deena Boraie – Teaching English with Technology, 2024
Although there are claims that ChatGPT, an AI-based language model, is capable of assessing the writing of L2 learners accurately and consistently in the classroom, a number of recent studies have shown discrepancies between AI and human raters. Furthermore, there is a lack of studies investigating the intrareliability of ChatGPT scores.…
Descriptors: Foreign Countries, Artificial Intelligence, Scoring Rubrics, Student Evaluation
Melanie Livet; Caryn S. Ward; Amelia Krysinski – National Implementation Research Network, 2024
Rivet Education developed a Professional Learning Partner Guide (PLPG), a searchable database of learning providers with expertise in the adoption and implementation of High-Quality Instructional Materials (HQIM), in response to the lack of resources and guidance available to support districts with evaluation of professional learning (PL)…
Descriptors: Professional Development, Directories, Databases, Instructional Materials
Yaniv Biton – Educational Process: International Journal, 2025
Background/purpose: This study addresses the challenge of engaging students in meaningful assessment processes within mathematics education, particularly in pre-university preparatory courses. Peer assessment, a growing pedagogical practice, can enhance learning outcomes, promote critical thinking, and improve collaborative skills. However,…
Descriptors: Foreign Countries, Mathematics Education, College Preparation, College Bound Students
Iglesias Pérez, M. C.; Vidal-Puga, J.; Pino Juste, M. R. – Studies in Higher Education, 2022
Self-assessment and peer assessment have clear advantages for the training of responsible, critical, and reflective professionals. In recent years, self and peer evaluation have also been shown to be even more effective than lecturer evaluation when we assure anonymity through online platforms learning tools. Therefore, self and peer assessments…
Descriptors: Role, Self Evaluation (Individuals), Peer Evaluation, Formative Evaluation
Vuijk, Richard; Deen, Mathijs; Arntz, Arnoud; Geurts, Hilde M. – Journal of Autism and Developmental Disorders, 2022
For autism spectrum disorder (ASD) in adults there are several diagnostic instruments available with a need for consideration of the psychometric properties. This study aimed to conduct a first psychometric evaluation of a new diagnostic ASD instrument, the NIDA (Dutch Interview for Diagnostic assessment of ASD in adults) in 90 adult males without…
Descriptors: Foreign Countries, Clinical Diagnosis, Psychometrics, Autism
Uyar, Seyma; Yayla, Onur; Zunber, Hidayet – International Journal of Assessment Tools in Education, 2022
The purpose of the current study is to examine the map reading skills of Social Studies pre-service teachers with orienteering, which is an activity-based and more active practice. To this end, a total of 10 students attending the Department of Social Studies Teaching in the Education Faculty of Burdur Mehmet Akif Ersoy University and taking the…
Descriptors: Map Skills, Navigation, Item Response Theory, Social Studies
Thorne, Casey Lee – Journal of Dance Education, 2022
The research outlined in this article offers a systematic training methodology for students and licensed Traditional Chinese Medicine (TCM) practitioners to learn the clinical art and science of pulsology through dance. One of the greatest hurdles in learning pulse palpation is a TCM practitioner's inability to feel the pulse with a degree of…
Descriptors: Dance Education, Metabolism, Medicine, Asian Culture
Kilic, Abdullah Faruk; Uysal, Ibrahim – International Journal of Assessment Tools in Education, 2022
Most researchers investigate the corrected item-total correlation of items when analyzing item discrimination in multi-dimensional structures under the Classical Test Theory, which might lead to underestimating item discrimination, thereby removing items from the test. Researchers might investigate the corrected item-total correlation with the…
Descriptors: Item Analysis, Correlation, Item Response Theory, Test Items
Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023
Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In the existing research work on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…
Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy
Martinková, Patrícia; Bartoš, František; Brabec, Marek – Journal of Educational and Behavioral Statistics, 2023
Inter-rater reliability (IRR), which is a prerequisite of high-quality ratings and assessments, may be affected by contextual variables, such as the rater's or ratee's gender, major, or experience. Identification of such heterogeneity sources in IRR is important for the implementation of policies with the potential to decrease measurement error…
Descriptors: Interrater Reliability, Bayesian Statistics, Statistical Inference, Hierarchical Linear Modeling
Schmidt, Ellyn M.; Rothenberg, W. Andrew; Davidson, Bridget C.; Barnett, Miya; Jent, Jason; Cadenas, Heleny; Fernandez, Corina; Davis, Eileen – Journal of Behavioral Education, 2023
Measuring classroom behavior among young children is important to guide assessment and intervention decisions, yet there is limited literature on appropriate direct observation tools for this purpose. This article describes the psychometric properties of the Behavior Assessment System for Children, Student Observation System (BASC-3 SOS) with 135…
Descriptors: Young Children, Special Education, Child Behavior, Psychometrics
Toma, Radu Bogdan – Technology, Knowledge and Learning, 2023
The development of computational thinking skills is attracting attention worldwide. The use of visual or block-based coding in primary schools has gained momentum. Yet, students' acceptance of such coding environments has been neglected in the literature. This study presents a measurement instrument that will allow pursuing such an endeavor. The…
Descriptors: Computation, Thinking Skills, Coding, Measurement