Zhang, Mo – ETS Research Report Series, 2013
Many testing programs use automated scoring to grade essays. One issue in automated essay scoring that has not been examined adequately is population invariance and its causes. The primary purpose of this study was to investigate the impact of sampling in model calibration on population invariance of automated scores. This study analyzed scores…
Descriptors: Automation, Scoring, Essay Tests, Sampling
Skaggs, Gary – Measurement: Interdisciplinary Research and Perspectives, 2013
The construct map is a particularly good way to approach instrument development, and this author states that he was delighted to read Adam Wyse's thoughts about how to use construct maps for standard setting. For a number of popular standard-setting methods, Wyse shows how typical feedback to panelists fits within a construct map framework.…
Descriptors: Standard Setting (Scoring), Maps, Test Construction, Measurement
Matson, Johnny L.; Horovitz, Max; Mahan, Sara; Fodstad, Jill – Research in Autism Spectrum Disorders, 2013
The purpose of this paper was to update the psychometrics of the "Matson Evaluation of Social Skills for Youngsters" ("MESSY") with children with Autism Spectrum Disorders (ASD), specifically with respect to internal consistency, split-half reliability, and inter-rater reliability. In Study 1, 114 children with ASD (Autistic Disorder, Asperger's…
Descriptors: Autism, Asperger Syndrome, Psychometrics, Caregivers
Boris, Ashley L.; Awadalla, Nardeen; Martin, Toby L.; Martin, Garry L.; Kaminski, Lauren; Miljkovic, Morena – Education and Training in Autism and Developmental Disabilities, 2015
The Assessment of Basic Learning Abilities (ABLA) is a tool that is used to assess the learning ability of individuals with intellectual disability (ID) and children with autism. The ABLA was recently revised and is now referred to as the ABLA-Revised (ABLA-R). A self-instructional manual was prepared to teach individuals how to administer the…
Descriptors: Guides, Academic Ability, Intellectual Disability, Autism
Rahn, Naomi L.; Wilson, Jennifer; Egan, Andrea; Brandes, Dana; Kunkel, Amy; Peterson, Meredith; McComas, Jennifer – Education and Treatment of Children, 2015
This study examined the effects of incremental rehearsal (IR) on letter sound expression for one kindergarten and one first grade English learner who were below district benchmark for letter sound fluency. A single-subject multiple-baseline design across sets of unknown letter sounds was used to evaluate the effect of IR on letter-sound expression…
Descriptors: English Language Learners, Kindergarten, Young Children, Grade 1
Jones, Ian; Inglis, Matthew – Educational Studies in Mathematics, 2015
School mathematics examination papers are typically dominated by short, structured items that fail to assess sustained reasoning or problem solving. A contributory factor to this situation is the need for student work to be marked reliably by a large number of markers of varied experience and competence. We report a study that tested an…
Descriptors: Problem Solving, Mathematics Instruction, Mathematics Tests, Test Items
Reese, R. Matthew; Jamison, T. Rene; Braun, Matt; Wendland, Maura; Black, William; Hadorn, Megan; Nelson, Eve-Lynn; Prather, Carole – Journal of Autism and Developmental Disorders, 2015
Children living in rural and underserved areas experience decreased access to health care services and are often diagnosed with autism at a later age compared to those living in urban or suburban areas. This study examines the utility and validity of an ASD assessment protocol conducted via video conferencing (VC). Participants (n = 17) included…
Descriptors: Rural Areas, Access to Health Care, Autism, Clinical Diagnosis
Rolider, Natalie U.; Iwata, Brian A.; Bullock, Christopher E. – Journal of Applied Behavior Analysis, 2012
We examined the effects of several variations in response rate on the calculation of total, interval, exact-agreement, and proportional reliability indices. Trained observers recorded computer-generated data that appeared on a computer screen. In Study 1, target responses occurred at low, moderate, and high rates during separate sessions so that…
Descriptors: Computation, Interrater Reliability, Intervals, Reliability
Brock, Matthew E.; Biggs, Elizabeth E.; Carter, Erik W.; Cattey, Gillian N.; Raley, Kevin S. – Journal of Special Education, 2016
Although research suggests peer support arrangements can be an effective practice for improving social outcomes for students with severe disabilities, additional efforts are needed to refine training and implementation approaches to increase the replicability and sustainability of this intervention. We tested a promising teacher-delivered training…
Descriptors: Middle School Students, Severe Disabilities, Inclusion, Special Education
Barton, Erin E.; Fuller, Elizabeth A.; Schnitz, Alana – Topics in Early Childhood Special Education, 2016
The purpose of this study was to examine the impact of performance feedback on preservice teachers' use of recommended practices within inclusive early childhood classrooms. A multiple baseline design across behaviors was used to examine the relation between performance feedback delivered via email and practicum students' use of target-recommended…
Descriptors: Feedback (Response), Faculty Development, Early Childhood Education, Preschool Teachers
Smarter Balanced Assessment Consortium, 2016
The goal of this study was to gather comprehensive evidence about the alignment of the Smarter Balanced summative assessments to the Common Core State Standards (CCSS). Alignment of the Smarter Balanced summative assessments to the CCSS is a critical piece of evidence regarding the validity of inferences students, teachers and policy makers can…
Descriptors: Alignment (Education), Summative Evaluation, Common Core State Standards, Test Content
Felderman, Theresa A. – Journal of College Teaching & Learning, 2014
Interteaching has been shown to be an effective alternative to traditional lecture in a number of studies, but thorough analyses of its components, including frequent exams, are limited. Research suggests that increasing the frequency of exams may improve student learning. This study assessed the effectiveness of interteaching's frequent exams component…
Descriptors: Community Colleges, Tests, Lecture Method, Academic Achievement
Güler, Nese – Eurasian Journal of Educational Research, 2014
Problem Statement: The most significant disadvantage of open-ended items that allow the valid measurement of upper level cognitive behaviours, such as synthesis and evaluation, is scoring. The difficulty associated with objectively scoring the answers to the items contributes to the reduction of the reliability of the scores. Moreover, other…
Descriptors: Item Response Theory, Statistics, Scoring, Reliability
Kan, Adnan; Bulut, Okan – Eurasian Journal of Educational Research, 2014
Problem Statement: Performance assessments have emerged as an alternative method to measure what a student knows and can do. One of the shortcomings of performance assessments is the subjectivity and inconsistency of raters in scoring. A common criticism of performance assessments is the subjective nature of scoring procedures. The effectiveness…
Descriptors: Performance Based Assessment, Scoring Rubrics, Models, Experienced Teachers
Moser, Gary P.; Sudweeks, Richard R.; Morrison, Timothy G.; Wilcox, Brad – Reading Psychology, 2014
This study examined ratings of fourth graders' oral reading expression. Randomly assigned participants (n = 36) practiced repeated readings using narrative or informational passages for 7 weeks. After this period raters used the "Multidimensional Fluency Scale" (MFS) on two separate occasions to rate students' expressive…
Descriptors: Elementary School Students, Oral Reading, Reading Skills, Suprasegmentals