Regional Educational Laboratory Southeast, 2020
This document contains the appendixes for the report, "The Reliability and Consequential Validity of Two Teacher-Administered Student Mathematics Diagnostic Assessments." Rather than relying on occasional testimonials from the field, decisions about using diagnostic assessments across the state should be based on psychometric data from an…
Descriptors: Mathematics Tests, Diagnostic Tests, Test Reliability, Test Validity
Benton, Stephen L.; Li, Dan – IDEA Center, Inc., 2018
This technical report describes the results of analyses performed on data collected from 2013 to 2017, using the IDEA Feedback System for Administrators (FSA). The FSA is used to gather impressions from core constituents about an administrator's performance of relevant administrative roles, as well as her/his leadership style, interpersonal…
Descriptors: Feedback (Response), Administrators, Administrator Attitudes, Administrator Role
Mustapha, Aida; Samsudin, Noor Azah; Arbaiy, Nurieze; Mohammed, Rozlini; Hamid, Isredza Rahmi – Turkish Online Journal of Educational Technology - TOJET, 2016
In programming, one problem can usually be solved using different logic and constructs while still producing the same output. Sometimes students get marked down inappropriately if their solutions do not follow the answer scheme. In addition, lab exercises and programming assignments are not necessarily graded by the instructors but most of the time…
Descriptors: Programming, Computer Science Education, Scoring Rubrics, Grading
Smarter Balanced Assessment Consortium, 2016
The goal of this study was to gather comprehensive evidence about the alignment of the Smarter Balanced summative assessments to the Common Core State Standards (CCSS). Alignment of the Smarter Balanced summative assessments to the CCSS is a critical piece of evidence regarding the validity of inferences students, teachers and policy makers can…
Descriptors: Alignment (Education), Summative Evaluation, Common Core State Standards, Test Content
Martinkova, Patricia; Goldhaber, Dan – Center for Education Data & Research, 2015
Inter-rater reliability, commonly assessed by the intra-class correlation coefficient (ICC), is an important index for describing the extent to which there is consistency amongst two or more raters in assigned measures. In organizational research, the data structure is often hierarchical and designs deviate substantially from the ideal of a balanced…
Descriptors: Teacher Selection, Interrater Reliability, Public School Teachers, Hierarchical Linear Modeling
Benton, Stephen L.; Gross, Amy B.; Pallett, William H.; Song, Jihyun; Webster, Russell; Guo, Meixi – IDEA Center, Inc., 2011
The IDEA Feedback for Administrators system provides feedback to academic administrators about their performance of relevant administrative responsibilities and their leadership style and interpersonal characteristics. The system is based on a model of reflective practice, which is consistent with The IDEA Center's longstanding approach to…
Descriptors: Feedback (Response), Administrators, Administrator Attitudes, Administrator Evaluation
Hixson, Nate; Rhudy, Vaughn – West Virginia Department of Education, 2013
Student responses to the West Virginia Educational Standards Test (WESTEST) 2 Online Writing Assessment are scored by a computer-scoring engine. The scoring method is not widely understood among educators, and there exists a misperception that it is not comparable to hand scoring. To address these issues, the West Virginia Department of Education…
Descriptors: Scoring Formulas, Scoring Rubrics, Interrater Reliability, Test Scoring Machines
Harris, Milton E.; Tiedemann-Fuller, Meghan – Journal of Psychoeducational Assessment, 2010
A table is provided giving observed difference frequencies for caregiver versus teacher ratings of children on the Child Behavior Checklist and Teacher's Report Form Internalizing, Externalizing, and Total Problems scales per the original normative samples. The table permits accurate evaluation of the empirical rarity of specific cross-informant…
Descriptors: Check Lists, Performance Based Assessment, Test Validity, Child Behavior
Coe, Michael; Hanita, Makoto; Nishioka, Vicki; Smiley, Richard – National Center for Education Evaluation and Regional Assistance, 2011
The 6+1 Trait® Writing model (Culham 2003) emphasizes writing instruction in which teachers and students analyze writing using a set of characteristics, or "traits," of written work: ideas, organization, voice, word choice, sentence fluency, conventions, and presentation. The Ideas trait includes the main content and message, including…
Descriptors: Models, Writing Instruction, Instructional Effectiveness, Grade 5
Hitchcock, John; Dimino, Joseph; Kurki, Anja; Wilkins, Chuck; Gersten, Russell – National Center for Education Evaluation and Regional Assistance, 2011
Collaborative Strategic Reading (CSR) is a set of instructional strategies designed to improve the reading comprehension of students with diverse abilities (Klingner and Vaughn 1996). Teachers implement CSR at the classroom level using scaffolded instruction to guide students in the independent use of four comprehension strategies; students apply…
Descriptors: Reading Comprehension, Reading Strategies, Educational Strategies, Validity
Fiore, Thomas A.; Nimkoff, Tamara; Munk, Tom; Carlson, Elaine – National Center for Education Evaluation and Regional Assistance, 2013
The "Personnel Development Program to Improve Services and Results for Children with Disabilities" is authorized under Section 662 of the Individuals with Disabilities Education Act (IDEA) and is known as the Personnel Development Program (PDP). The PDP is administered by the U.S. Department of Education's (ED's) Office of Special…
Descriptors: Staff Development, Federal Programs, Program Evaluation, Special Education
Nese, Joseph F. T.; Lai, Cheng-Fei; Anderson, Daniel; Park, Bitnara Jasmine; Tindal, Gerald; Alonzo, Julie – Behavioral Research and Teaching, 2010
The purpose of this study was to examine the alignment of the easyCBM® mathematics benchmark and progress monitoring measures to the National Council of Teachers of Mathematics "Curriculum Focal Points" (NCTM, 2006). Based on Webb's alignment model (1997, 2002), we collected expert judgments on individual math items across a sampling of forms…
Descriptors: Academic Standards, Mathematics Teachers, Benchmarking, Research Reports
Boldt, R. F. – 1992
The Test of Spoken English (TSE) is an internationally administered instrument for assessing nonnative speakers' proficiency in speaking English. The research foundation of the TSE examination described in its manual refers to two sources of variation other than the achievement being measured: interrater reliability and internal consistency.…
Descriptors: Adults, Analysis of Variance, Interrater Reliability, Language Proficiency
New Mexico Public Education Department, 2007
The purpose of the NMSBA technical report is to provide users and other interested parties with a general overview and the technical characteristics of the 2007 NMSBA. The 2007 technical report contains the following information: (1) Test development; (2) Scoring procedures; (3) Summary of student performance; (4) Statistical analyses of item and…
Descriptors: Interrater Reliability, Standard Setting, Measures (Individuals), Scoring
Angoff, William H. – 1989
This study was undertaken to test the hypothesis that items of the Test of English as a Foreign Language (TOEFL) containing reference to American people, places, customs, etc., tend to favor examinees who have spent some time living in the United States. Two samples of examinees were drawn from the March 1987 TOEFL administration, one tested in…
Descriptors: Context Effect, English (Second Language), Evaluators, Foreign Nationals