Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 6 |
Since 2016 (last 10 years) | 20 |
Since 2006 (last 20 years) | 51 |
Descriptor
Correlation | 62 |
Interrater Reliability | 62 |
Test Reliability | 62 |
Test Validity | 38 |
Foreign Countries | 17 |
Psychometrics | 16 |
Evaluation Methods | 12 |
Scores | 12 |
Test Construction | 12 |
Children | 10 |
Measures (Individuals) | 10 |
Author
Anna-Maria Fall | 2 |
Beula M. Magimairaj | 2 |
Botting, Nicola | 2 |
Greg Roberts | 2 |
Hagiwara, Taku | 2 |
Ichikawa, Hironobu | 2 |
Inoue, Masahiko | 2 |
Kamio, Yoko | 2 |
Nakamura, Kazuhiko | 2 |
Philip Capin | 2 |
Ronald B. Gillam | 2 |
Education Level
Higher Education | 10 |
Postsecondary Education | 7 |
Early Childhood Education | 4 |
Elementary Education | 4 |
Preschool Education | 4 |
Secondary Education | 3 |
Grade 1 | 2 |
Elementary Secondary Education | 1 |
Grade 3 | 1 |
Grade 4 | 1 |
Grade 5 | 1 |
Audience
Researchers | 3 |
Administrators | 1 |
Location
Netherlands | 5 |
Japan | 3 |
Australia | 2 |
Florida | 2 |
Italy | 2 |
Pennsylvania | 2 |
Sweden | 2 |
Turkey | 2 |
United Kingdom | 2 |
United Kingdom (England) | 2 |
Washington | 2 |
Laws, Policies, & Programs
Individuals with Disabilities… | 1 |
Kelvin Terrell Pompey – ProQuest LLC, 2021
Many methods are used to measure interrater reliability for studies where each target receives ratings by a different set of judges. The purpose of this study is to explore the use of hierarchical modeling for estimating interrater reliability using the intraclass correlation coefficient. This study provides a description of how the ICC can be…
Descriptors: Interrater Reliability, Evaluation Methods, Test Reliability, Correlation
Using Differential Item Functioning to Test for Interrater Reliability in Constructed Response Items
Walker, Cindy M.; Göçer Sahin, Sakine – Educational and Psychological Measurement, 2020
The purpose of this study was to investigate a new way of evaluating interrater reliability that can allow one to determine if two raters differ with respect to their rating on a polytomous rating scale or constructed response item. Specifically, differential item functioning (DIF) analyses were used to assess interrater reliability and compared…
Descriptors: Test Bias, Interrater Reliability, Responses, Correlation
Venkatraman, Yamini; Mahalingam, Shenbagavalli; Boominathan, Prakash – Journal of Speech, Language, and Hearing Research, 2022
Purpose: The Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) is a standardized instrument used in voice assessment to assess voice quality. It has been translated and culturally adapted in several languages. This study aimed at developing and validating a Tamil version of CAPE-V through auditory perceptual evaluation of remotely…
Descriptors: Sentences, Dravidian Languages, Acoustics, Auditory Perception
Parker, David C.; Stewart, Lisa H.; Thomson, Susan; Kaminski, Ruth A. – Assessment for Effective Intervention, 2021
Vocabulary skills are important for overall reading competence, but vocabulary assessment approaches that inform instructional decision-making and are sensitive to improvement are limited. This article describes a process for developing vocabulary measures designed to facilitate data-driven decision-making for kindergarten and first-grade students…
Descriptors: Vocabulary, Kindergarten, Grade 1, Elementary School Students
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Grantee Submission, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and considerable absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Language, Speech, and Hearing Services in Schools, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and considerable absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Szafran, Robert F. – Practical Assessment, Research & Evaluation, 2017
Institutional assessment of student learning objectives has become a fact-of-life in American higher education and the Association of American Colleges and Universities' (AAC&U) VALUE Rubrics have become a widely adopted evaluation and scoring tool for student work. As faculty from a variety of disciplines, some less familiar with the…
Descriptors: Interrater Reliability, Case Studies, Scoring Rubrics, Behavioral Objectives
Lambie, Glenn W.; Mullen, Patrick R.; Swank, Jacqueline M.; Blount, Ashley – Measurement and Evaluation in Counseling and Development, 2018
Supervisors evaluated counselors-in-training at multiple points during their practicum experience using the Counseling Competencies Scale (CCS; N = 1,070). The CCS evaluations were randomly split to conduct exploratory factor analysis and confirmatory factor analysis, resulting in a 2-factor model (61.5% of the variance explained).
Descriptors: Counselor Training, Counseling, Measures (Individuals), Competence
Benton, Stephen L.; Li, Dan – IDEA Center, Inc., 2018
This technical report describes the results of analyses performed on data collected from 2013 to 2017, using the IDEA Feedback System for Administrators (FSA). The FSA is used to gather impressions from core constituents about an administrator's performance of relevant administrative roles, as well as her/his leadership style, interpersonal…
Descriptors: Feedback (Response), Administrators, Administrator Attitudes, Administrator Role
Maxwell, Bruce; Boon, Helen; Tanchuk, Nicolas; Rauwerda, Bryan – Journal of Moral Education, 2021
This article documents the adaptation, piloting and validation of a measure of teachers' ethical sensitivity. To create the test, we modified a measure from dentistry drawing on literature in teacher professional ethics and drew on the expertise of professional ethics scholars and practitioners. Based on the results of Rasch analysis combined with…
Descriptors: Ethics, Moral Values, Scores, Teacher Education Programs
van Kernebeek, Willem G.; de Schipper, Antoine W.; Savelsbergh, Geert J. P.; Toussaint, Huub M. – Measurement in Physical Education and Exercise Science, 2018
In the Netherlands, the 4-Skills Scan is an instrument for physical education teachers to assess gross motor skills of elementary school children. Little is known about its reliability. Therefore, in this study the test-retest and inter-rater reliability were determined. Respectively, 624 and 557 Dutch 6- to 12-year-old children were analyzed for…
Descriptors: Foreign Countries, Interrater Reliability, Pretests Posttests, Psychomotor Skills
van Batenburg, Eline S. L.; Oostdam, Ron J.; van Gelderen, Amos J. S.; de Jong, Nivja H. – Language Testing, 2018
This article explores ways to assess interactional performance, and reports on the use of a test format that standardizes the interlocutor's linguistic and interactional contributions to the exchange. It describes the construction and administration of six scripted speech tasks (instruction, advice, and sales tasks) with pre-vocational learners (n…
Descriptors: Second Language Learning, Speech Tests, Interaction, Test Reliability
Thawabieh, Ahmad M. – Journal of Curriculum and Teaching, 2017
This study aimed to compare students' self-assessment with teachers' assessment. The study sample consisted of 71 students at Tafila Technical University enrolled in the Introduction to Psychology course. The researcher used 2 student self-assessment tools and 2 tests. The results indicated that students can assess themselves accurately if…
Descriptors: Comparative Analysis, Self Evaluation (Individuals), Student Evaluation, Psychology
Tanner, Nicholas; Eklund, Katie; Kilgus, Stephen P.; Johnson, Austin H. – School Psychology Review, 2018
Data derived from universal screening procedures are increasingly utilized by schools to identify and provide additional support to students at risk for behavioral and emotional concerns. As screening has the potential to be resource intensive, effort has been placed on the development of efficient screening procedures, including brief behavior…
Descriptors: Screening Tests, At Risk Students, Behavior Problems, Emotional Problems
Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017
Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…
Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment