ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	12
Since 2006 (last 20 years)	24

Descriptor

Interrater Reliability	29
Statistical Analysis	29
Test Reliability	29
Test Validity	13
Correlation	9
Foreign Countries	8
Psychometrics	7
Test Construction	6
Evaluation Methods	5
Generalizability Theory	5
Observation	5
Measures (Individuals)	4
Scores	4
Accuracy	3
College Students	3
Elementary School Students	3
Elementary School Teachers	3
Factor Analysis	3
Graduate Students	3
Interaction	3
Intervention	3
Interviews	3
Physical Education	3
Preservice Teachers	3
Pretests Posttests	3

Publication Type

Journal Articles	24
Reports - Research	21
Reports - Evaluative	4
Guides - Non-Classroom	2
Reports - Descriptive	2
Tests/Questionnaires	2
Dissertations/Theses -…	1
Speeches/Meeting Papers	1

Education Level

Higher Education	10
Postsecondary Education	7
Elementary Education	6
Early Childhood Education	3
Elementary Secondary Education	3
Middle Schools	2
Preschool Education	2
Grade 1	1
Grade 2	1
Grade 3	1
Grade 5	1
Intermediate Grades	1
Primary Education	1
Secondary Education	1

Audience

Practitioners	1
Researchers	1

Location

Cyprus	2
South Carolina	2
Idaho	1
Japan	1
Jordan	1
Netherlands	1
Netherlands (Amsterdam)	1
New Zealand	1
Pennsylvania	1
United Kingdom	1

Assessments and Surveys

ACT Assessment	1
Advanced Placement…	1
Bracken Basic Concept Scale	1
SAT (College Admission Test)	1
Strengths and Difficulties…	1
Test of English as a Foreign…	1
Test of English for…	1

Showing 1 to 15 of 29 results

Using Differential Item Functioning to Test for Interrater Reliability in Constructed Response Items

Peer reviewed

Walker, Cindy M.; Göçer Sahin, Sakine – Educational and Psychological Measurement, 2020

The purpose of this study was to investigate a new way of evaluating interrater reliability that can allow one to determine if two raters differ with respect to their rating on a polytomous rating scale or constructed response item. Specifically, differential item functioning (DIF) analyses were used to assess interrater reliability and compared…

Descriptors: Test Bias, Interrater Reliability, Responses, Correlation

Practices in Instrument Use and Development in "Chemistry Education Research and Practice" 2010-2021

Peer reviewed

Lazenby, Katherine; Tenney, Kristin; Marcroft, Tina A.; Komperda, Regis – Chemistry Education Research and Practice, 2023

Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, "etc.") of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the…

Descriptors: Chemistry, Periodicals, Journal Articles, Science Education

The Counseling Competencies Scale: Validation and Refinement

Peer reviewed

Lambie, Glenn W.; Mullen, Patrick R.; Swank, Jacqueline M.; Blount, Ashley – Measurement and Evaluation in Counseling and Development, 2018

Supervisors evaluated counselors-in-training at multiple points during their practicum experience using the Counseling Competencies Scale (CCS; N = 1,070). The CCS evaluations were randomly split to conduct exploratory factor analysis and confirmatory factor analysis, resulting in a 2-factor model (61.5% of the variance explained).

Descriptors: Counselor Training, Counseling, Measures (Individuals), Competence

Inter-Rater and Test-Retest (Between-Sessions) Reliability of the 4-Skills Scan for Dutch Elementary School Children

Peer reviewed

van Kernebeek, Willem G.; de Schipper, Antoine W.; Savelsbergh, Geert J. P.; Toussaint, Huub M. – Measurement in Physical Education and Exercise Science, 2018

In The Netherlands, the 4-Skills Scan is an instrument for physical education teachers to assess gross motor skills of elementary school children. Little is known about its reliability. Therefore, in this study the test-retest and inter-rater reliability was determined. Respectively, 624 and 557 Dutch 6- to 12-year-old children were analyzed for…

Descriptors: Foreign Countries, Interrater Reliability, Pretests Posttests, Psychomotor Skills

Generalizability Theory Research on Developing a Scoring Rubric to Assess Primary School Students' Problem Posing Skills

Peer reviewed

Cankoy, Osman; Özder, Hasan – EURASIA Journal of Mathematics, Science & Technology Education, 2017

The aim of this study is to develop a scoring rubric to assess primary school students' problem posing skills. The rubric including five dimensions namely solvability, reasonability, mathematical structure, context and language was used. The raters scored the students' problem posing skills both with and without the scoring rubric to test the…

Descriptors: Generalizability Theory, Elementary School Students, Foreign Countries, Problem Solving

Measuring L2 Speakers' Interactional Ability Using Interactive Speech Tasks

Peer reviewed

van Batenburg, Eline S. L.; Oostdam, Ron J.; van Gelderen, Amos J. S.; de Jong, Nivja H. – Language Testing, 2018

This article explores ways to assess interactional performance, and reports on the use of a test format that standardizes the interlocutor's linguistic and interactional contributions to the exchange. It describes the construction and administration of six scripted speech tasks (instruction, advice, and sales tasks) with pre-vocational learners (n…

Descriptors: Second Language Learning, Speech Tests, Interaction, Test Reliability

A Comparison between Students' Self-Assessment and Teachers' Assessment

Peer reviewed

Thawabieh, Ahmad M. – Journal of Curriculum and Teaching, 2017

This study aimed to compare between the students' self-assessment and teachers' assessment. The study sample consisted of 71 students at Tafila Technical University studying Introduction to Psychology course. The researcher used 2 students' self-assessment tools and 2 tests. The results indicated that students can assess themselves accurately if…

Descriptors: Comparative Analysis, Self Evaluation (Individuals), Student Evaluation, Psychology

Exploring the Reliability of Generic and Content-Specific Instructional Aspects in Physical Education Lessons

Peer reviewed

Charalambous, Charalambos Y.; Kyriakides, Ermis; Tsangaridou, Niki; Kyriakides, Leonidas – School Effectiveness and School Improvement, 2017

Heightened accountability pressures and an increased emphasis on teaching quality have directed scholarly attention to scrutinizing instruction, particularly with respect to issues of validity and reliability. However, these attempts have largely been directed toward "core" content areas and investigated generic or content-specific…

Descriptors: Physical Education, Instructional Effectiveness, Lesson Plans, Interrater Reliability

Generalizability of Universal Screening Measures for Behavioral and Emotional Risk

Peer reviewed

Tanner, Nicholas; Eklund, Katie; Kilgus, Stephen P.; Johnson, Austin H. – School Psychology Review, 2018

Data derived from universal screening procedures are increasingly utilized by schools to identify and provide additional support to students at risk for behavioral and emotional concerns. As screening has the potential to be resource intensive, effort has been placed on the development of efficient screening procedures, including brief behavior…

Descriptors: Screening Tests, At Risk Students, Behavior Problems, Emotional Problems

Development and Validation of the Written Communication Assessment of the "HEIghten"® Outcomes Assessment Suite. Research Report. ETS RR-17-53

Peer reviewed

Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017

Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…

Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment

Examining the Reliability of Scores from the Consensual Assessment Technique in the Measurement of Individual and Small Group Creativity

Peer reviewed

Stefanic, Nicholas; Randles, Clint – Music Education Research, 2015

The purpose of this study was to explore the reliability of measures of both individual and group creative work using the consensual assessment technique (CAT). CAT was used to measure individual and group creativity among a population of pre-service music teachers enrolled in a secondary general music class (n = 23) and was evaluated from…

Descriptors: Music Education, Creativity, Preservice Teachers, Music Teachers

Modifying the System for Observing Fitness Instruction Time to Measure Teacher Practices Related to Physical Activity Promotion: SOFIT+

Peer reviewed

Weaver, R. Glenn; Webster, Collin A.; Erwin, Heather; Beighle, Aaron; Beets, Michael W.; Choukroun, Hadrien; Kaysing, Nicole – Measurement in Physical Education and Exercise Science, 2016

The System for Observing Fitness Instruction Time (SOFIT) is commonly used to measure variables related to physical activity during physical education (PE). However, SOFIT does not yield detailed information about teacher practices related to children's moderate-to-vigorous physical activity (MVPA). This study describes the modification of SOFIT…

Descriptors: Physical Education, Observation, Physical Activity Level, Teaching Methods

Measuring Rater Reliability on a Special Education Observation Tool

Peer reviewed

Semmelroth, Carrie Lisa; Johnson, Evelyn – Assessment for Effective Intervention, 2014

This study used generalizability theory to measure reliability on the Recognizing Effective Special Education Teachers (RESET) observation tool designed to evaluate special education teacher effectiveness. At the time of this study, the RESET tool included three evidence-based instructional practices (direct, explicit instruction; whole-group…

Descriptors: Observation, Special Education Teachers, Teacher Effectiveness, Teacher Evaluation

Consistency of Supervisor and Peer Ratings of Assessment Interviews Conducted by Psychology Trainees

Peer reviewed

Gonsalvez, Craig J.; Deane, Frank P.; Caputi, Peter – British Journal of Guidance & Counselling, 2016

Observation of counsellor skills through a one-way mirror, video or audio recording followed by supervisors and peers feedback is common in counsellor training. The nature and extent of agreement between supervisor-peer dyads are unclear. Using a standard scale, supervisors and peers rated 32 interviews by psychology trainees observed through a…

Descriptors: Interviews, Supervisory Methods, Trainees, Minimum Competency Testing

Preliminary Evidence of the Technical Adequacy of Additional Curriculum-Based Measures for Preschool Mathematics

Peer reviewed

Polignano, Joy C.; Hojnoski, Robin L. – Assessment for Effective Intervention, 2012

There has been increased attention to the development of assessment measures for evaluating mathematical skills in young children in order to inform instruction and intervention. However, existing tools have focused primarily on number sense with little attention to other areas of mathematical thinking such as geometry and algebra. The purpose of…

Descriptors: Numeracy, Curriculum Based Assessment, Test Reliability, Test Validity

Source

Assessment for Effective…	2
Measurement in Physical…	2
AILACTE Journal	1
American Journal of Evaluation	1
British Journal of Guidance &…	1
Chemistry Education Research…	1
Contemporary Issues in…	1
ETS Research Report Series	1
EURASIA Journal of…	1
Educational and Psychological…	1
Focus on Autism and Other…	1
Journal of Curriculum and…	1
Journal of Information…	1
Journal of Positive Behavior…	1
Journal of Special Education…	1
Journal of Vocational…	1
Language Testing	1
Measurement and Evaluation in…	1
Music Education Research	1
ProQuest LLC	1
Regional Educational…	1
School Effectiveness and…	1
School Psychology Review	1
Thought Currents in English…	1

Author

Bodur, Yasar	2
Unal, Aslihan	2
Unal, Zafer	2
Beets, Michael W.	1
Beighle, Aaron	1
Blount, Ashley	1
Boller, Kimberly	1
Bridgeman, Brent	1
Cankoy, Osman	1
Caputi, Peter	1
Charalambous, Charalambos Y.	1
Choukroun, Hadrien	1
Cost, Hollie C.	1
Deane, Frank P.	1
Eklund, Katie	1
Erwin, Heather	1
Freeman, Greta G.	1
Gonsalvez, Craig J.	1
Greatorex, Jackie	1
Göçer Sahin, Sakine	1
Hojnoski, Robin L.	1
Johnson, Austin H.	1
Johnson, Evelyn	1
Johnson, Evelyn S.	1