Showing 1 to 15 of 29 results
Peer reviewed
Walker, Cindy M.; Göçer Sahin, Sakine – Educational and Psychological Measurement, 2020
The purpose of this study was to investigate a new way of evaluating interrater reliability that can allow one to determine if two raters differ with respect to their rating on a polytomous rating scale or constructed response item. Specifically, differential item functioning (DIF) analyses were used to assess interrater reliability and compared…
Descriptors: Test Bias, Interrater Reliability, Responses, Correlation
Peer reviewed
Lazenby, Katherine; Tenney, Kristin; Marcroft, Tina A.; Komperda, Regis – Chemistry Education Research and Practice, 2023
Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, etc.) of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the…
Descriptors: Chemistry, Periodicals, Journal Articles, Science Education
Peer reviewed
Lambie, Glenn W.; Mullen, Patrick R.; Swank, Jacqueline M.; Blount, Ashley – Measurement and Evaluation in Counseling and Development, 2018
Supervisors evaluated counselors-in-training at multiple points during their practicum experience using the Counseling Competencies Scale (CCS; N = 1,070). The CCS evaluations were randomly split to conduct exploratory factor analysis and confirmatory factor analysis, resulting in a 2-factor model (61.5% of the variance explained).
Descriptors: Counselor Training, Counseling, Measures (Individuals), Competence
Peer reviewed
van Kernebeek, Willem G.; de Schipper, Antoine W.; Savelsbergh, Geert J. P.; Toussaint, Huub M. – Measurement in Physical Education and Exercise Science, 2018
In the Netherlands, the 4-Skills Scan is an instrument for physical education teachers to assess gross motor skills of elementary school children. Little is known about its reliability. Therefore, in this study the test-retest and inter-rater reliability were determined. Respectively, 624 and 557 Dutch 6- to 12-year-old children were analyzed for…
Descriptors: Foreign Countries, Interrater Reliability, Pretests Posttests, Psychomotor Skills
Peer reviewed
Cankoy, Osman; Özder, Hasan – EURASIA Journal of Mathematics, Science & Technology Education, 2017
The aim of this study is to develop a scoring rubric to assess primary school students' problem-posing skills. A rubric comprising five dimensions (solvability, reasonability, mathematical structure, context, and language) was used. The raters scored the students' problem-posing skills both with and without the scoring rubric to test the…
Descriptors: Generalizability Theory, Elementary School Students, Foreign Countries, Problem Solving
Peer reviewed
van Batenburg, Eline S. L.; Oostdam, Ron J.; van Gelderen, Amos J. S.; de Jong, Nivja H. – Language Testing, 2018
This article explores ways to assess interactional performance, and reports on the use of a test format that standardizes the interlocutor's linguistic and interactional contributions to the exchange. It describes the construction and administration of six scripted speech tasks (instruction, advice, and sales tasks) with pre-vocational learners (n…
Descriptors: Second Language Learning, Speech Tests, Interaction, Test Reliability
Peer reviewed
PDF on ERIC
Thawabieh, Ahmad M. – Journal of Curriculum and Teaching, 2017
This study aimed to compare students' self-assessment with teachers' assessment. The study sample consisted of 71 students at Tafila Technical University taking the Introduction to Psychology course. The researcher used two student self-assessment tools and two tests. The results indicated that students can assess themselves accurately if…
Descriptors: Comparative Analysis, Self Evaluation (Individuals), Student Evaluation, Psychology
Peer reviewed
Charalambous, Charalambos Y.; Kyriakides, Ermis; Tsangaridou, Niki; Kyriakides, Leonidas – School Effectiveness and School Improvement, 2017
Heightened accountability pressures and an increased emphasis on teaching quality have directed scholarly attention to scrutinizing instruction, particularly with respect to issues of validity and reliability. However, these attempts have largely been directed toward "core" content areas and investigated generic or content-specific…
Descriptors: Physical Education, Instructional Effectiveness, Lesson Plans, Interrater Reliability
Peer reviewed
Tanner, Nicholas; Eklund, Katie; Kilgus, Stephen P.; Johnson, Austin H. – School Psychology Review, 2018
Data derived from universal screening procedures are increasingly utilized by schools to identify and provide additional support to students at risk for behavioral and emotional concerns. As screening has the potential to be resource intensive, effort has been placed on the development of efficient screening procedures, including brief behavior…
Descriptors: Screening Tests, At Risk Students, Behavior Problems, Emotional Problems
Peer reviewed
PDF on ERIC
Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017
Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…
Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment
Peer reviewed
Stefanic, Nicholas; Randles, Clint – Music Education Research, 2015
The purpose of this study was to explore the reliability of measures of both individual and group creative work using the consensual assessment technique (CAT). CAT was used to measure individual and group creativity among a population of pre-service music teachers enrolled in a secondary general music class (n = 23) and was evaluated from…
Descriptors: Music Education, Creativity, Preservice Teachers, Music Teachers
Peer reviewed
Weaver, R. Glenn; Webster, Collin A.; Erwin, Heather; Beighle, Aaron; Beets, Michael W.; Choukroun, Hadrien; Kaysing, Nicole – Measurement in Physical Education and Exercise Science, 2016
The System for Observing Fitness Instruction Time (SOFIT) is commonly used to measure variables related to physical activity during physical education (PE). However, SOFIT does not yield detailed information about teacher practices related to children's moderate-to-vigorous physical activity (MVPA). This study describes the modification of SOFIT…
Descriptors: Physical Education, Observation, Physical Activity Level, Teaching Methods
Peer reviewed
Semmelroth, Carrie Lisa; Johnson, Evelyn – Assessment for Effective Intervention, 2014
This study used generalizability theory to measure reliability on the Recognizing Effective Special Education Teachers (RESET) observation tool designed to evaluate special education teacher effectiveness. At the time of this study, the RESET tool included three evidence-based instructional practices (direct, explicit instruction; whole-group…
Descriptors: Observation, Special Education Teachers, Teacher Effectiveness, Teacher Evaluation
Peer reviewed
Gonsalvez, Craig J.; Deane, Frank P.; Caputi, Peter – British Journal of Guidance & Counselling, 2016
Observation of counsellor skills through a one-way mirror, video or audio recording, followed by supervisor and peer feedback, is common in counsellor training. The nature and extent of agreement between supervisor-peer dyads are unclear. Using a standard scale, supervisors and peers rated 32 interviews by psychology trainees observed through a…
Descriptors: Interviews, Supervisory Methods, Trainees, Minimum Competency Testing
Peer reviewed
Polignano, Joy C.; Hojnoski, Robin L. – Assessment for Effective Intervention, 2012
There has been increased attention to the development of assessment measures for evaluating mathematical skills in young children in order to inform instruction and intervention. However, existing tools have focused primarily on number sense with little attention to other areas of mathematical thinking such as geometry and algebra. The purpose of…
Descriptors: Numeracy, Curriculum Based Assessment, Test Reliability, Test Validity