Showing 1 to 15 of 59 results
Peer reviewed
Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024
We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners
Peer reviewed
Zahn, Daniela; Canton, Ursula; Boyd, Victoria; Hamilton, Laura; Mamo, Josianne; McKay, Jane; Proudfoot, Linda; Telfer, Dickson; Williams, Kim; Wilson, Colin – Studies in Higher Education, 2021
Evaluating the impact of Academic Literacies teaching (Lea and Street [1998. "Student Writing in Higher Education: An Academic Literacies Approach." "Studies in Higher Education" 23 (2): 157-72. doi:10.1080/03075079812331380364]) is difficult, as it involves gauging whether writers: (1) gain better understanding of what…
Descriptors: Writing Evaluation, Evaluation Methods, Undergraduate Students, Foreign Countries
Kelvin Terrell Pompey – ProQuest LLC, 2021
Many methods are used to measure interrater reliability for studies where each target receives ratings by a different set of judges. The purpose of this study is to explore the use of hierarchical modeling for estimating interrater reliability using the intraclass correlation coefficient. This study provides a description of how the ICC can be…
Descriptors: Interrater Reliability, Evaluation Methods, Test Reliability, Correlation
Peer reviewed
Saito, Kazuya; Macmillan, Konstantinos; Kachlicka, Magdalena; Kunihara, Takuya; Minematsu, Nobuaki – Studies in Second Language Acquisition, 2023
Whereas many scholars have emphasized the relative importance of "comprehensibility" as an ecologically valid goal for L2 speech training, testing, and development, eliciting listeners' judgments is time-consuming. Following calls for research on more efficient L2 speech rating methods in applied linguistics, and growing attention toward…
Descriptors: Second Language Learning, Second Language Instruction, Interrater Reliability, Speech Communication
Peer reviewed
Looney, Marilyn A. – Measurement in Physical Education and Exercise Science, 2018
The purpose of this article was two-fold: (1) to provide an overview of the commonly reported and under-reported absolute agreement indices in the kinesiology literature for continuous data; and (2) to present examples of these indices for hypothetical data along with recommendations for future use. It is recommended that three types of information be…
Descriptors: Interrater Reliability, Evaluation Methods, Kinetics, Indexes
Peer reviewed
Leaman, Marion C.; Edmonds, Lisa A. – Journal of Speech, Language, and Hearing Research, 2021
Purpose: This study evaluated interrater reliability (IRR) and test-retest stability (TRTS) of seven linguistic measures (percent correct information units, relevance, subject-verb-[object], complete utterance, grammaticality, referential cohesion, global coherence), and communicative success in unstructured conversation and in a story narrative…
Descriptors: Aphasia, Psychometrics, Correlation, Speech Language Pathology
Bejarano, Carolina M.; Snow, Kelli; Lane, Hannah; Calvert, Hannah; Hoppe, Kate; Alfonsin, Nicole; Turner, Lindsey; Carlson, Jordan A. – Grantee Submission, 2019
Purpose: This study presents a novel methodology/process for assessing inclusion of theoretically-based implementation factors within available adoption-ready health promotion programs. Methods: Classroom-based physical activity (CBPA) programs were used as an example to describe the process. Our team selected an implementation science framework…
Descriptors: Evaluation Methods, Program Evaluation, Health Promotion, Physical Activity Level
Peer reviewed
Tavares, Walter; Brydges, Ryan; Myre, Paul; Prpic, Jason; Turner, Linda; Yelle, Richard; Huiskamp, Maud – Advances in Health Sciences Education, 2018
Assessment of clinical competence is complex and inference based. Trustworthy and defensible assessment processes must have favourable evidence of validity, particularly where decisions are considered high stakes. We aimed to organize, collect and interpret validity evidence for a high stakes simulation based assessment strategy for certifying…
Descriptors: Competence, Simulation, Allied Health Personnel, Certification
Peer reviewed
Guo, Xiuyan; Lei, Pui-Wa – International Journal of Testing, 2020
Little research has been done on the effects of peer raters' quality characteristics on peer rating qualities. This study aims to address this gap and investigate the effects of key variables related to peer raters' qualities, including content knowledge, previous rating experience, training on rating tasks, and rating motivation. In an experiment…
Descriptors: Peer Evaluation, Error Patterns, Correlation, Knowledge Level
Peer reviewed
Lambie, Glenn W.; Mullen, Patrick R.; Swank, Jacqueline M.; Blount, Ashley – Measurement and Evaluation in Counseling and Development, 2018
Supervisors evaluated counselors-in-training at multiple points during their practicum experience using the Counseling Competencies Scale (CCS; N = 1,070). The CCS evaluations were randomly split to conduct exploratory factor analysis and confirmatory factor analysis, resulting in a 2-factor model (61.5% of the variance explained).
Descriptors: Counselor Training, Counseling, Measures (Individuals), Competence
Kankaraš, Miloš; Feron, Eva; Renbarger, Rachel – OECD Publishing, 2019
Triangulation -- a combined use of different assessment methods or sources to evaluate psychological constructs -- is still a rarely used assessment approach in spite of its potential in overcoming inherent constraints of individual assessment methods. This paper uses field test data from a new OECD Study on Social and Emotional Skills to examine…
Descriptors: Interpersonal Competence, Emotional Intelligence, Evaluation Methods, Student Evaluation
Peer reviewed
Kuiken, Folkert; Vedder, Ineke – Language Testing, 2017
The importance of functional adequacy as an essential component of L2 proficiency has been observed by several authors (Pallotti, 2009; De Jong, Steinel, Florijn, Schoonen, & Hulstijn, 2012a, b). The rationale underlying the present study is that the assessment of writing proficiency in L2 is not fully possible without taking into account the…
Descriptors: Second Language Learning, Rating Scales, Computational Linguistics, Persuasive Discourse
Dockterman, Daniel Milo – ProQuest LLC, 2017
Student surveys have gained prominence in recent years as a way to give students a voice in their learning process, and teacher self-reports have always been an effective instrument for revealing the planning, intentions, and expectations behind a given lesson. Though student and teacher surveys are widely used, extant research in education has…
Descriptors: Outcome Measures, Teacher Evaluation, Student Evaluation of Teacher Performance, Evaluation Methods
Peer reviewed
Stefanic, Nicholas; Randles, Clint – Music Education Research, 2015
The purpose of this study was to explore the reliability of measures of both individual and group creative work using the consensual assessment technique (CAT). CAT was used to measure individual and group creativity among a population of pre-service music teachers enrolled in a secondary general music class (n = 23) and was evaluated from…
Descriptors: Music Education, Creativity, Preservice Teachers, Music Teachers
Peer reviewed
Pijl, Mirjam K. J.; Rommelse, Nanda N. J.; Hendriks, Monica; De Korte, Manon W. P.; Buitelaar, Jan K.; Oosterling, Iris J. – Autism: The International Journal of Research and Practice, 2018
The field of early autism research is in dire need of outcome measures that adequately reflect subtle changes in core autistic behaviors. This article compares the ability of a newly developed measure, the Brief Observation of Social Communication Change (BOSCC), and the Autism Diagnostic Observation Schedule (ADOS) to detect changes in core…
Descriptors: Intervention, Autism, Interpersonal Communication, Interrater Reliability