NotesFAQContact Us
Collection
Advanced
Search Tips
Showing 106 to 120 of 3,093 results Save | Export
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Jonathan K. Foster; Peter Youngs; Rachel van Aswegen; Samarth Singh; Ginger S. Watson; Scott T. Acton – Journal of Learning Analytics, 2024
Despite a tremendous increase in the use of video for conducting research in classrooms as well as preparing and evaluating teachers, there remain notable challenges to using classroom videos at scale, including time and financial costs. Recent advances in artificial intelligence could make the process of analyzing, scoring, and cataloguing videos…
Descriptors: Learning Analytics, Automation, Classification, Artificial Intelligence
Peer reviewed Peer reviewed
Direct linkDirect link
Taylor, Tessa; Lanovaz, Marc J. – Journal of Applied Behavior Analysis, 2022
Behavior analysts typically rely on visual inspection of single-case experimental designs to make treatment decisions. However, visual inspection is subjective, which has led to the development of supplemental objective methods such as the conservative dual-criteria method. To replicate and extend a study conducted by Wolfe et al. (2018) on the…
Descriptors: Visual Perception, Artificial Intelligence, Decision Making, Evaluators
Peer reviewed Peer reviewed
Direct linkDirect link
Hug, Sven E.; Ochsner, Michael – Research Evaluation, 2022
This study examines a basic assumption of peer review, namely, the idea that there is a consensus on evaluation criteria among peers, which is a necessary condition for the reliability of peer judgements. Empirical evidence indicating that there is no consensus or more than one consensus would offer an explanation for the "disagreement…
Descriptors: Peer Evaluation, Grants, Evaluation Criteria, Interrater Reliability
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Donoghue, John R.; McClellan, Catherine A.; Hess, Melinda R. – ETS Research Report Series, 2022
When constructed-response items are administered for a second time, it is necessary to evaluate whether the current Time B administration's raters have drifted from the scoring of the original administration at Time A. To study this, Time A papers are sampled and rescored by Time B scorers. Commonly the scores are compared using the proportion of…
Descriptors: Item Response Theory, Test Construction, Scoring, Testing
Peer reviewed Peer reviewed
Direct linkDirect link
Ole J. Kemi – Advances in Physiology Education, 2025
Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…
Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Wahyu Nanda Eka Saputra; Trikinasih Handayani; Prima Suci Rohmadheny; Rohmatus Naini; Dody Hartanto; Hardi Santosa; Dewi Afra Khairunnisa; Risma Risansyah; Hanan Riati; Faturrahman – Journal of Education and Learning (EduLearn), 2025
The students are urged to do something without expecting anything in return and only in the name of God. Every islamic student becomes something ideal if they can internalize and implement sincerity. Many people are willing to do something because of an ulterior motive. The importance of sincerity in humans is the background for developing a…
Descriptors: Islam, Interrater Reliability, Prosocial Behavior, Muslims
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Xiner Liu; Andres Felipe Zambrano; Ryan S. Baker; Amanda Barany; Jaclyn Ocumpaugh; Jiayi Zhang; Maciej Pankiewicz; Nidhi Nasiar; Zhanlan Wei – Journal of Learning Analytics, 2025
This study explores the potential of the large language model GPT-4 as an automated tool for qualitative data analysis by educational researchers, exploring which techniques are most successful for different types of constructs. Specifically, we assess three different prompt engineering strategies -- Zero-shot, Few-shot, and Fewshot with…
Descriptors: Coding, Artificial Intelligence, Automation, Data Analysis
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Sevilmis, Ali; Yildiz, Özer – International Journal of Progressive Education, 2021
Reliability that can be proved by numeric indicators in quantitative studies has become a very discussible issue. The reason for this is to be thought that in qualitative researches, reliability is not based on positive perspective and those forming reliability criteria is difficult. However, for testing the reliability of a qualitative study or…
Descriptors: Interrater Reliability, Qualitative Research, Educational Research, Physical Education
Peer reviewed Peer reviewed
Direct linkDirect link
Fuentes, Milton A.; Reyes-Portillo, Jazmin A.; Tineo, Petty; Gonzalez, Kenny; Butt, Mamona – Hispanic Journal of Behavioral Sciences, 2021
While skin color is relevant and important in the Latinx community, as it is associated with colorism, little is known about how often it is measured or the best way to measure it. This article presents results from two studies examining these key concerns in three prominent journals, where Latinx research is typically published (i.e., the…
Descriptors: Hispanic Americans, Measures (Individuals), Undergraduate Students, Social Bias
Peer reviewed Peer reviewed
Direct linkDirect link
Norouzian, Reza – Studies in Second Language Acquisition, 2021
There has recently been a surge of interest in improving the replicability of second language (L2) research. However, less attention is paid to replicability in the context of L2 meta-analyses. I argue that conducting interrater reliability (IRR) analyses is a key step toward improving the replicability of L2 meta-analyses. To that end, I first…
Descriptors: Interrater Reliability, Second Languages, Language Research, Meta Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Fromm, Davida; Katta, Saketh; Paccione, Mason; Hecht, Sophia; Greenhouse, Joel; MacWhinney, Brian; Schnur, Tatiana T. – Journal of Speech, Language, and Hearing Research, 2021
Purpose: Analysis of connected speech in the field of adult neurogenic communication disorders is essential for research and clinical purposes, yet time and expertise are often cited as limiting factors. The purpose of this project was to create and evaluate an automated program to score and compute the measures from the Quantitative Production…
Descriptors: Speech, Automation, Statistical Analysis, Adults
Peer reviewed Peer reviewed
Direct linkDirect link
Anthony, Christopher J.; Styck, Kara M.; Volpe, Robert J.; Robert, Christopher R. – School Psychology, 2023
Although originally conceived of as a marriage of direct behavioral observation and indirect behavior rating scales, recent research has indicated that Direct Behavior Ratings (DBRs) are affected by rater idiosyncrasies (rater effects) similar to other indirect forms of behavioral assessment. Most of this research has been conducted using…
Descriptors: Item Response Theory, Generalizability Theory, Interrater Reliability, Behavior Rating Scales
Peer reviewed Peer reviewed
Direct linkDirect link
Bolton, Tiffany; Stevenson, Brittney; Janes, William – Journal of Occupational Therapy, Schools & Early Intervention, 2023
Researchers utilized a cross-sectional secondary analysis of data within an ongoing non-randomized controlled trial study design to establish the reliability and internal consistency of a novel handwriting assessment for preschoolers, the Just Write! (JW), written by the authors. Seventy-eight children from an area preschool participated in the…
Descriptors: Handwriting, Writing Skills, Writing Evaluation, Preschool Children
Yvette Jackson – ProQuest LLC, 2023
Rater-mediated activities in educational research occur when an expert judge or rater utilizes an instrument to judge persons or items and generates scale scores. Scale scores are from a subjective judgment and must undergo a quality control measure called rating quality. Rating quality in this study is broadly defined as the extent to which…
Descriptors: Educational Research, Evaluators, Test Theory, Item Response Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Arielle Boguslav; Julie Cohen – Journal of Teacher Education, 2024
Teacher preparation programs are increasingly expected to use data on preservice teacher (PST) skills to drive program improvement and provide targeted supports. Observational ratings are especially vital, but also prone to measurement issues. Scores may be influenced by factors unrelated to PSTs' instructional skills, including rater standards.…
Descriptors: Preservice Teachers, Measures (Individuals), Evaluation Problems, Teaching Skills
Pages: 1  |  ...  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  12  |  ...  |  207