ERIC - Search Results

Showing 1 to 15 of 38 results

Agree to Disagree: Multiple Methods to Assess Rater Agreement during Student Teaching

Peer reviewed

Elayne P. Colón; Lori M. Dassa; Thomas M. Dana; Nathan P. Hanson – Action in Teacher Education, 2024

To meet accreditation expectations, teacher preparation programs must demonstrate their candidates are evaluated using summative assessment tools that yield sound, reliable, and valid data. These tools are primarily used by the clinical experience team -- university supervisors and mentor teachers. Institutional beliefs regarding best practices…

Descriptors: Student Teachers, Teacher Interns, Evaluation Methods, Interrater Reliability

Reliability Evidence for the NC Teacher Evaluation Process Using a Variety of Indicators of Inter-Rater Agreement

Peer reviewed
PDF on ERIC

Holcomb, T. Scott; Lambert, Richard; Bottoms, Bryndle L. – Journal of Educational Supervision, 2022

In this study, various statistical indexes of agreement were calculated using empirical data from a group of evaluators (n = 45) of early childhood teachers. The group of evaluators rated ten fictitious teacher profiles using the North Carolina Teacher Evaluation Process (NCTEP) rubric. The exact and adjacent agreement percentages were calculated…

Descriptors: Interrater Reliability, Teacher Evaluation, Statistical Analysis, Early Childhood Teachers

Practices in Instrument Use and Development in "Chemistry Education Research and Practice" 2010-2021

Peer reviewed

Lazenby, Katherine; Tenney, Kristin; Marcroft, Tina A.; Komperda, Regis – Chemistry Education Research and Practice, 2023

Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, etc.) of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the…

Descriptors: Chemistry, Periodicals, Journal Articles, Science Education

Assessment of Interrater and Intermethod Agreement in the Kinesiology Literature

Peer reviewed

Looney, Marilyn A. – Measurement in Physical Education and Exercise Science, 2018

The purpose of this article was two-fold: (1) to provide an overview of the commonly reported and under-reported absolute agreement indices in the kinesiology literature for continuous data; and (2) to present examples of these indices for hypothetical data along with recommendations for future use. It is recommended that three types of information be…

Descriptors: Interrater Reliability, Evaluation Methods, Kinetics, Indexes

A Systematic Review of Methods for Evaluating Rating Quality in Language Assessment

Peer reviewed

Wind, Stefanie A.; Peterson, Meghan E. – Language Testing, 2018

The use of assessments that require rater judgment (i.e., rater-mediated assessments) has become increasingly popular in high-stakes language assessments worldwide. Using a systematic literature review, the purpose of this study is to identify and explore the dominant methods for evaluating rating quality within the context of research on…

Descriptors: Language Tests, Evaluators, Evaluation Methods, Interrater Reliability

An Unbiased Estimate of Global Interrater Agreement

Peer reviewed

Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2017

Assessing global interrater agreement is difficult as most published indices are affected by the presence of mixtures of agreements and disagreements. A previously proposed method was shown to be specifically sensitive to global agreement, excluding mixtures, but also negatively biased. Here, we propose two alternatives in an attempt to find what…

Descriptors: Interrater Reliability, Evaluation Methods, Statistical Bias, Accuracy

Appraising the Scoring Performance of Automated Essay Scoring Systems--Some Additional Considerations: Which Essays? Which Human Raters? Which Scores?

Peer reviewed

Raczynski, Kevin; Cohen, Allan – Applied Measurement in Education, 2018

The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are…

Descriptors: Essay Tests, Test Scoring Machines, Test Validity, Evaluators

The Counseling Competencies Scale: Validation and Refinement

Peer reviewed

Lambie, Glenn W.; Mullen, Patrick R.; Swank, Jacqueline M.; Blount, Ashley – Measurement and Evaluation in Counseling and Development, 2018

Supervisors evaluated counselors-in-training at multiple points during their practicum experience using the Counseling Competencies Scale (CCS; N = 1,070). The CCS evaluations were randomly split to conduct exploratory factor analysis and confirmatory factor analysis, resulting in a 2-factor model (61.5% of the variance explained).

Descriptors: Counselor Training, Counseling, Measures (Individuals), Competence

Interrater Agreement Evaluation: A Latent Variable Modeling Approach

Peer reviewed

Raykov, Tenko; Dimitrov, Dimiter M.; von Eye, Alexander; Marcoulides, George A. – Educational and Psychological Measurement, 2013

A latent variable modeling method for evaluation of interrater agreement is outlined. The procedure is useful for point and interval estimation of the degree of agreement among a given set of judges evaluating a group of targets. In addition, the approach allows one to test for identity in underlying thresholds across raters as well as to identify…

Descriptors: Interrater Reliability, Models, Statistical Analysis, Computation

Functional Adequacy in L2 Writing: Towards a New Rating Scale

Peer reviewed

Kuiken, Folkert; Vedder, Ineke – Language Testing, 2017

The importance of functional adequacy as an essential component of L2 proficiency has been observed by several authors (Pallotti, 2009; De Jong, Steinel, Florijn, Schoonen, & Hulstijn, 2012a, b). The rationale underlying the present study is that the assessment of writing proficiency in L2 is not fully possible without taking into account the…

Descriptors: Second Language Learning, Rating Scales, Computational Linguistics, Persuasive Discourse

Examining the Reliability of Scores from the Consensual Assessment Technique in the Measurement of Individual and Small Group Creativity

Peer reviewed

Stefanic, Nicholas; Randles, Clint – Music Education Research, 2015

The purpose of this study was to explore the reliability of measures of both individual and group creative work using the consensual assessment technique (CAT). CAT was used to measure individual and group creativity among a population of pre-service music teachers enrolled in a secondary general music class (n = 23) and was evaluated from…

Descriptors: Music Education, Creativity, Preservice Teachers, Music Teachers

Does the Brief Observation of Social Communication Change Help Moving Forward in Measuring Change in Early Autism Intervention Studies?

Peer reviewed

Pijl, Mirjam K. J.; Rommelse, Nanda N. J.; Hendriks, Monica; De Korte, Manon W. P.; Buitelaar, Jan K.; Oosterling, Iris J. – Autism: The International Journal of Research and Practice, 2018

The field of early autism research is in dire need of outcome measures that adequately reflect subtle changes in core autistic behaviors. This article compares the ability of a newly developed measure, the Brief Observation of Social Communication Change (BOSCC), and the Autism Diagnostic Observation Schedule (ADOS) to detect changes in core…

Descriptors: Intervention, Autism, Interpersonal Communication, Interrater Reliability

Practicalities of Using a Modified Version of the Cochrane Collaboration Risk of Bias Tool for Randomised and Non-Randomised Study Designs Applied in a Health Technology Assessment Setting

Peer reviewed

Robertson, Clare; Ramsay, Craig; Gurung, Tara; Mowatt, Graham; Pickard, Robert; Sharma, Pawana – Research Synthesis Methods, 2014

We describe our experience of using a modified version of the Cochrane risk of bias (RoB) tool for randomised and non-randomised comparative studies. Objectives: (1) To assess time to complete RoB assessment; (2) To assess inter-rater agreement; and (3) To explore the association between RoB and treatment effect size. Methods: Cochrane risk of…

Descriptors: Risk, Randomized Controlled Trials, Research Design, Comparative Analysis

Procedural Influence on Internal and External Assessment Scores of Undergraduate Vocational and Technical Education Research Projects in Nigerian Universities

Peer reviewed
PDF on ERIC

A. C., John; Manabete, S. S. – Journal of Education and Practice, 2015

This study sought to determine the procedural influence on internal and external assessment scores of undergraduate research projects in vocational and technical education programmes in the university under study. A survey research design was used for the conduct of this study. The population consisted of 130 lecturers and 1,847 students in the…

Descriptors: Foreign Countries, Undergraduate Students, Student Research, Research Projects

Validation of Assessment Vignettes and Scoring Rubric of Multicultural and International Competency in Faculty Teaching

Peer reviewed

Henderson, Sheila J.; Horton, Ruth A.; Saito, Paul K.; Shorter-Gooden, Kumea – Multicultural Learning and Teaching, 2016

The purpose of this research was to develop a new tool for assessing multicultural and international competency in faculty teaching through vignette scenarios of university classroom critical incidents--across disciplines of clinical and forensics psychology, business, and education. Construct and content validity of the initial draft vignettes…

Descriptors: Scoring Rubrics, Critical Incidents Method, Construct Validity, Content Validity
