Showing 1 to 15 of 70 results
Peer reviewed
Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system: multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
Lambert, Richard G.; Holcomb, T. Scott; Bottoms, Bryndle L. – Center for Educational Measurement and Evaluation, 2021
The validity of the Kappa coefficient of chance-corrected agreement has been questioned when the prevalence of specific rating scale categories is low and agreement between raters is high. The researchers proposed the Lambda Coefficient of Rater-Mediated Agreement as an alternative to Kappa to address these concerns. Lambda corrects for chance…
Descriptors: Interrater Reliability, Teacher Evaluation, Test Validity, Evaluation Methods
Peer reviewed
Chao Han; Binghan Zheng; Mingqing Xie; Shirong Chen – Interpreter and Translator Trainer, 2024
Human raters' assessment of interpreting is a complex process. Previous researchers have mainly relied on verbal reports to examine this process. To advance our understanding, we conducted an empirical study, collecting raters' eye-movement and retrospection data in a computerised interpreting assessment in which three groups of raters (n = 35)…
Descriptors: Foreign Countries, College Students, College Graduates, Interrater Reliability
Peer reviewed
Ole J. Kemi – Advances in Physiology Education, 2025
Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session handling and analysis by a board of examiners. This occurs annually and is the basis for evaluating students, but also the wider learning and teaching efficiency of an academic institution.…
Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards
Peer reviewed
Swapneel Thite; Jayashri Ravishankar; Inmaculada Tomeo-Reyes; Araceli Martinez Ortiz – European Journal of Engineering Education, 2024
Effectively working in an engineering workplace requires strong teamwork skills, yet the existing literature within various disciplines reveals discrepancies in evaluating these skills. This complicates the design of a generic teamwork peer evaluation tool for engineering students. This study aims to address this gap by introducing the DRIVE…
Descriptors: Scoring Rubrics, Evaluation Methods, Peer Evaluation, Teamwork
Peer reviewed
Janice Kinghorn; Katherine McGuire; Bethany L. Miller; Aaron Zimmerman – Assessment Update, 2024
In this article, the authors share their reflections on how different experiences and paradigms have broadened their understanding of the work of assessment in higher education. As they collaborated to create a panel for the 2024 International Conference on Assessing Quality in Higher Education, they recognized that they, as assessment…
Descriptors: Higher Education, Assessment Literacy, Evaluation Criteria, Evaluation Methods
Peer reviewed
Lazenby, Katherine; Tenney, Kristin; Marcroft, Tina A.; Komperda, Regis – Chemistry Education Research and Practice, 2023
Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, "etc.") of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the…
Descriptors: Chemistry, Periodicals, Journal Articles, Science Education
Baraldi Cunha, Andrea; Babik, Iryna; Koziol, Natalie A.; Hsu, Lin-Ya; Nord, Jayden; Harbourne, Regina T.; Westcott-McCoy, Sarah; Dusing, Stacey C.; Bovaird, James A.; Lobo, Michele A. – Grantee Submission, 2021
Purpose: To evaluate the validity, reliability, and sensitivity of the novel Means-End Problem-Solving Assessment Tool (MEPSAT). Methods: Children with typical development and those with motor delay were assessed throughout the first 2 years of life using the MEPSAT. MEPSAT scores were validated against the cognitive and motor subscales of the…
Descriptors: Problem Solving, Early Intervention, Evaluation Methods, Motor Development
Peer reviewed
Doosti, Mehdi; Ahmadi Safa, Mohammad – International Journal of Language Testing, 2021
This study examined the effect of rater training on promoting inter-rater reliability in oral language assessment. It also investigated whether rater training and the consideration of the examinees' expectations by the examiners have any effect on test-takers' perceptions of being fairly evaluated. To this end, four raters scored 31 Iranian…
Descriptors: Oral Language, Language Tests, Interrater Reliability, Training
Peer reviewed
Tavares, Walter; Brydges, Ryan; Myre, Paul; Prpic, Jason; Turner, Linda; Yelle, Richard; Huiskamp, Maud – Advances in Health Sciences Education, 2018
Assessment of clinical competence is complex and inference based. Trustworthy and defensible assessment processes must have favourable evidence of validity, particularly where decisions are considered high stakes. We aimed to organize, collect and interpret validity evidence for a high stakes simulation based assessment strategy for certifying…
Descriptors: Competence, Simulation, Allied Health Personnel, Certification
Peer reviewed
Raczynski, Kevin; Cohen, Allan – Applied Measurement in Education, 2018
The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are…
Descriptors: Essay Tests, Test Scoring Machines, Test Validity, Evaluators
Peer reviewed
Richer, Amanda; Charmaraman, Linda; Ceder, Ineke – Afterschool Matters, 2018
Like instruments used in afterschool programs to assess children's social and emotional growth or to evaluate staff members' performance, instruments used to evaluate program quality should be free from bias. Practitioners and researchers alike want to know that assessment instruments, whatever their type or intent, treat all people fairly and do…
Descriptors: Cultural Differences, Social Bias, Interrater Reliability, Program Evaluation
Dockterman, Daniel Milo – ProQuest LLC, 2017
Student surveys have gained prominence in recent years as a way to give students a voice in their learning process, and teacher self-reports have long been an effective instrument for revealing the planning, intentions, and expectations behind a given lesson. Though student and teacher surveys are widely used, extant research in education has…
Descriptors: Outcome Measures, Teacher Evaluation, Student Evaluation of Teacher Performance, Evaluation Methods
Peer reviewed
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
Peer reviewed
Gargani, John; Strong, Michael – Journal of Teacher Education, 2015
In Gargani and Strong (2014), we describe The Rapid Assessment of Teacher Effectiveness (RATE), a new teacher evaluation instrument. Our account of the validation research associated with RATE inspired a review by Good and Lavigne (2015). Here, we reply to the main points of their review. We elaborate on the validity, reliability, theoretical…
Descriptors: Evidence, Teacher Effectiveness, Teacher Evaluation, Evaluation Methods