ERIC - Search Results

Showing 1 to 15 of 90 results

Technical Adequacy-Reliability

Peer reviewed

Susan K. Johnsen – Gifted Child Today, 2025

The author provides information about reliability and areas that educators should examine in determining if an assessment is consistent and trustworthy for use, and how it should be interpreted in making decisions about students. Reliability areas that are discussed in the column include internal consistency, test-retest or stability, inter-scorer…

Descriptors: Test Reliability, Academically Gifted, Student Evaluation, Error of Measurement

Using Automated Procedures to Score Educational Essays Written in Three Languages

Peer reviewed

Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025

The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…

Descriptors: College Students, Slavic Languages, German, Italian

Comparison of the Results of the Generalizability Theory with the Inter-Rater Agreement Coefficients

Peer reviewed
PDF on ERIC

Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022

The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…

Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory

Constructing a Roadmap to Measure the Quality of Business Assessments Aimed at Curriculum Management

Peer reviewed

Silva, Thanuci; Santos, Regiane dos; Mallet, Débora – Journal of Education for Business, 2023

Assuring the quality of education is a concern of learning institutions. To do so, it is necessary to have assertive learning management, with consistent data on students' outcomes. This research provides associate deans and researchers a roadmap with which to gather evidence to improve the quality of open-ended assessments. Based on statistical…

Descriptors: Student Evaluation, Evaluation Methods, Business Education, Higher Education

Evidence-Based Evaluation of Student and Marker Performances in Assessment and Examination

Peer reviewed

Ole J. Kemi – Advances in Physiology Education, 2025

Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…

Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards

Design of a Simple Rubric to Peer-Evaluate the Teamwork Skills of Engineering Students

Peer reviewed

Swapneel Thite; Jayashri Ravishankar; Inmaculada Tomeo-Reyes; Araceli Martinez Ortiz – European Journal of Engineering Education, 2024

Effectively working in an engineering workplace requires strong teamwork skills, yet the existing literature within various disciplines reveals discrepancies in evaluating these skills. This complicates the design of a generic teamwork peer evaluation tool for engineering students. This study aims to address this gap by introducing the DRIVE…

Descriptors: Scoring Rubrics, Evaluation Methods, Peer Evaluation, Teamwork

The Value of Expanding Perspectives on Assessment

Peer reviewed

Janice Kinghorn; Katherine McGuire; Bethany L. Miller; Aaron Zimmerman – Assessment Update, 2024

In this article, the authors share their reflections on how different experiences and paradigms have broadened their understanding of the work of assessment in higher education. As they collaborated to create a panel for the 2024 International Conference on Assessing Quality in Higher Education, they recognized that they, as assessment…

Descriptors: Higher Education, Assessment Literacy, Evaluation Criteria, Evaluation Methods

A Unified Approach to Estimating the Intraclass Correlation Coefficient and Its Bias: An Exploratory Study

Kelvin Terrell Pompey – ProQuest LLC, 2021

Many methods are used to measure interrater reliability for studies where each target receives ratings by a different set of judges. The purpose of this study is to explore the use of hierarchical modeling for estimating interrater reliability using the intraclass correlation coefficient. This study provides a description of how the ICC can be…

Descriptors: Interrater Reliability, Evaluation Methods, Test Reliability, Correlation

Practices in Instrument Use and Development in "Chemistry Education Research and Practice" 2010-2021

Peer reviewed

Lazenby, Katherine; Tenney, Kristin; Marcroft, Tina A.; Komperda, Regis – Chemistry Education Research and Practice, 2023

Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, etc.) of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the…

Descriptors: Chemistry, Periodicals, Journal Articles, Science Education

Exploring Rating Quality in the Context of High-Stakes Rater-Mediated Educational Assessments

Wenjing Guo – ProQuest LLC, 2021

Constructed response (CR) items are widely used in large-scale testing programs, including the National Assessment of Educational Progress (NAEP) and many district and state-level assessments in the United States. One unique feature of CR items is that they depend on human raters to assess the quality of examinees' work. The judgment of human…

Descriptors: National Competency Tests, Responses, Interrater Reliability, Error of Measurement

A Novel Means-End Problem-Solving Assessment Tool for Early Intervention: Evaluation of Validity, Reliability, and Sensitivity

Peer reviewed
PDF on ERIC

Baraldi Cunha, Andrea; Babik, Iryna; Koziol, Natalie A.; Hsu, Lin-Ya; Nord, Jayden; Harbourne, Regina T.; Westcott-McCoy, Sarah; Dusing, Stacey C.; Bovaird, James A.; Lobo, Michele A. – Grantee Submission, 2021

Purpose: To evaluate the validity, reliability, and sensitivity of the novel Means-End Problem-Solving Assessment Tool (MEPSAT). Methods: Children with typical development and those with motor delay were assessed throughout the first 2 years of life using the MEPSAT. MEPSAT scores were validated against the cognitive and motor subscales of the…

Descriptors: Problem Solving, Early Intervention, Evaluation Methods, Motor Development

The Counseling Competencies Scale: Validation and Refinement

Peer reviewed

Lambie, Glenn W.; Mullen, Patrick R.; Swank, Jacqueline M.; Blount, Ashley – Measurement and Evaluation in Counseling and Development, 2018

Supervisors evaluated counselors-in-training at multiple points during their practicum experience using the Counseling Competencies Scale (CCS; N = 1,070). The CCS evaluations were randomly split to conduct exploratory factor analysis and confirmatory factor analysis, resulting in a 2-factor model (61.5% of the variance explained).

Descriptors: Counselor Training, Counseling, Measures (Individuals), Competence

Processes and Procedures for Estimating Score Reliability and Precision

Peer reviewed

Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017

Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…

Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests

Measuring Program Quality, Part 2: Addressing Potential Cultural Bias in a Rater Reliability Exam

Peer reviewed
PDF on ERIC

Richer, Amanda; Charmaraman, Linda; Ceder, Ineke – Afterschool Matters, 2018

Like instruments used in afterschool programs to assess children's social and emotional growth or to evaluate staff members' performance, instruments used to evaluate program quality should be free from bias. Practitioners and researchers alike want to know that assessment instruments, whatever their type or intent, treat all people fairly and do…

Descriptors: Cultural Differences, Social Bias, Interrater Reliability, Program Evaluation

Discrepancies between Students' and Teachers' Ratings of Instructional Practice: A Way to Measure Classroom Intuneness and Evaluate Teaching Quality

Dockterman, Daniel Milo – ProQuest LLC, 2017

Student surveys have gained prominence in recent years as a way to give students a voice in their learning process, and teacher self-reports have always been an effective instrument for revealing the planning, intentions, and expectations behind a given lesson. Though student and teacher surveys are widely used, extant research in education has…

Descriptors: Outcome Measures, Teacher Evaluation, Student Evaluation of Teacher Performance, Evaluation Methods
