ERIC - Search Results

Showing 1 to 15 of 144 results

The Use of Annotations to Explain Labels: Comparing Results from a Human-Rater Approach to a Deep Learning Approach

Peer reviewed

Lottridge, Susan; Woolf, Sherri; Young, Mackenzie; Jafari, Amir; Ormerod, Chris – Journal of Computer Assisted Learning, 2023

Background: Deep learning methods, where models do not use explicit features and instead rely on implicit features estimated during model training, suffer from an explainability problem. In text classification, saliency maps that reflect the importance of words in prediction are one approach toward explainability. However, little is known about…

Descriptors: Documentation, Learning Strategies, Models, Prediction

An Experimental Study of Standard Setting Methods for Diagnostic Profiles

Feldberg, Zachary R. – ProQuest LLC, 2023

Cognitive diagnostic models (CDMs) provide pedagogically relevant information in the form of a student profile of multiple binary categorizations of students into mastery or nonmastery statuses on latent traits called attributes. Federal educational accountability requires accountability measures to designate students into one of at least three…

Descriptors: Accountability, Standards, Cutting Scores, Models

Citation Metrics and Boyer's Model of Scholarship: How Do Bibliometrics and Altmetrics Respond to Research Impact?

Peer reviewed

Gilstrap, Donald L.; Whitver, Sara Maurice; Scalfani, Vincent F.; Bray, Nathaniel J. – Innovative Higher Education, 2023

This article explores how well bibliometrics and altmetrics reflect research impact in relation to Boyer's Model of the Scholarship. Indices used for both types of metrics are explored and discussed while including an analysis on primary methodological works performed on each in the literature to date. As confirmatory in nature, we chose as our…

Descriptors: Bibliometrics, Models, Scholarship, Research

Measuring and Visualizing Coders' Reliability: New Approaches and Guidelines from Experimental Data

Peer reviewed

Lamprianou, Iasonas – Sociological Methods & Research, 2023

This study investigates inter- and intracoder reliability, proposing a new approach based on social network analysis (SNA) and exponential random graph models (ERGM). During a recent exit poll, the responses of voters to two open-ended questions were recorded. A coding experiment was conducted where a group of coders coded a sample of text…

Descriptors: Interrater Reliability, Coding, Social Networks, Network Analysis

The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues

Peer reviewed
PDF on ERIC

Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022

How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…

Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making

Development of Assessment Instrument Business Proposal for Students' Systems Thinking Skills on Business Model Canvas in Bioentrepreneurship Course

Peer reviewed
PDF on ERIC

Lulu Desia Mutiani Rahmayuni; Siti Sriyatib; Diah Kusumawaty – Journal of Biological Education Indonesia (Jurnal Pendidikan Biologi Indonesia), 2024

Business Model Canvas (BMC) is a business model that must be mastered by students in the Bioentrepreneurship course as an initial provision for entering the entrepreneurial world, while in compiling Business Model Canvas (BMC) systematic thinking skills are needed. This study aims to provide an assessment instrument to measure students' system…

Descriptors: Systems Approach, Thinking Skills, Models, Business Administration

Posterior Predictive Model Checking of the Hierarchical Rater Model

Nnamdi Chika Ezike – ProQuest LLC, 2022

Fitting wrongly specified models to observed data may lead to invalid inferences about the model parameters of interest. The current study investigated the performance of the posterior predictive model checking (PPMC) approach in detecting model-data misfit of the hierarchical rater model (HRM). The HRM is a rater-mediated model that incorporates…

Descriptors: Prediction, Models, Interrater Reliability, Item Response Theory

Evaluating Quadratic Weighted Kappa as the Standard Performance Metric for Automated Essay Scoring

Peer reviewed
PDF on ERIC

Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023

Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In the existing research work on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…

Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy

Analytic or Holistic: A Study of Agreement between Different Grading Models

Peer reviewed
PDF on ERIC

Jönsson, Anders; Balan, Andreia – Practical Assessment, Research & Evaluation, 2018

Research on teachers' grading has shown that there is great variability among teachers regarding both the process and product of grading, resulting in low comparability and issues of inequality when using grades for selection purposes. Despite this situation, not much is known about the merits or disadvantages of different models for grading. In…

Descriptors: Grading, Models, Reliability, Validity

Peace Guidance Based on the Perspective of "Markesot": Acceptability and Effectiveness of Reducing Student Aggressiveness

Peer reviewed
PDF on ERIC

Purwadi; Saputra, Wahyu N. E.; Handaka, Irvan B.; Barida, Muya; Wahyudi, Amien; Widyastuti, Dian A.; Agungbudiprabowo; Rodhiya, Zaenab A. – Pegem Journal of Education and Instruction, 2022

This study aims to identify the acceptability and effectiveness of peace guidance based on the perspective of Markesot. This model seeks to reduce student aggressiveness. This study uses the research and development stages by adapting the Borg & Gall model. The participants of this study were 275 students who were taken randomly. The study…

Descriptors: Peace, Guidance, Models, Interrater Reliability

Pedagogical Considerations for Examining Rater Variability in Rater-Mediated Assessments: A Three-Model Framework

Peer reviewed

Wesolowski, Brian C.; Wind, Stefanie A. – Journal of Educational Measurement, 2019

Rater-mediated assessments are a common methodology for measuring persons, investigating rater behavior, and/or defining latent constructs. The purpose of this article is to provide a pedagogical framework for examining rater variability in the context of rater-mediated assessments using three distinct models. The first model is the observation…

Descriptors: Interrater Reliability, Models, Observation, Measurement

A New Facets Model for Rater's Centrality/Extremity Response Style

Peer reviewed

Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2018

The Rasch facets model was developed to account for facet data, such as student essays graded by raters, but it accounts for only one kind of rater effect (severity). In practice, raters may exhibit various tendencies such as using middle or extreme scores in their ratings, which is referred to as the rater centrality/extremity response style. To…

Descriptors: Scoring, Models, Interrater Reliability, Computation

Accounting for Rater Effects with the Hierarchical Rater Model Framework When Scoring Simple Structured Constructed Response Tests

Peer reviewed

Nieto, Ricardo; Casabianca, Jodi M. – Journal of Educational Measurement, 2019

Many large-scale assessments are designed to yield two or more scores for an individual by administering multiple sections measuring different but related skills. Multidimensional tests, or more specifically, simple structured tests, such as these rely on multiple multiple-choice and/or constructed responses sections of items to generate multiple…

Descriptors: Tests, Scoring, Responses, Test Items

Computer-Programmed Decision Trees for Assessing Teacher Noticing

Peer reviewed

Schack, Edna O.; Dueber, David; Thomas, Jonathan Norris; Fisher, Molly H.; Jong, Cindy – AERA Online Paper Repository, 2019

Scoring of teachers' noticing responses is typically burdened with rater bias and reliance upon interrater consensus. The authors sought to make the scoring process more objective, equitable, and generalizable. The development process began with a description of response characteristics for each professional noticing component disconnected from…

Descriptors: Models, Teacher Evaluation, Observation, Bias

Gauging the Auditory Dimensions of Dysarthric Impairment: Reliability and Construct Validity of the Bogenhausen Dysarthria Scales (BoDyS)

Peer reviewed

Ziegler, Wolfram; Staiger, Anja; Schölderle, Theresa; Vogel, Mathias – Journal of Speech, Language, and Hearing Research, 2017

Purpose: Standardized clinical assessment of dysarthria is essential for management and research. We present a new, fully standardized dysarthria assessment, the Bogenhausen Dysarthria Scales (BoDyS). The measurement model of the BoDyS is based on auditory evaluations of connected speech using 9 scales (traits) assessed by 4 elicitation methods.…

Descriptors: Auditory Evaluation, Test Reliability, Test Validity, Rating Scales
