Showing 1 to 15 of 23 results
Akif Avcu – Malaysian Online Journal of Educational Technology, 2025
This scoping review traces the milestones of how Hierarchical Rater Models (HRMs) became operable for use in automated essay scoring (AES) to improve instructional evaluation. Although essay evaluation--a useful instrument for assessing higher-order cognitive abilities--has always depended on human raters, concerns regarding rater bias,…
Descriptors: Automation, Scoring, Models, Educational Assessment
Peer reviewed
Direct link
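For readers unfamiliar with HRMs: one standard formulation of the model's rating stage (following Patz et al., 2002, not taken from this review) treats rater r's observed category as a discretized normal centered on the response's ideal category, shifted by a rater bias parameter:

    P(Y_{ir} = k \mid \xi_i) \propto \exp\!\left( -\frac{[k - (\xi_i + \phi_r)]^2}{2\psi_r^2} \right)

where \phi_r captures rater r's severity or leniency and \psi_r the rater's variability; a smaller \psi_r indicates a more accurate rater.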
Boris Forthmann; Benjamin Goecke; Roger E. Beaty – Creativity Research Journal, 2025
Human ratings are ubiquitous in creativity research. Yet the process of rating responses to creativity tasks -- typically several hundred or thousands of responses per rater -- is often time-consuming and expensive. Planned missing data designs, where raters rate only a subset of the total number of responses, have recently been proposed as one…
Descriptors: Creativity, Research, Researchers, Research Methodology
Peer reviewed
Direct link
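To make the planned missing data design in the Forthmann, Goecke, and Beaty abstract concrete, here is a minimal sketch (ours, with hypothetical sizes, not the authors' design): each response is scored by only k of R raters, so each rater's workload shrinks to roughly N*k/R responses.

    import numpy as np

    # Hypothetical sizes: N responses, R raters, k ratings per response.
    N, R, k = 600, 6, 2
    rng = np.random.default_rng(42)

    # assignment[i, r] is True when rater r scores response i.
    assignment = np.zeros((N, R), dtype=bool)
    for i in range(N):
        raters = rng.choice(R, size=k, replace=False)  # k distinct raters
        assignment[i, raters] = True

    print(assignment.sum(axis=0))  # ratings per rater, roughly N*k/R = 200 each
    print(assignment.sum(axis=1))  # ratings per response, always k

Because the unrated cells are missing by random assignment rather than for substantive reasons, standard models can still recover rater and response parameters from the subset.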
Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025
While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…
Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity
Peer reviewed
Direct link
Wang, Jue; Engelhard, George; Combs, Trenton – Journal of Experimental Education, 2023
Unfolding models are frequently used to develop scales for measuring attitudes. Recently, unfolding models have been applied to examine rater severity and accuracy within the context of rater-mediated assessments. One of the problems in applying unfolding models to rater-mediated assessments is that the substantive interpretations of the latent…
Descriptors: Writing Evaluation, Scoring, Accuracy, Computational Linguistics
Peer reviewed
Direct link
Song, Yoon Ah; Lee, Won-Chan – Applied Measurement in Education, 2022
This article examines the performance of item response theory (IRT) models when double ratings, rather than single ratings, are used as item scores in the presence of rater effects. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation in the generalized partial credit model (GPCM). Study 2 compared the accuracy of…
Descriptors: Item Response Theory, Item Analysis, Scores, Accuracy
Peer reviewed
PDF on ERIC (full text)
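For context on the Song and Lee abstract, the GPCM's category probability is conventionally written as (a textbook formulation, not quoted from the article):

    P(X_{ij} = k \mid \theta_i)
      = \frac{\exp \sum_{v=0}^{k} a_j (\theta_i - b_{jv})}
             {\sum_{c=0}^{m_j} \exp \sum_{v=0}^{c} a_j (\theta_i - b_{jv})},
      \qquad b_{j0} \equiv 0,

where a_j is the discrimination of item j and the b_{jv} are its step difficulties. With double ratings, one common setup treats each rater's score as a separate item-level observation, so rater error enters the likelihood twice rather than once.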
Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023
Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In existing research on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…
Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy
Peer reviewed
Direct link
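The human-automated agreement benchmark that Doewes, Kurdhi, and Saxena refer to is most often operationalized as quadratically weighted kappa (the standard AES agreement statistic; the article's exact choice of measures may differ):

    \kappa_w = 1 - \frac{\sum_{i,j} w_{ij} O_{ij}}{\sum_{i,j} w_{ij} E_{ij}},
    \qquad w_{ij} = \frac{(i - j)^2}{(C - 1)^2},

where O_{ij} counts essays scored i by the human and j by the machine, E_{ij} is the count expected by chance from the two marginal score distributions, and C is the number of score points; \kappa_w = 1 indicates perfect agreement and 0 chance-level agreement.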
Nakayama, Minoru; Sciarrone, Filippo; Temperini, Marco; Uto, Masaki – International Journal of Distance Education Technologies, 2022
Massive open online courses (MOOCs) are effective and flexible resources to educate, train, and empower populations. Peer assessment (PA) provides a powerful pedagogical strategy to support educational activities and foster learners' success, even when a huge number of learners is involved. Item response theory (IRT) can model students'…
Descriptors: Item Response Theory, Peer Evaluation, MOOCs, Models
Peer reviewed
Direct link
Denis Dumas; James C. Kaufman – Educational Psychology Review, 2024
Who should evaluate the originality and task-appropriateness of a given idea has been a perennial debate among psychologists of creativity. Here, we argue that the most relevant evaluator of a given idea depends crucially on the level of expertise of the person who generated it. To build this argument, we draw on two complementary theoretical…
Descriptors: Decision Making, Creativity, Task Analysis, Psychologists
Renato Britto Ferreira – ProQuest LLC, 2024
Modern-day educators often share a sense that they lack voice and agency with school administration. A classic example is curriculum development, where third-party designers develop uncontextualized curricula, and teachers must then implement the design even if it is inefficient and ineffective. What happens if this identical situation occurs at the program…
Descriptors: Art Education, School Administration, Curriculum Design, Program Design
Peer reviewed
PDF on ERIC (full text)
Zhang, Mengxue; Heffernan, Neil; Lan, Andrew – International Educational Data Mining Society, 2023
Automated scoring of student responses to open-ended questions, including short-answer questions, has great potential to scale to a large number of responses. Recent approaches for automated scoring rely on supervised learning, i.e., training classifiers or fine-tuning language models on a small number of responses with human-provided score…
Descriptors: Scoring, Computer Assisted Testing, Mathematics Instruction, Mathematics Tests
Peer reviewed
Direct link
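As a concrete illustration of the supervised-learning baseline Zhang, Heffernan, and Lan describe -- training a classifier on a small set of human-scored responses -- here is a minimal sketch with hypothetical toy data (not the authors' system):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Hypothetical short-answer responses with human-provided scores 0-2.
    responses = [
        "slope is rise over run",
        "i do not know",
        "slope = (y2 - y1) / (x2 - x1)",
    ]
    scores = [1, 0, 2]

    # Bag-of-words features plus a linear classifier stand in for the
    # fine-tuned language models used in recent work.
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(responses, scores)

    print(model.predict(["slope is change in y over change in x"]))

The dependence on human-provided score labels is the catch: with only a handful of labeled responses per question, such classifiers can generalize poorly.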
Hung, Su-Pin; Huang, Hung-Yu – Journal of Educational and Behavioral Statistics, 2022
To address response style or bias in rating scales, forced-choice items are often used to request that respondents rank their attitudes or preferences among a limited set of options. The rating scales used by raters to render judgments on ratees' performance also contribute to rater bias or errors; consequently, forced-choice items have recently…
Descriptors: Evaluation Methods, Rating Scales, Item Analysis, Preferences
Peer reviewed
Direct link
Garman, Andrew N.; Erwin, Taylor S.; Garman, Tyler R.; Kim, Dae Hyun – Journal of Competency-Based Education, 2021
Background: Competency models provide useful frameworks for organizing learning and assessment programs, but their construction is both time intensive and subject to perceptual biases. Some aspects of model development may be particularly well-suited to automation, specifically natural language processing (NLP), which could also help make them…
Descriptors: Natural Language Processing, Automation, Guidelines, Leadership Effectiveness
Peer reviewed
Direct link
Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021
Automated essay scoring (AES) has emerged as a secondary or sole marker for many high-stakes educational assessments, in both native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep neural algorithms. The purpose of this study is to compare the effectiveness…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Peer reviewed
PDF on ERIC (full text)
Bradley-Levine, Jill – International Journal of Teacher Leadership, 2022
This article shares the findings of a qualitative case study examining the experiences of teacher leaders as they engaged as teacher evaluators alongside school principals. Data collection included observations and interviews with four teacher leaders and three school principals to answer these research questions: (1) What TDEM structures have…
Descriptors: Teacher Leadership, Teacher Evaluation, Teacher Participation, Principals
Peer reviewed
PDF on ERIC (full text)
Lluch Molins, Laia; Cano García, Elena – Journal of New Approaches in Educational Research, 2023
One of the main generic competencies in Higher Education is "Learning to Learn". The key component of this competence is the capacity for self-regulated learning (SRL). For this competence to be developed, peer feedback seems useful because it fosters evaluative judgement. Following the principles of peer feedback processes, an online…
Descriptors: Learning Analytics, Learning Management Systems, Peer Evaluation, Higher Education