Showing 1 to 15 of 144 results
Peer reviewed
Lottridge, Susan; Woolf, Sherri; Young, Mackenzie; Jafari, Amir; Ormerod, Chris – Journal of Computer Assisted Learning, 2023
Background: Deep learning methods, where models do not use explicit features and instead rely on implicit features estimated during model training, suffer from an explainability problem. In text classification, saliency maps that reflect the importance of words in prediction are one approach toward explainability. However, little is known about…
Descriptors: Documentation, Learning Strategies, Models, Prediction
Feldberg, Zachary R. – ProQuest LLC, 2023
Cognitive diagnostic models (CDMs) provide pedagogically relevant information in the form of a student profile of multiple binary categorizations of students into mastery or nonmastery statuses on latent traits called attributes. Federal educational accountability requires accountability measures to designate students into one of at least three…
Descriptors: Accountability, Standards, Cutting Scores, Models
Peer reviewed
Gilstrap, Donald L.; Whitver, Sara Maurice; Scalfani, Vincent F.; Bray, Nathaniel J. – Innovative Higher Education, 2023
This article explores how well bibliometrics and altmetrics reflect research impact in relation to Boyer's Model of Scholarship. Indices used for both types of metrics are explored and discussed, including an analysis of the primary methodological works performed on each in the literature to date. As confirmatory in nature, we chose as our…
Descriptors: Bibliometrics, Models, Scholarship, Research
Peer reviewed
Lamprianou, Iasonas – Sociological Methods & Research, 2023
This study investigates inter- and intracoder reliability, proposing a new approach based on social network analysis (SNA) and exponential random graph models (ERGM). During a recent exit poll, the responses of voters to two open-ended questions were recorded. A coding experiment was conducted where a group of coders coded a sample of text…
Descriptors: Interrater Reliability, Coding, Social Networks, Network Analysis
Peer reviewed
PDF on ERIC
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Peer reviewed
PDF on ERIC
Lulu Desia Mutiani Rahmayuni; Siti Sriyati; Diah Kusumawaty – Journal of Biological Education Indonesia (Jurnal Pendidikan Biologi Indonesia), 2024
The Business Model Canvas (BMC) is a business modeling tool that students in the Bioentrepreneurship course must master as an initial foundation for entering the entrepreneurial world, and compiling a BMC requires systems thinking skills. This study aims to provide an assessment instrument to measure students' system…
Descriptors: Systems Approach, Thinking Skills, Models, Business Administration
Nnamdi Chika Ezike – ProQuest LLC, 2022
Fitting incorrectly specified models to observed data may lead to invalid inferences about the model parameters of interest. The current study investigated the performance of the posterior predictive model checking (PPMC) approach in detecting model-data misfit of the hierarchical rater model (HRM). The HRM is a rater-mediated model that incorporates…
Descriptors: Prediction, Models, Interrater Reliability, Item Response Theory
Peer reviewed
PDF on ERIC
Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023
Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In the existing research work on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…
Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy
Peer reviewed
PDF on ERIC
Jönsson, Anders; Balan, Andreia – Practical Assessment, Research & Evaluation, 2018
Research on teachers' grading has shown that there is great variability among teachers regarding both the process and product of grading, resulting in low comparability and issues of inequality when using grades for selection purposes. Despite this situation, not much is known about the merits or disadvantages of different models for grading. In…
Descriptors: Grading, Models, Reliability, Validity
Peer reviewed
PDF on ERIC
Purwadi; Saputra, Wahyu N. E.; Handaka, Irvan B.; Barida, Muya; Wahyudi, Amien; Widyastuti, Dian A.; Agungbudiprabowo; Rodhiya, Zaenab A. – Pegem Journal of Education and Instruction, 2022
This study aims to identify the acceptability and effectiveness of peace guidance based on the perspective of Markesot, a model that seeks to reduce student aggressiveness. The study follows research and development stages adapted from the Borg & Gall model. The participants were 275 randomly selected students. The study…
Descriptors: Peace, Guidance, Models, Interrater Reliability
Peer reviewed
Wesolowski, Brian C.; Wind, Stefanie A. – Journal of Educational Measurement, 2019
Rater-mediated assessments are a common methodology for measuring persons, investigating rater behavior, and/or defining latent constructs. The purpose of this article is to provide a pedagogical framework for examining rater variability in the context of rater-mediated assessments using three distinct models. The first model is the observation…
Descriptors: Interrater Reliability, Models, Observation, Measurement
Peer reviewed
Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2018
The Rasch facets model was developed to account for facet data, such as student essays graded by raters, but it accounts for only one kind of rater effect (severity). In practice, raters may exhibit various tendencies such as using middle or extreme scores in their ratings, which is referred to as the rater centrality/extremity response style. To…
Descriptors: Scoring, Models, Interrater Reliability, Computation
Peer reviewed
Nieto, Ricardo; Casabianca, Jodi M. – Journal of Educational Measurement, 2019
Many large-scale assessments are designed to yield two or more scores for an individual by administering multiple sections measuring different but related skills. Multidimensional tests, or more specifically, simple structured tests such as these, rely on multiple multiple-choice and/or constructed-response sections of items to generate multiple…
Descriptors: Tests, Scoring, Responses, Test Items
Peer reviewed
Schack, Edna O.; Dueber, David; Thomas, Jonathan Norris; Fisher, Molly H.; Jong, Cindy – AERA Online Paper Repository, 2019
Scoring of teachers' noticing responses is typically burdened with rater bias and reliance on interrater consensus. The authors sought to make the scoring process more objective, equitable, and generalizable. The development process began with a description of response characteristics for each professional noticing component, disconnected from…
Descriptors: Models, Teacher Evaluation, Observation, Bias
Peer reviewed
Ziegler, Wolfram; Staiger, Anja; Schölderle, Theresa; Vogel, Mathias – Journal of Speech, Language, and Hearing Research, 2017
Purpose: Standardized clinical assessment of dysarthria is essential for management and research. We present a new, fully standardized dysarthria assessment, the Bogenhausen Dysarthria Scales (BoDyS). The measurement model of the BoDyS is based on auditory evaluations of connected speech using 9 scales (traits) assessed by 4 elicitation methods.…
Descriptors: Auditory Evaluation, Test Reliability, Test Validity, Rating Scales