Showing 1 to 15 of 276 results
Peer reviewed
Wendy Chan – Asia Pacific Education Review, 2024
As evidence from evaluation and experimental studies continues to influence decision making and policymaking, applied researchers and practitioners require tools to derive valid and credible inferences. Over the past several decades, research in causal inference has progressed with the development and application of propensity scores. Since their…
Descriptors: Probability, Scores, Causal Models, Statistical Inference
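The propensity-score workflow referenced in the abstract above generally has two steps: model the probability of treatment from observed covariates, then use the estimated probabilities for matching or weighting. As a minimal sketch of that idea (not code from the cited article), the following assumes a logistic-regression propensity model and simple inverse-probability weighting on simulated data; all variable names are hypothetical.

```python
# Minimal propensity-score sketch (illustrative only; not from the cited article).
# Assumes a binary treatment indicator and a covariate matrix; names are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                                # observed covariates
treated = (X[:, 0] + rng.normal(size=500) > 0).astype(int)   # binary treatment
y = 2.0 * treated + X[:, 1] + rng.normal(size=500)           # outcome

# 1. Estimate propensity scores: P(treated = 1 | X).
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# 2. Inverse-probability weights for an average-treatment-effect estimand.
w = np.where(treated == 1, 1.0 / ps, 1.0 / (1.0 - ps))

# 3. Weighted difference in mean outcomes as a simple IPW effect estimate.
ate = (np.average(y[treated == 1], weights=w[treated == 1])
       - np.average(y[treated == 0], weights=w[treated == 0]))
print(f"IPW estimate of the average treatment effect: {ate:.2f}")
```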
Peer reviewed
Van Meenen, Florence; Coertjens, Liesje; Van Nes, Marie-Claire; Verschuren, Franck – Advances in Health Sciences Education, 2022
The present study explores two rating methods for peer assessment (analytical rating using criteria and comparative judgement) in light of concurrent validity, reliability and insufficient diagnosticity (i.e. the degree to which substandard work is recognised by the peer raters). During a second-year undergraduate course, students wrote a one-page…
Descriptors: Evaluation Methods, Peer Evaluation, Accuracy, Evaluation Criteria
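Comparative judgement, one of the two rating methods compared above, typically aggregates many pairwise "which piece of work is better?" decisions into quality estimates, often with a Bradley-Terry-type model. The sketch below illustrates that general idea with a simple minorization-maximization update on toy data; it is an assumption about the usual approach, not the procedure used in the cited study.

```python
# Illustrative Bradley-Terry scaling of pairwise comparative judgements
# (a common approach in comparative judgement; not taken from the cited study).
import numpy as np

n_items = 4
# wins[i, j] = number of times item i was judged better than item j (toy data).
wins = np.array([[0, 3, 4, 5],
                 [1, 0, 3, 4],
                 [0, 1, 0, 3],
                 [0, 0, 1, 0]], dtype=float)

strength = np.ones(n_items)                # initial quality estimates
for _ in range(200):                       # simple MM iterations
    new = np.empty(n_items)
    for i in range(n_items):
        comparisons = wins[i] + wins[:, i]                       # times i met each opponent
        denom = np.sum(comparisons / (strength[i] + strength))   # the j = i term is zero
        new[i] = wins[i].sum() / denom
    strength = new / new.sum()             # normalize to keep the scale fixed
print(np.round(strength, 3))               # estimated relative quality of each item
```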
Peer reviewed
Hyemin Yoon; HyunJin Kim; Sangjin Kim – Measurement: Interdisciplinary Research and Perspectives, 2024
Customer grade systems, which are applied to customers with excellent performance through customer segmentation, have been maintained for years. Currently, financial institutions that operate a customer grade system provide similar services based on score calculation criteria, but those criteria vary from the financial…
Descriptors: Classification, Artificial Intelligence, Prediction, Decision Making
Peer reviewed
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although an extensive body of research exists on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
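Categorical subscores, the focus of the entry above, are commonly formed by mapping numeric subscores onto a small set of reporting categories via cut scores. A minimal sketch of that mapping follows; the cut points and category labels are hypothetical.

```python
# Hypothetical mapping of numeric subscores to reporting categories
# (illustrates the general idea of categorical subscores, not the cited paper's method).
import numpy as np

subscores = np.array([4, 7, 11, 14, 9])   # raw subscores on a 0-15 subscale
cuts = [6, 10, 13]                         # hypothetical cut scores
labels = ["Below Basic", "Basic", "Proficient", "Advanced"]

categories = [labels[np.searchsorted(cuts, s, side="right")] for s in subscores]
print(list(zip(subscores.tolist(), categories)))
# [(4, 'Below Basic'), (7, 'Basic'), (11, 'Proficient'), (14, 'Advanced'), (9, 'Basic')]
```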
Peer reviewed
Cleary, Timothy J.; Slemp, Jackie; Reddy, Linda A.; Alperin, Alexander; Lui, Angela; Austin, Amanda; Cedar, Tori – School Psychology Review, 2023
The primary purpose of this study was to systematically review the literature regarding the characteristics, use, and implementation of an emerging assessment methodology, "SRL microanalysis." Forty-two studies across diverse samples, contexts, and research methodologies met inclusion criteria. The majority of studies used microanalysis…
Descriptors: School Psychology, School Psychologists, Evaluation Methods, Metacognition
Peer reviewed
Manuel T. Rein; Jeroen K. Vermunt; Kim De Roover; Leonie V. D. E. Vogelsmeier – Structural Equation Modeling: A Multidisciplinary Journal, 2025
Researchers often study dynamic processes of latent variables in everyday life, such as the interplay of positive and negative affect over time. An intuitive approach is to first estimate the measurement model of the latent variables, then compute factor scores, and finally use these factor scores as observed scores in vector autoregressive…
Descriptors: Measurement Techniques, Factor Analysis, Scores, Validity
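The entry above describes a common two-step workflow: estimate the measurement model, compute factor scores, then treat those scores as observed data in a vector autoregressive (VAR) model. The sketch below illustrates that pipeline on simulated data with scikit-learn's FactorAnalysis and statsmodels' VAR; it shows the generic two-step approach the abstract refers to, not the cited authors' estimator.

```python
# Two-step "factor scores then VAR" sketch on simulated data
# (illustrates the workflow named in the abstract; not the cited authors' method).
import numpy as np
from sklearn.decomposition import FactorAnalysis
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(1)
T = 300
# Simulate two latent time series with some autoregressive structure.
latent = np.zeros((T, 2))
for t in range(1, T):
    latent[t] = 0.5 * latent[t - 1] + rng.normal(scale=0.5, size=2)
# Six observed indicators, three loading on each latent variable, plus noise.
loadings = np.array([[1.0, 0.8, 0.9, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0, 0.7, 0.9]])
observed = latent @ loadings + rng.normal(scale=0.3, size=(T, 6))

# Step 1: measurement model and factor scores.
scores = FactorAnalysis(n_components=2, random_state=0).fit_transform(observed)

# Step 2: treat the factor scores as observed data in a VAR(1).
var_fit = VAR(scores).fit(1)
print(var_fit.coefs[0].round(2))   # estimated lag-1 dynamics between the factors
```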
Peer reviewed
Jamaal L. Moore; Zhihui Yi; Jessica M. Hinman; Becky F. Barron; Mark R. Dixon – Journal of Developmental and Physical Disabilities, 2021
The current study examined the convergent validity between the standardized PEAK Comprehensive Assessment (PCA) and the semi-standardized PEAK Pre-assessment (PEAK-PA). Twenty-two participants were administered each tool, and an item by item analysis was conducted to evaluate correlations between tests. The results suggested a strong positive…
Descriptors: Validity, Evaluation Methods, Standardized Tests, Correlation
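Convergent validity of the kind examined above is usually quantified by correlating scores from the two instruments administered to the same participants. A minimal sketch with simulated scores (not the PCA/PEAK-PA data):

```python
# Hypothetical convergent-validity check: correlate scores from two instruments
# given to the same participants (toy data, not the cited study's results).
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
instrument_a = rng.normal(50, 10, size=22)                     # e.g., standardized tool
instrument_b = 0.8 * instrument_a + rng.normal(0, 5, size=22)  # e.g., semi-standardized tool

r, p = pearsonr(instrument_a, instrument_b)
print(f"Pearson r = {r:.2f}, p = {p:.4f}")   # a strong positive r supports convergent validity
```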
Peer reviewed
Lai, Jennifer W. M.; Bower, Matt; De Nobile, John; Breyer, Yvonne – Journal of Computer Assisted Learning, 2022
Background: There is a lack of critical or empirical work interrogating the nature and purpose of evaluating technology use in education. Objectives: In this study, we examine the values underpinning the evaluation of technology use in education through field specialist perceptions. The study also poses critical reflections about the rigour of…
Descriptors: Technology Uses in Education, Educational Technology, Program Evaluation, Content Validity
Peer reviewed
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
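A multilabel neural network of the kind described above scores a response on many traits at once by treating each trait as a separate binary output of a single network. As a rough sketch of the idea (not the authors' MNN), scikit-learn's MLPClassifier accepts a binary indicator matrix as the target and predicts all labels jointly; the data below are simulated.

```python
# Rough multilabel neural-network scoring sketch (illustrative; not the cited MNN).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, recall_score

rng = np.random.default_rng(7)
n_resp, n_items, n_traits = 400, 30, 16
X = rng.integers(0, 2, size=(n_resp, n_items)).astype(float)   # scored item responses
latent = X @ rng.normal(size=(n_items, n_traits))
Y = (latent > np.median(latent, axis=0)).astype(int)           # 16 binary trait classifications

# One network with 16 binary outputs: MLPClassifier handles a multilabel target directly.
mnn = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
mnn.fit(X[:300], Y[:300])
pred = mnn.predict(X[300:])

print("subset accuracy:", accuracy_score(Y[300:], pred))                # exact-match rate
print("macro recall:   ", recall_score(Y[300:], pred, average="macro"))
```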
Karen Blackburn Hoeve – ProQuest LLC, 2021
High stakes test-based accountability systems primarily rely on aggregates and derivatives of scores from tests that were originally developed to measure individual student mastery of content specifications. Current validity models do not explicitly address this use of aggregate scores to measure the performance of teachers, administrators, and…
Descriptors: Accountability, Test Validity, High Stakes Tests, Hierarchical Linear Modeling
Peer reviewed
Roduta Roberts, Mary; Gotch, Chad M.; Cook, Megan; Werther, Karin; Chao, Iris C. I. – Measurement: Interdisciplinary Research and Perspectives, 2022
Performance-based assessment is a common approach to assess the development and acquisition of practice competencies among health professions students. Judgments related to the quality of performance are typically operationalized as ratings against success criteria specified within a rubric. The extent to which the rubric is understood,…
Descriptors: Protocol Analysis, Scoring Rubrics, Interviews, Performance Based Assessment
Peer reviewed
PDF on ERIC
Kelsey Nason; Christine E. DeMars – Research & Practice in Assessment, 2023
Universities administer assessments for accountability and program improvement, but student effort during these assessments is often low because students perceive minimal consequences, and the effects of low effort are compounded by the assessment context. This project investigates validity concerns caused by minimal effort and exacerbated by contextual factors. Systematic…
Descriptors: Test Validity, COVID-19, Pandemics, Environmental Influences
Peer reviewed
Stephen M. Leach; Jason C. Immekus; Jeffrey C. Valentine; Prathiba Batley; Dena Dossett; Tamara Lewis; Thomas Reece – Assessment for Effective Intervention, 2025
Educators commonly use school climate survey scores to inform and evaluate interventions for equitably improving learning and reducing educational disparities. Unfortunately, validity evidence to support these (and other) score uses often falls short. In response, Whitehouse et al. proposed a collaborative, two-part validity testing framework for…
Descriptors: School Surveys, Measurement, Hierarchical Linear Modeling, Educational Environment
Peer reviewed
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
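Among the standard technical features listed above, the standard error of measurement is a simple function of score reliability and the score standard deviation. A small worked example with hypothetical values:

```python
# Standard error of measurement (SEM) from reliability and score SD
# (hypothetical values; illustrates one routine statistic named in the abstract).
import math

score_sd = 12.0       # scale-score standard deviation
reliability = 0.91    # e.g., an internal-consistency or IRT-based reliability estimate

sem = score_sd * math.sqrt(1.0 - reliability)
print(f"SEM = {sem:.2f} scale-score points")   # 12 * sqrt(0.09) = 3.6
# An approximate 95% band around an observed score of 100:
print(f"95% band: {100 - 1.96 * sem:.1f} to {100 + 1.96 * sem:.1f}")
```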
Peer reviewed
PDF on ERIC
Emre Zengin; Yasemin Karal – International Journal of Assessment Tools in Education, 2024
This study was carried out to develop a test to assess algorithmic thinking skills. To this end, the twelve steps suggested by Downing (2006) were adopted. Throughout the test development, 24 middle school sixth-grade students and eight experts in different areas took part as needed in the tasks on the project. The test was given to 252 students…
Descriptors: Grade 6, Algorithms, Thinking Skills, Evaluation Methods
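Classical item analysis is a routine step in test development of the kind described above: item difficulty is the proportion of examinees answering correctly, and discrimination is often summarized as the corrected item-total correlation. A minimal sketch on simulated scored responses (not data from the cited test):

```python
# Classical item analysis sketch: difficulty and corrected item-total discrimination
# (simulated 0/1 responses; not data from the cited algorithmic-thinking test).
import numpy as np

rng = np.random.default_rng(3)
ability = rng.normal(size=252)                       # one row per examinee
n_items = 20
item_location = rng.normal(size=n_items)
# Simple response model: higher ability -> higher chance of a correct answer.
prob = 1 / (1 + np.exp(-(ability[:, None] - item_location)))
responses = (rng.random((252, n_items)) < prob).astype(int)

total = responses.sum(axis=1)
for i in range(n_items):
    p = responses[:, i].mean()                       # item difficulty (proportion correct)
    rest = total - responses[:, i]                   # total score excluding the item
    r_it = np.corrcoef(responses[:, i], rest)[0, 1]  # corrected item-total correlation
    print(f"item {i + 1:2d}: difficulty = {p:.2f}, discrimination = {r_it:.2f}")
```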