Showing 1 to 15 of 141 results
Peer reviewed
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
Peer reviewed
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Karen Blackburn Hoeve – ProQuest LLC, 2021
High stakes test-based accountability systems primarily rely on aggregates and derivatives of scores from tests that were originally developed to measure individual student mastery of content specifications. Current validity models do not explicitly address this use of aggregate scores to measure the performance of teachers, administrators, and…
Descriptors: Accountability, Test Validity, High Stakes Tests, Hierarchical Linear Modeling
Peer reviewed
Kelsey Nason; Christine E. DeMars – Research & Practice in Assessment, 2023
Universities administer assessments for accountability and program improvement. Student effort is low during assessments due to minimal perceived consequences. The effects of low effort are compounded by assessment context. This project investigates validity concerns caused by minimal effort and exacerbated by contextual factors. Systematic…
Descriptors: Test Validity, COVID-19, Pandemics, Environmental Influences
Peer reviewed
Stephen M. Leach; Jason C. Immekus; Jeffrey C. Valentine; Prathiba Batley; Dena Dossett; Tamara Lewis; Thomas Reece – Assessment for Effective Intervention, 2025
Educators commonly use school climate survey scores to inform and evaluate interventions for equitably improving learning and reducing educational disparities. Unfortunately, validity evidence to support these (and other) score uses often falls short. In response, Whitehouse et al. proposed a collaborative, two-part validity testing framework for…
Descriptors: School Surveys, Measurement, Hierarchical Linear Modeling, Educational Environment
Peer reviewed
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
Peer reviewed
Emre Zengin; Yasemin Karal – International Journal of Assessment Tools in Education, 2024
This study was carried out to develop a test to assess algorithmic thinking skills. To this end, the twelve steps suggested by Downing (2006) were adopted. Throughout the test development, 24 middle school sixth-grade students and eight experts in different areas took part as needed in the tasks on the project. The test was given to 252 students…
Descriptors: Grade 6, Algorithms, Thinking Skills, Evaluation Methods
Mattern, Krista; Radunzel, Justine – ACT, Inc., 2019
When applicants take the ACT® more than once, how do colleges and universities reconcile and make sense of the multiple scores? In terms of validity, fairness, and impact on subgroup differences, are certain score-use policies better than others? The focus of this issue brief is to summarize evidence on the validity and fairness of various…
Descriptors: Scoring, College Entrance Examinations, Test Validity, Evaluation Methods
Baraldi Cunha, Andrea; Babik, Iryna; Koziol, Natalie A.; Hsu, Lin-Ya; Nord, Jayden; Harbourne, Regina T.; Westcott-McCoy, Sarah; Dusing, Stacey C.; Bovaird, James A.; Lobo, Michele A. – Grantee Submission, 2021
Purpose: To evaluate the validity, reliability, and sensitivity of the novel Means-End Problem-Solving Assessment Tool (MEPSAT). Methods: Children with typical development and those with motor delay were assessed throughout the first 2 years of life using the MEPSAT. MEPSAT scores were validated against the cognitive and motor subscales of the…
Descriptors: Problem Solving, Early Intervention, Evaluation Methods, Motor Development
Peer reviewed
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
Peer reviewed
Poehner, Matthew E.; van Compernolle, Rémi A. – Journal of Cognitive Education and Psychology, 2018
This article examines the implications of argument-based validity for the continued development of dynamic assessment (DA) research and practice. We propose that the move toward validation as a process of interpretation and evidence-based argument is commensurable with DA but that fundamental ontological differences with conventional approaches to…
Descriptors: Alternative Assessment, Evaluation Methods, Second Language Learning, Interaction
Reardon, Sean F.; Ho, Andrew D.; Kalogrides, Demetra – Stanford Center for Education Policy Analysis, 2019
Linking score scales across different tests is considered speculative and fraught, even at the aggregate level (Feuer et al., 1999; Thissen, 2007). We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that…
Descriptors: Test Validity, Evaluation Methods, School Districts, Scores
Peer reviewed
Morphew, Jason W.; Mestre, Jose P.; Kang, Hyeon-Ah; Chang, Hua-Hua; Fabry, Gregory – Physical Review Physics Education Research, 2018
Prior research has established that students often underprepare for midterm examinations yet remain overconfident in their proficiency. Research concerning the testing effect has demonstrated that utilizing testing as a study strategy leads to higher performance and more accurate confidence compared to more common study strategies such as…
Descriptors: Computer Assisted Testing, Physics, Science Instruction, Introductory Courses
Peer reviewed
Wakabayashi, Tomoko; Claxton, Jill; Smith, Everett V., Jr. – Journal of Psychoeducational Assessment, 2019
The Child Observation Record (COR), initially developed in 1993 by HighScope Educational Research Foundation, is an observation-based instrument that provides systematic assessment of young children's knowledge and abilities in all major areas of development. Teachers or caregivers spend a few minutes each day writing brief notes or…
Descriptors: Observation, Evaluation Methods, Early Childhood Education, Kindergarten
Daniel Rodriguez-Segura; Beth E. Schueler – Annenberg Institute for School Reform at Brown University, 2022
School closures induced by COVID-19 placed heightened emphasis on alternative ways to measure student learning besides in-person exams. We leverage the administration of phone-based assessments (PBAs) measuring numeracy and literacy for primary school children in Kenya, along with in-person standardized tests administered to the same students…
Descriptors: Foreign Countries, School Closing, COVID-19, Pandemics