Showing 1 to 15 of 106 results
Peer reviewed
Russell, Michael – Educational Measurement: Issues and Practice, 2022
Despite agreement about the central importance of validity for educational and psychological testing, consensus regarding the definition of validity remains elusive. Differences in the definition of validity are examined, revealing that a potential cause of disagreement stems from differences in word use and meanings given to key terms commonly…
Descriptors: Test Validity, Psychological Testing, Educational Testing, Vocabulary
Peer reviewed
Lewis, Jennifer; Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2022
This module is designed for educators, educational researchers, and psychometricians who would like to develop an understanding of the basic concepts of validity theory, test validation, and documenting a "validity argument." It also describes how an in-depth understanding of the purposes and uses of educational tests sets the foundation…
Descriptors: Test Validity, Tests, Testing Problems, Faculty Development
Peer reviewed
Folger, Timothy D.; Bostic, Jonathan; Krupa, Erin E. – Educational Measurement: Issues and Practice, 2023
Validity is a fundamental consideration of test development and test evaluation. The purpose of this study is to define and reify three key aspects of validity and validation, namely test-score interpretation, test-score use, and the claims supporting interpretation and use. This study employed a Delphi methodology to explore how experts in…
Descriptors: Test Interpretation, Scores, Test Use, Test Validity
Peer reviewed
Coggeshall, Whitney Smiley – Educational Measurement: Issues and Practice, 2021
The continuous testing framework, where both successful and unsuccessful examinees have to demonstrate continued proficiency at frequent prespecified intervals, is a framework that is used in noncognitive assessment and is gaining in popularity in cognitive assessment. Despite the rigorous advantages of this framework, this paper demonstrates that…
Descriptors: Classification, Accuracy, Testing, Failure
Peer reviewed
Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023
The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…
Descriptors: Item Response Theory, Standard Setting, Testing, Sampling
Peer reviewed
Tsigilis, Nikolaos; Krousorati, Katerina; Gregoriadis, Athanasios; Grammatikopoulos, Vasilis – Educational Measurement: Issues and Practice, 2023
The Preschool Early Numeracy Skills Test--Brief Version (PENS-B) is a measure of early numeracy skills, developed and mainly used in the United States. The purpose of this study was to examine the factorial validity and measurement invariance across gender of PENS-B in the Greek educational context. PENS-B was administered to 906 preschool…
Descriptors: Psychometrics, Preschool Education, Numeracy, Item Response Theory
Peer reviewed
Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025
Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet, the evaluation of item quality remains a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large language models, specifically Llama 3-8B, for…
Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation
Peer reviewed
Student, Sanford R.; Gong, Brian – Educational Measurement: Issues and Practice, 2022
We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from…
Descriptors: Science Tests, Test Validity, Test Items, Test Construction
Peer reviewed
Ing, Marsha; Chinen, Starlie; Jackson, Kara; Smith, Thomas M. – Educational Measurement: Issues and Practice, 2021
Despite the ease of accessing a wide range of measures, little attention is given to validity arguments when considering whether to use the measure for a new purpose or in a different context. Making a validity argument has historically focused on the intended interpretation and use. There has been a press to consider both the intended and actual…
Descriptors: Instructional Improvement, Measures (Individuals), Test Validity, Test Interpretation
Peer reviewed
Newton, Paul E. – Educational Measurement: Issues and Practice, 2020
Educational assessment involves eliciting, transmitting, and receiving information concerning the level of proficiency of a learner in a specified domain. With that in mind, it is perhaps surprising that the literature seems to make very little use of the signal processing metaphor. The present article begins by making a general case for greater…
Descriptors: Educational Assessment, Student Evaluation, Evaluative Thinking, Test Validity
Peer reviewed
Barry, Carol L.; Jones, Andrew T.; Ibáñez, Beatriz; Grambau, Marni; Buyske, Jo – Educational Measurement: Issues and Practice, 2022
In response to the COVID-19 pandemic, the American Board of Surgery (ABS) shifted from in-person to remote administrations of the oral certifying exam (CE). Although the overall exam architecture remains the same, there are a number of differences in administration and staffing costs, exam content, security concerns, and the tools used to give the…
Descriptors: COVID-19, Pandemics, Computer Assisted Testing, Verbal Tests
Peer reviewed
Sanford R. Student; Derek C. Briggs; Laurie Davis – Educational Measurement: Issues and Practice, 2025
Vertical scales are frequently developed using common item nonequivalent group linking. In this design, one can use upper-grade, lower-grade, or mixed-grade common items to estimate the linking constants that underlie the absolute measurement of growth. Using the Rasch model and a dataset from Curriculum Associates' i-Ready Diagnostic in math in…
Descriptors: Elementary School Mathematics, Elementary School Students, Middle School Mathematics, Middle School Students
Peer reviewed
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
Peer reviewed
Angela Johnson; Elizabeth Barker; Marcos Viveros Cespedes – Educational Measurement: Issues and Practice, 2024
Educators and researchers strive to build policies and practices on data and evidence, especially on academic achievement scores. When assessment scores are inaccurate for specific student populations or when scores are inappropriately used, even data-driven decisions will be misinformed. To maximize the impact of the research-practice-policy…
Descriptors: Equal Education, Inclusion, Evaluation Methods, Error of Measurement
Peer reviewed
Wilkerson, Judy R. – Educational Measurement: Issues and Practice, 2020
Validity and reliability are a major focus in teacher education accreditation by the Council for Accreditation of Educator Preparation (CAEP). CAEP requires the use of "accepted research standards," but many faculty and administrators are unsure how to meet this requirement. The Standards of Educational and Psychological Testing…
Descriptors: Test Construction, Test Validity, Test Reliability, Teacher Education Programs