Publication Date
In 2025 | 3 |
Since 2024 | 7 |
Since 2021 (last 5 years) | 22 |
Since 2016 (last 10 years) | 60 |
Since 2006 (last 20 years) | 95 |
Descriptor
Test Validity | 106 |
Validity | 71 |
Test Construction | 43 |
Test Use | 41 |
Elementary Secondary Education | 39 |
Educational Assessment | 32 |
Testing Problems | 32 |
Evaluation Methods | 30 |
Scores | 30 |
Test Items | 29 |
Standards | 25 |
Source
Educational Measurement:… | 188 |
Author
Sireci, Stephen G. | 7 |
Kuncel, Nathan R. | 4 |
Linn, Robert L. | 4 |
Mehrens, William A. | 4 |
Moss, Pamela A. | 4 |
Abedi, Jamal | 3 |
Frisbie, David A. | 3 |
Lane, Suzanne | 3 |
Rudner, Lawrence M. | 3 |
Sackett, Paul R. | 3 |
Shepard, Lorrie A. | 3 |
Audience
Teachers | 2 |
Researchers | 1 |
Location
California | 2 |
Canada | 2 |
Germany | 2 |
Greece | 1 |
Idaho | 1 |
Indiana | 1 |
Israel | 1 |
Kansas | 1 |
Kentucky | 1 |
Michigan | 1 |
New York (New York) | 1 |
Laws, Policies, & Programs
Debra P v Turlington | 4 |
No Child Left Behind Act 2001 | 3 |
Civil Rights Act 1964 Title… | 2 |
Every Student Succeeds Act… | 2 |
Fourteenth Amendment | 1 |
Russell, Michael – Educational Measurement: Issues and Practice, 2022
Despite agreement about the central importance of validity for educational and psychological testing, consensus regarding the definition of validity remains elusive. Differences in the definition of validity are examined, revealing that a potential cause of disagreement stems from differences in word use and meanings given to key terms commonly…
Descriptors: Test Validity, Psychological Testing, Educational Testing, Vocabulary
Lewis, Jennifer; Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2022
This module is designed for educators, educational researchers, and psychometricians who would like to develop an understanding of the basic concepts of validity theory, test validation, and documenting a "validity argument." It also describes how an in-depth understanding of the purposes and uses of educational tests sets the foundation…
Descriptors: Test Validity, Tests, Testing Problems, Faculty Development
Folger, Timothy D.; Bostic, Jonathan; Krupa, Erin E. – Educational Measurement: Issues and Practice, 2023
Validity is a fundamental consideration of test development and test evaluation. The purpose of this study is to define and reify three key aspects of validity and validation, namely test-score interpretation, test-score use, and the claims supporting interpretation and use. This study employed a Delphi methodology to explore how experts in…
Descriptors: Test Interpretation, Scores, Test Use, Test Validity
Pellegrino, James W. – Educational Measurement: Issues and Practice, 2020
Professor Gordon argues for a significant reorientation in the focus and impact of assessment in education. For the types of assessment activities that he advocates to prosper and positively impact education, serious attention must be paid to two important topics: (1) the conceptual underpinnings of the assessment practices we develop and use to…
Descriptors: Educational Assessment, Teaching Methods, Learning Processes, Validity
Jiangang Hao; Alina A. von Davier; Victoria Yaneva; Susan Lottridge; Matthias von Davier; Deborah J. Harris – Educational Measurement: Issues and Practice, 2024
The remarkable strides in artificial intelligence (AI), exemplified by ChatGPT, have unveiled a wealth of opportunities and challenges in assessment. Applying cutting-edge large language models (LLMs) and generative AI to assessment holds great promise in boosting efficiency, mitigating bias, and facilitating customized evaluations. Conversely,…
Descriptors: Evaluation Methods, Artificial Intelligence, Educational Change, Computer Software
Daniel Murphy; Sarah Quesen; Matthew Brunetti; Quintin Love – Educational Measurement: Issues and Practice, 2024
Categorical growth models describe examinee growth in terms of performance-level category transitions, which implies that some percentage of examinees will be misclassified. This paper introduces a new procedure for estimating the classification accuracy of categorical growth models, based on Rudner's classification accuracy index for item…
Descriptors: Classification, Growth Models, Accuracy, Performance Based Assessment
Rios, Joseph A.; Ihlenfeldt, Samuel D.; Dosedel, Michael; Riegelman, Amy – Educational Measurement: Issues and Practice, 2020
This systematic review investigated the topics studied and reporting practices of published meta-analyses in educational measurement. Our findings indicated that meta-analysis is not a highly utilized methodological tool in educational measurement; on average, less than one meta-analysis has been published per year over the past 30 years (28…
Descriptors: Meta Analysis, Educational Assessment, Test Format, Testing Accommodations
Coggeshall, Whitney Smiley – Educational Measurement: Issues and Practice, 2021
The continuous testing framework, where both successful and unsuccessful examinees have to demonstrate continued proficiency at frequent prespecified intervals, is a framework that is used in noncognitive assessment and is gaining in popularity in cognitive assessment. Despite the rigorous advantages of this framework, this paper demonstrates that…
Descriptors: Classification, Accuracy, Testing, Failure
Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023
The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…
Descriptors: Item Response Theory, Standard Setting, Testing, Sampling
Tsigilis, Nikolaos; Krousorati, Katerina; Gregoriadis, Athanasios; Grammatikopoulos, Vasilis – Educational Measurement: Issues and Practice, 2023
The Preschool Early Numeracy Skills Test--Brief Version (PENS-B) is a measure of early numeracy skills, developed and mainly used in the United States. The purpose of this study was to examine the factorial validity and measurement invariance across gender of PENS-B in the Greek educational context. PENS-B was administered to 906 preschool…
Descriptors: Psychometrics, Preschool Education, Numeracy, Item Response Theory
Terry A. Ackerman; Deborah L. Bandalos; Derek C. Briggs; Howard T. Everson; Andrew D. Ho; Susan M. Lottridge; Matthew J. Madison; Sandip Sinharay; Michael C. Rodriguez; Michael Russell; Alina A. Davier; Stefanie A. Wind – Educational Measurement: Issues and Practice, 2024
This article presents the consensus of a National Council on Measurement in Education Presidential Task Force on Foundational Competencies in Educational Measurement. Foundational competencies are those that support future development of additional professional and disciplinary competencies. The authors develop a framework for foundational…
Descriptors: Educational Assessment, Competence, Skill Development, Communication Skills
Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022
Setting cut scores on multistage tests (MSTs) is difficult, particularly when the test spans several grade levels, and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…
Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis
Katherine E. Castellano; Daniel F. McCaffrey; Joseph A. Martineau – Educational Measurement: Issues and Practice, 2025
Growth-to-standard models evaluate student growth against the growth needed to reach a future standard or target of interest, such as proficiency. A common growth-to-standard model involves comparing the popular Student Growth Percentile (SGP) to Adequate Growth Percentiles (AGPs). AGPs follow from an involved process based on fitting a series of…
Descriptors: Student Evaluation, Growth Models, Student Educational Objectives, Educational Indicators
Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025
Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet the evaluation of item quality remains a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large language models, specifically Llama 3-8B, for…
Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation
Student, Sanford R.; Gong, Brian – Educational Measurement: Issues and Practice, 2022
We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from…
Descriptors: Science Tests, Test Validity, Test Items, Test Construction