Schumacker, Randall E.; Wind, Stefanie A.; Holmes, Lauren F. – Measurement: Interdisciplinary Research and Perspectives, 2021
A variety of resources are available from which researchers can identify measurement instruments, including peer-reviewed journal articles, collections of technical information about published instruments, and electronic databases that are sponsored by universities, testing organizations, and other groups. Although these resources are widespread,…
Descriptors: Measurement Techniques, Journal Articles, Databases, Testing
Peabody, Michael R.; Wind, Stefanie A. – Measurement: Interdisciplinary Research and Perspectives, 2019
Differential Item Functioning (DIF) detection procedures provide validity evidence for proposed interpretations of test scores that can help researchers and practitioners ensure that test scores are free from potential bias, and that individual items do not create an advantage for any subgroup of examinees over another. In this study, we use the…
Descriptors: Item Response Theory, Test Items, Scores, Testing
Peterson, Eric; Welsh, Marilyn C. – Measurement: Interdisciplinary Research and Perspectives, 2014
Research into executive functioning (EF) has indeed grown exponentially across the past few decades, but as the Willoughby et al. critique makes clear, there remain fundamental questions to be resolved. The crux of their argument is built upon an examination of the confirmatory factor analysis (CFA) approach to understanding executive processes.…
Descriptors: Executive Function, Measurement, Factor Analysis, Reliability
Garner, Mary – Measurement: Interdisciplinary Research and Perspectives, 2013
In "How Is Testing Supposed to Improve Schooling," Haertel describes seven broad mechanisms whereby testing is used to improve schooling (this issue). The first four are direct mechanisms, meaning that "test scores are taken as indicators of some underlying construct and on that basis scores are used to guide some decision or draw some…
Descriptors: Testing, Early Intervention, Educational Improvement, Change Strategies
Lane, Suzanne – Measurement: Interdisciplinary Research and Perspectives, 2012
Considering consequences in the evaluation of validity is not new although it is still debated by Paul E. Newton and others. The argument-based approach to validity entails an interpretative argument that explicitly identifies the proposed interpretations and uses of test scores and a validity argument that provides a structure for evaluating the…
Descriptors: Educational Opportunities, Accountability, Validity, Inferences
Behizadeh, Nadia; Engelhard, George, Jr. – Measurement: Interdisciplinary Research and Perspectives, 2015
In his focus article, Koretz (this issue) argues that accountability has become the primary function of large-scale testing in the United States. He then points out that tests being used for accountability purposes are flawed and that the high-stakes nature of these tests creates a context that encourages score inflation. Koretz is concerned about…
Descriptors: Communities of Practice, High Stakes Tests, Testing, Test Validity
Pollitt, Alastair – Measurement: Interdisciplinary Research and Perspectives, 2012
Paul E. Newton's article is valuable in many ways, especially for clarifying confusions and inconsistencies in the assessment business. Most importantly, he points out confusions that persist and where open discussion will help us understand what we say and what we mean to say. But I will focus here on the only faults I find in the article: three…
Descriptors: Validity, Evaluation, Definitions, Test Construction
Borsboom, Denny – Measurement: Interdisciplinary Research and Perspectives, 2012
Paul E. Newton provides an insightful and scholarly overview of central issues in validity theory. As he notes, many of the conceptual problems in validity theory derive from the fact that the word "validity" has two meanings. First, it indicates "whether a test measures what it purports to measure." This is a factual claim about the psychometric…
Descriptors: Validity, Psychometrics, Test Interpretation, Scores
Braun, Henry – Measurement: Interdisciplinary Research and Perspectives, 2012
Paul E. Newton is to be commended for addressing as challenging a topic as the clarification of the concept of validity. The impetus for this foray is Newton's judgment that, despite decades of development, the definition and elaboration of the term test validity in the 1999 "Standards" retains sufficient ambiguity to permit, if not invite, both…
Descriptors: Educational Improvement, Test Validity, Validity, Tests
Haig, Brian D. – Measurement: Interdisciplinary Research and Perspectives, 2012
Lee Cronbach once expressed the view that all roads lead to construct validity. In looking to clarify the consensus definition of validity, and its place in assessment, Newton is also led to the troublesome idea of construct validity. To be sure, he addresses other validity issues, but in this commentary, I will restrict my attention to construct…
Descriptors: Validity, Educational Assessment, Construct Validity, Definitions
Kane, Michael – Measurement: Interdisciplinary Research and Perspectives, 2012
Paul E. Newton's article on the consensus definition of validity tackles a number of big issues and makes a number of strong claims. I agreed with much of what he said, and I disagreed with a number of his claims, but I found his article to be consistently interesting and thought provoking (whether I agreed or not). I will focus on three general…
Descriptors: Validity, Construct Validity, Tests, Testing
Mattern, Krista D.; Kobrin, Jennifer L.; Camara, Wayne J. – Measurement: Interdisciplinary Research and Perspectives, 2012
As researchers at a testing organization concerned with the appropriate uses and validity evidence for our assessments, we provide an applied perspective related to the issues raised in the focus article. Newton's proposal for elaborating the consensus definition of validity is offered with the intention to reduce the risks of inadequate…
Descriptors: Evidence, Validity, Tests, Testing
Feuer, Michael J. – Measurement: Interdisciplinary Research and Perspectives, 2010
This article presents lessons learned from a story of a snowy and dangerous intersection where there was no way for pedestrians to cross. The basic theme of this paper is that if political economy is preoccupied largely with the measurement of externalities, then a goal for the testing and assessment policy community should be to devise strategies…
Descriptors: Testing, Measurement, Educational Assessment, Accountability
Engelhard, George, Jr.; Perkins, Aminah F. – Measurement: Interdisciplinary Research and Perspectives, 2011
Humphry (this issue) has written a thought-provoking piece on the interpretation of item discrimination parameters as scale units in item response theory. One of the key features of his work is the description of an item response theory (IRT) model that he calls the logistic measurement function that combines aspects of two traditions in IRT that…
Descriptors: Foreign Countries, Social Sciences, Item Response Theory, Testing
von Davier, Matthias – Measurement: Interdisciplinary Research and Perspectives, 2009
In this commentary, the author points out a few issues, one being that there are models mislabeled as diagnostic, which deal with linear decompositions of item difficulties rather than estimating multidimensional skill variables. The author discusses the issue that there are many new names for essentially well-known models for multiple simultaneous…
Descriptors: Test Items, Probability, Models, Diagnostic Tests