ERIC - Search Results

Showing all 15 results

Accounting for Response Styles: Leveraging the Benefits of Combining Response Process Data Collection and Response Process Analysis Methods

Peer reviewed

Leventhal, Brian C.; Gregg, Nikole; Ames, Allison J. – Measurement: Interdisciplinary Research and Perspectives, 2022

Response styles introduce construct-irrelevant variance as a result of respondents systematically responding to Likert-type items regardless of content. Methods to account for response styles through data analysis as well as approaches to mitigating the effects of response styles during data collection have been well-documented. Recent approaches…

Descriptors: Response Style (Tests), Item Response Theory, Test Items, Likert Scales

The Search for the Holy Grail: Content-Referenced Score Interpretations from Large-Scale Tests

Peer reviewed

Marion, Scott F. – Measurement: Interdisciplinary Research and Perspectives, 2015

The measurement industry is in crisis. The public outcry against "over testing" and the opt-out movement are symptoms of a larger sociopolitical battle being fought over Common Core, teacher evaluation, federal intrusion, and a host of other issues, but much of the vitriol is directed at the tests and the testing industry. If we, as…

Descriptors: Test Interpretation, Scores, Educational Assessment, Measurement

Construct Maps: A Tool to Organize Validity Evidence

Peer reviewed

McClarty, Katie Larsen – Measurement: Interdisciplinary Research and Perspectives, 2013

The construct map is a promising tool for organizing the data standard-setting panelists interpret. The challenge in applying construct maps to standard-setting procedures will be the judicious selection of data to include within this organizing framework. Therefore, this commentary focuses on decisions about what to include in the construct map.…

Descriptors: Standard Setting (Scoring), Maps, Validity, Evidence

How Is Educational Measurement Supposed to Deal with Test Use?

Peer reviewed

Bachman, Lyle – Measurement: Interdisciplinary Research and Perspectives, 2013

At the outset of his thoughtful and thought-provoking article, Haertel (this issue) clearly identifies the issue with which he will be dealing: The disjunct, or gap, in current approaches to evaluating the merits of a given test, between the intended uses of that test and the validity of its score-based interpretations. The author thinks that…

Descriptors: Educational Testing, Test Use, Test Validity, Test Interpretation

The Epidemiology of Modern Test Score Use: Anticipating Aggregation, Adjustment, and Equating

Peer reviewed

Ho, Andrew – Measurement: Interdisciplinary Research and Perspectives, 2013

In his thoughtful focus article, Haertel (this issue) pushes testing experts to broaden the scope of their validation efforts and to invite scholars from other disciplines to join them. He credits existing validation frameworks for helping the measurement community to identify incomplete or nonexistent validity arguments. However, he notes his…

Descriptors: Educational Testing, Scores, Test Use, Test Validity

Expanding Views of Interpretation/Use Arguments

Peer reviewed

Haertel, Edward – Measurement: Interdisciplinary Research and Perspectives, 2013

The author is deeply gratified by the commentators' thoughtful responses and finds almost nothing to disagree with in any of them. Each offers additional insights prompting further reflection. In drawing out just a few common themes, this brief rejoinder omits many important ideas from the individual contributions. As stated in his title, the…

Descriptors: Educational Testing, Educational Improvement, Test Interpretation, Test Use

Coming Full Circle in Standard Setting: A Commentary on Wyse

Peer reviewed

Skaggs, Gary – Measurement: Interdisciplinary Research and Perspectives, 2013

The construct map is a particularly good way to approach instrument development, and this author states that he was delighted to read Adam Wyse's thoughts about how to use construct maps for standard setting. For a number of popular standard-setting methods, Wyse shows how typical feedback to panelists fits within a construct map framework.…

Descriptors: Standard Setting (Scoring), Maps, Test Construction, Measurement

Whose Consensus Is It Anyway? Scientific versus Legalistic Conceptions of Validity

Peer reviewed

Borsboom, Denny – Measurement: Interdisciplinary Research and Perspectives, 2012

Paul E. Newton provides an insightful and scholarly overview of central issues in validity theory. As he notes, many of the conceptual problems in validity theory derive from the fact that the word "validity" has two meanings. First, it indicates "whether a test measures what it purports to measure." This is a factual claim about the psychometric…

Descriptors: Validity, Psychometrics, Test Interpretation, Scores

How Is Testing Supposed to Improve Schooling?

Peer reviewed

Haertel, Edward – Measurement: Interdisciplinary Research and Perspectives, 2013

Validation research for educational achievement tests is often limited to an examination of intended test score interpretations. This article calls for an expansion of validation research in three dimensions. First, validation must attend to actual test use and its consequences, not just score meaning. Second, validation must attend to unintended…

Descriptors: Educational Testing, Educational Improvement, Test Validity, Achievement Tests

Defending the Quality of Links between Scores from Different Tests and Exams

Peer reviewed

Cresswell, Mike – Measurement: Interdisciplinary Research and Perspectives, 2010

Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance."…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

Conceptualizing Comparability

Peer reviewed

Newton, Paul E. – Measurement: Interdisciplinary Research and Perspectives, 2010

This article presents the author's rejoinder to thinking about linking from issue 8(1). Particularly within the more embracing linking frameworks, e.g., Holland & Dorans (2006) and Holland (2007), there appears to be a major disjunction between (1) classification discourse: the supposed basis for classification, that is, the underlying theory…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

Linking through Improved Design, Not Redefinition: Commentary on Newton

Peer reviewed

Walker, Michael E. – Measurement: Interdisciplinary Research and Perspectives, 2010

"Linking" is a term given to a general class of procedures by which one represents scores X on one test or measure in terms of scores Y on another test or measure. A recent taxonomy by Holland and Dorans (2006; Holland, 2007) organizes the various types of links into three broad categories: prediction, scale aligning, and equating. In…

Descriptors: Foreign Countries, Test Construction, Test Validity, Measurement Techniques

What Constitutes Legitimate Causal Linking?

Peer reviewed

Baird, Jo-Anne – Measurement: Interdisciplinary Research and Perspectives, 2010

Newton's article (2010) makes three main contributions to the literature. First, it is transatlantic, bringing together literatures that have been dealing with similar problems, using sometimes different methods and certainly with distinctive educational, cultural perspectives. He points out that neither of these literatures has all of the…

Descriptors: Foreign Countries, Predictive Validity, Standards, Ethics

What Dictates the Meaning of Test Linking? A Reaction to "Thinking about Linking"

Peer reviewed

von Davier, Alina A. – Measurement: Interdisciplinary Research and Perspectives, 2010

The article "Thinking About Linking" by Newton (2010) presents a novel philosophical perspective on the way that educational assessments should be linked. Newton starts by describing the linking framework as it was characterized in various publications and identifies a cross-cultural dimension in the definitions and uses of test…

Descriptors: Foreign Countries, Educational Assessment, Student Evaluation, Evaluation Criteria

Validating Standards-Based Test Score Interpretations

Peer reviewed

Haertel, Edward H.; Lorie, William A. – Measurement: Interdisciplinary Research and Perspectives, 2004

Standards-based score reports interpret test performance with reference to cut scores defining categories like "below basic," "proficient," or "master." This article first develops a conceptual framework for validity arguments supporting such interpretations, then presents three applications. Two of these serve to introduce new standard-setting…

Descriptors: Scores, Test Interpretation, Test Validity, Standard Setting (Scoring)