ERIC - Search Results

Showing all 14 results

How Is Educational Measurement Supposed to Deal with Test Use?

Peer reviewed

Bachman, Lyle – Measurement: Interdisciplinary Research and Perspectives, 2013

At the outset of his thoughtful and thought-provoking article, Haertel (this issue) clearly identifies the issue with which he will be dealing: The disjunct, or gap, in current approaches to evaluating the merits of a given test, between the intended uses of that test and the validity of its score-based interpretations. The author thinks that…

Descriptors: Educational Testing, Test Use, Test Validity, Test Interpretation

Why the Item "23 + 1" Is Not in a Depression Questionnaire: Validity from a Network Perspective

Peer reviewed

Cramer, Angelique O. J. – Measurement: Interdisciplinary Research and Perspectives, 2012

What is validity? A simple question but apparently one with many answers, as Paul Newton highlights in his review of the history of validity. The current definition of validity, as entertained in the 1999 "Standards for Educational and Psychological Testing" is indeed a consensus, one between the classical notion of attributes, and measures…

Descriptors: Validity, Educational Testing, Depression (Psychology), Psychology

Questioning the Consensus Definition of Validity

Peer reviewed

Newton, Paul E. – Measurement: Interdisciplinary Research and Perspectives, 2012

This focus article provided the author with an opportunity to unpack the consensus definition of validity and to explore its implications in the light of recent debates. He proposed an elaboration of the consensus definition, which was intended to express the spirit of the "Standards for Educational and Psychological Testing" with increased…

Descriptors: Validity, Educational Testing, Psychological Testing, Definitions

Design under Constraints: The Case of Large-Scale Assessment Systems

Peer reviewed

Mislevy, Robert J. – Measurement: Interdisciplinary Research and Perspectives, 2010

In "Updating the Duplex Design for Test-Based Accountability in the Twenty-First Century," Bejar and Graf (2010) propose extensions to the duplex design for large-scale assessment presented in Bock and Mislevy (1988). Examining the range of people who use assessment results--from students, teachers, administrators, curriculum designers,…

Descriptors: Measurement, Test Construction, Educational Testing, Data Collection

Issues with Self-Monitoring Assessments: Comments on Koretz and Beguin (2010)

Peer reviewed

Sinharay, Sandip; Haberman, Shelby J.; Zwick, Rebecca – Measurement: Interdisciplinary Research and Perspectives, 2010

Several researchers (e.g., Klein, Hamilton, McCaffrey, & Stecher, 2000; Koretz & Barron, 1998; Linn, 2000) have asserted that test-based accountability, a crucial component of U.S. education policy, has resulted in score inflation. This inference has relied on comparisons with performance on other tests such as the National Assessment of…

Descriptors: Audits (Verification), Test Items, Scores, Measurement

Validate High Stakes Inferences by Designing Good Experiments, Not Audit Items: A Comment on "Self-Monitoring Assessments Educational Accountability Systems"

Peer reviewed

Briggs, Derek C. – Measurement: Interdisciplinary Research and Perspectives, 2010

The use of large-scale assessments for making high stakes inferences about students and the schools in which they are situated is premised on the assumption that tests are sensitive to good instruction. An increase in the quality of classroom instruction should cause, on the average, an increase in test scores. In work with a number of colleagues…

Descriptors: Measurement, High Stakes Tests, Inferences, Scores

What IRT Can and Cannot Do

Peer reviewed

Glas, Cees A. W. – Measurement: Interdisciplinary Research and Perspectives, 2009

This author states that, while the article by Gunter Maris and Timo Bechger ("On Interpreting the Model Parameters for the Three Parameter Logistic Model," this issue) is highly interesting, the interest is not so much in the practical implications, but rather in the issue of the meaning and role of statistical models in psychometrics and…

Descriptors: Item Response Theory, Measurement, Psychometrics, Models

Evidentiary Reasoning in Diagnostic Classification Models

Peer reviewed

Levy, Roy – Measurement: Interdisciplinary Research and Perspectives, 2009

In "Unique Characteristics of Diagnostic Classification Models: A Comprehensive Review of the Current State-of-the-Art," Rupp and Templin (2008) undertake the ambitious task of providing a thorough portrait of the current state of diagnostic classification models (DCM). In this commentary, the author applauds Rupp and Templin for their…

Descriptors: Classification, Models, Evidence, Measurement

Diagnostic Models as Partially Ordered Sets

Peer reviewed

Tatsuoka, Curtis – Measurement: Interdisciplinary Research and Perspectives, 2009

In this commentary, the author addresses what is referred to as the deterministic input, noisy "and" gate (DINA) model. The author mentions concerns with how this model has been formulated and presented. In particular, the author points out that there is a lack of recognition of the confounding of profiles that generally arises and then discusses…

Descriptors: Test Items, Classification, Psychometrics, Item Response Theory

Equivalent Diagnostic Classification Models

Peer reviewed

Maris, Gunter; Bechger, Timo – Measurement: Interdisciplinary Research and Perspectives, 2009

Rupp and Templin (2008) do a good job of describing the ever-expanding landscape of Diagnostic Classification Models (DCM). In many ways, their review article clearly points to some of the questions that need to be answered before DCMs can become part of the psychometric practitioner's toolkit. Apart from the issues mentioned in this article that…

Descriptors: Factor Analysis, Classification, Psychometrics, Item Response Theory

How Much Can We Reliably Know about What Examinees Know?

Peer reviewed

Sinharay, Sandip; Haberman, Shelby J. – Measurement: Interdisciplinary Research and Perspectives, 2009

In this commentary, the authors discuss some of the issues regarding the use of diagnostic classification models that practitioners should keep in mind. In the authors' experience, these issues are not as well known as they should be. The authors then provide recommendations on diagnostic scoring.

Descriptors: Scoring, Reliability, Validity, Classification

Diagnostic Classification Models and Multidimensional Adaptive Testing: A Commentary on Rupp and Templin

Peer reviewed

Frey, Andreas; Carstensen, Claus H. – Measurement: Interdisciplinary Research and Perspectives, 2009

On a general level, the objective of diagnostic classification models (DCMs) lies in a classification of individuals regarding multiple latent skills. In this article, the authors show that this objective can be achieved by multidimensional adaptive testing (MAT) as well. The authors discuss whether or not the restricted applicability of DCMs can…

Descriptors: Adaptive Testing, Test Items, Classification, Psychometrics

Validating the MKT Measures: Some Responses to the Commentaries

Peer reviewed

Hill, Heather C. – Measurement: Interdisciplinary Research and Perspectives, 2007

The author offers some thoughts on commentators' reactions to the substance of the measures, particularly those about measuring teacher learning and change, based on the major uses of the measures, and because this is a significant challenge facing test development as an enterprise. If teacher learning results in more integrated knowledge or…

Descriptors: Educational Testing, Tests, Measurement, Faculty Development

Generalizability and Specificity of Interpretive Arguments: Observations Inspired by the Commentaries

Peer reviewed

Schilling, Stephen – Measurement: Interdisciplinary Research and Perspectives, 2007

In this article, the author echoes his co-author's and colleague's pleasure (Hill, this issue) at the thoughtfulness and far-ranging nature of the comments to their initial attempts at test validation for the mathematical knowledge for teaching (MKT) measures using the validity argument approach. Because of the large number of commentaries they…

Descriptors: Generalizability Theory, Persuasive Discourse, Educational Testing, Measurement
