Thompson, W. Jake; Clark, Amy K. – Educational Measurement: Issues and Practice, 2024
In recent years, educators, administrators, policymakers, and measurement experts have called for assessments that support educators in making better instructional decisions. One promising measurement approach for supporting instructional decision-making is the diagnostic classification model (DCM). DCMs are flexible psychometric models that…
Descriptors: Decision Making, Instructional Improvement, Evaluation Methods, Models
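The snippet names no particular DCM, so as a hedged illustration only: the DINA model, one of the simplest DCMs, classifies examinee l by a binary attribute-mastery vector and models item j as

\[ P(X_{lj} = 1 \mid \boldsymbol{\alpha}_l) = (1 - s_j)^{\eta_{lj}} \, g_j^{1 - \eta_{lj}}, \qquad \eta_{lj} = \prod_{k=1}^{K} \alpha_{lk}^{q_{jk}}, \]

where q_{jk} indicates via the Q-matrix whether item j requires attribute k, and s_j and g_j are slip and guessing parameters. Classifying examinees on discrete attributes rather than a continuous ability is what makes DCM feedback directly actionable for instruction.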
Butterfuss, Reese; Doran, Harold – Educational Measurement: Issues and Practice, 2025
Large language models are increasingly used in educational and psychological measurement activities. Their rapidly evolving sophistication and ability to detect language semantics make them viable tools for supplementing subject matter experts in reviewing large collections of text statements, such as educational content standards. This paper…
Descriptors: Alignment (Education), Academic Standards, Content Analysis, Concept Mapping
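The snippet does not describe the paper's method; as a minimal sketch of the semantic matching such reviews typically rely on, suppose each content-standard statement has been mapped to an embedding vector by some language model. Two statements u and v can then be compared by cosine similarity,

\[ \operatorname{sim}(\mathbf{u}, \mathbf{v}) = \frac{\mathbf{u} \cdot \mathbf{v}}{\lVert \mathbf{u} \rVert \, \lVert \mathbf{v} \rVert}, \]

and pairs scoring above a chosen threshold could be flagged for confirmation by subject matter experts rather than reviewed exhaustively by hand. The embedding model and threshold here are assumptions for illustration, not details from the paper.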
Kim, Stella Y. – Educational Measurement: Issues and Practice, 2022
In this digital ITEMS module, Dr. Stella Kim provides an overview of multidimensional item response theory (MIRT) equating. Traditional unidimensional item response theory (IRT) equating methods impose the sometimes untenable restriction that only a single ability is assessed. This module discusses potential sources of multidimensionality…
Descriptors: Item Response Theory, Models, Equated Scores, Evaluation Methods
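For context (not quoted from the module): a common compensatory MIRT model, the multidimensional 2PL, replaces the single ability with a vector of abilities, giving the item response function

\[ P(X_{ij} = 1 \mid \boldsymbol{\theta}_i) = \frac{\exp(\mathbf{a}_j^{\top} \boldsymbol{\theta}_i + d_j)}{1 + \exp(\mathbf{a}_j^{\top} \boldsymbol{\theta}_i + d_j)}, \]

where a_j is a vector of discrimination parameters and d_j an intercept. Equating under such a model must align the whole multidimensional ability scale across test forms, not just a single trait.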
Ji, Xuejun Ryan; Wu, Amery D. – Educational Measurement: Issues and Practice, 2023
Measurement specialists have demonstrated that the Cross-Classified Mixed Effects Model (CCMEM) is a flexible framework for evaluating reliability. Reliability can be estimated from the variance components of the test scores. Building on that work, this study extends the CCMEM to the evaluation of validity evidence.…
Descriptors: Measurement, Validity, Reliability, Models
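One common cross-classified formulation (a sketch, not necessarily the authors' exact specification) decomposes the score of person p on item i as

\[ Y_{pi} = \mu + u_p + v_i + e_{pi}, \qquad u_p \sim N(0, \sigma_u^2), \; v_i \sim N(0, \sigma_v^2), \; e_{pi} \sim N(0, \sigma_e^2), \]

from which a generalizability-style reliability coefficient for a k-item test can be formed, for example \( \sigma_u^2 / \bigl( \sigma_u^2 + (\sigma_v^2 + \sigma_e^2)/k \bigr) \) for absolute decisions.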
Ma, Wenchao; de la Torre, Jimmy – Educational Measurement: Issues and Practice, 2019
In this ITEMS module, we introduce the generalized deterministic inputs, noisy "and" gate (G-DINA) model, which is a general framework for specifying, estimating, and evaluating a wide variety of cognitive diagnosis models. The module contains a nontechnical introduction to diagnostic measurement, an introductory overview of the G-DINA…
Descriptors: Models, Classification, Measurement, Identification
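As commonly written with the identity link, the G-DINA model expresses the success probability for the reduced attribute vector (the K*_j attributes item j requires) as

\[ P(\boldsymbol{\alpha}^{*}_{lj}) = \delta_{j0} + \sum_{k=1}^{K^{*}_{j}} \delta_{jk} \alpha_{lk} + \sum_{k=1}^{K^{*}_{j}-1} \sum_{k' = k+1}^{K^{*}_{j}} \delta_{jkk'} \alpha_{lk} \alpha_{lk'} + \cdots + \delta_{j12\cdots K^{*}_{j}} \prod_{k=1}^{K^{*}_{j}} \alpha_{lk}, \]

where δ_j0 is an intercept, the δ_jk are attribute main effects, and the remaining δ terms are interactions; constraining these parameters recovers DINA, DINO, and other familiar diagnostic models as special cases.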
Liu, Ou Lydia – Educational Measurement: Issues and Practice, 2017
Student learning outcomes assessment has been increasingly used in U.S. higher education institutions over the last 10 years, partly fueled by the recommendation from the Spellings Commission that institutions need to demonstrate more direct evidence of student learning. To respond to the Commission's call, various accountability initiatives have…
Descriptors: College Outcomes Assessment, Accountability, Higher Education, Educational Improvement
Cho, Sun-Joo; Suh, Youngsuk; Lee, Woo-yeol – Educational Measurement: Issues and Practice, 2016
The purpose of this ITEMS module is to provide an introduction to differential item functioning (DIF) analysis using mixture item response models. The mixture item response models for DIF analysis involve comparing item profiles across latent groups, instead of manifest groups. First, an overview of DIF analysis based on latent groups, called…
Descriptors: Test Bias, Research Methodology, Evaluation Methods, Models
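As a sketch of the core idea (the module's exact models are not shown in the snippet), a mixture Rasch model lets item difficulty vary across latent classes g = 1, …, G:

\[ P(X_{ij} = 1 \mid \theta_i, g) = \frac{\exp(\theta_i - b_{jg})}{1 + \exp(\theta_i - b_{jg})}. \]

DIF for item j then corresponds to b_{jg} differing across latent classes, which are estimated from the response patterns themselves rather than fixed in advance by manifest characteristics such as gender.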
Wind, Stefanie A. – Educational Measurement: Issues and Practice, 2017
Mokken scale analysis (MSA) is a probabilistic-nonparametric approach to item response theory (IRT) that can be used to evaluate fundamental measurement properties with less strict assumptions than parametric IRT models. This instructional module provides an introduction to MSA as a probabilistic-nonparametric framework in which to explore…
Descriptors: Probability, Nonparametric Statistics, Item Response Theory, Scaling
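A central MSA statistic (stated here for context, not quoted from the module) is Loevinger's scalability coefficient for an item pair,

\[ H_{ij} = 1 - \frac{F_{ij}}{E_{ij}}, \]

where F_{ij} is the observed number of Guttman errors for items i and j and E_{ij} is the number expected under marginal independence; a common rule of thumb requires item and scale coefficients of at least 0.3 for a usable Mokken scale.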
Ames, Allison J.; Penfield, Randall D. – Educational Measurement: Issues and Practice, 2015
Drawing valid inferences from item response theory (IRT) models is contingent upon a good fit of the data to the model. Violations of model-data fit have numerous consequences, limiting the usefulness and applicability of the model. This instructional module provides an overview of methods used for evaluating the fit of IRT models. Upon completing…
Descriptors: Item Response Theory, Goodness of Fit, Models, Evaluation Methods
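One standard item-fit statistic of the kind such overviews cover (a sketch, not necessarily the authors' chosen method) is Yen's Q1, which compares observed and model-implied proportions correct across ability-ordered groups:

\[ Q_{1j} = \sum_{g=1}^{10} \frac{N_g \, (O_{gj} - E_{gj})^2}{E_{gj} (1 - E_{gj})}, \]

where O_{gj} and E_{gj} are the observed and expected proportions correct on item j in group g of size N_g; large values relative to a chi-square reference flag misfitting items.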

Benson, Jeri – Educational Measurement: Issues and Practice, 1998
Explores how a program of strong construct validation could be applied to the assessment of the construct of test anxiety, paying special attention to substantive, structural, and external aspects of construct validation. A framework is proposed to pull together various statistical methods used in construct validation research into an organized…
Descriptors: Construct Validity, Evaluation Methods, Models, Program Development

Madaus, George F. – Educational Measurement: Issues and Practice, 1992
The need for an independent mechanism that regulates, or audits, the testing enterprise is discussed along with a critique of current mechanisms for challenging a high-stakes test or its use and the need for independent auditing of the commercial test industry. Models for an auditing mechanism are reviewed. (SLD)
Descriptors: Accountability, Elementary Secondary Education, Evaluation Methods, Higher Education
Gierl, Mark J. – Educational Measurement: Issues and Practice, 2005
In this paper I describe and illustrate the Roussos-Stout (1996) multidimensionality-based DIF analysis paradigm, with emphasis on its implication for the selection of a matching and studied subtest for DIF analyses. Standard DIF practice encourages an exploratory search for matching subtest items based on purely statistical criteria, such as a…
Descriptors: Models, Test Items, Test Bias, Statistical Analysis
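In the multidimensionality-based framing (a compact restatement, not a quote), an item may measure the intended ability θ plus a nuisance dimension η; DIF then appears in the marginal comparison

\[ P(X_j = 1 \mid \theta, G = R) \ne P(X_j = 1 \mid \theta, G = F) \]

whenever the reference and focal groups differ on η given θ, even though the item's response function given both dimensions is group-invariant. This is why the choice of matching subtest, which fixes what θ is taken to be, matters so much.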

Nitko, Anthony J. – Educational Measurement: Issues and Practice, 1995
If curriculum is to be the basis for assessment reform, assessment specialists must model the process for producing valid assessment products. Validity criteria should guide any model for the assessment development process. However, curriculum-based assessment systems should not be confused with standards-driven assessment systems. (SLD)
Descriptors: Criteria, Curriculum Based Assessment, Educational Change, Evaluation Methods

Sugrue, Brenda – Educational Measurement: Issues and Practice, 1995
The author suggests a more fragmented approach to the assessment of global ability concepts than is generally advocated, based on the assumption that decomposing a complex ability into cognitive components and tracking performance across multiple measures will yield valid and instructionally useful information. Specifications are suggested for designing…
Descriptors: Ability, Educational Assessment, Educational Theories, Evaluation Methods

Goldstein, Harvey – Educational Measurement: Issues and Practice, 1994
This article examines how psychometric models based on certain assumptions have come to be used counterproductively by many practitioners in ways that limit the kinds of conclusions that can be made. The general problem of the context's influence on performance is discussed, and some implications are drawn. (SLD)
Descriptors: Context Effect, Educational Research, Evaluation Methods, Measurement Techniques