ERIC - Search Results

Showing 1 to 15 of 37 results

Item Response Theory and Modeling with Stata

Peer reviewed

Raykov, Tenko – Measurement: Interdisciplinary Research and Perspectives, 2023

This software review discusses the capabilities of Stata to conduct item response theory modeling. The commands needed for fitting the popular one-, two-, and three-parameter logistic models are initially discussed. The procedure for testing the discrimination parameter equality in the one-parameter model is then outlined. The commands for fitting…

Descriptors: Item Response Theory, Models, Comparative Analysis, Item Analysis

Validation and Implementation of Customer Classification System Using Machine Learning

Peer reviewed

Hyemin Yoon; HyunJin Kim; Sangjin Kim – Measurement: Interdisciplinary Research and Perspectives, 2024

We have maintained the customer grade system that is being implemented to customers with excellent performance through customer segmentation for years. Currently, financial institutions that operate the customer grade system provide similar services based on the score calculation criteria, but the score calculation criteria vary from the financial…

Descriptors: Classification, Artificial Intelligence, Prediction, Decision Making

Performance of Nonparametric Person-Fit Statistics with Unfolding versus Dominance Response Models

Peer reviewed

Reimers, Jennifer; Turner, Ronna C.; Tendeiro, Jorge N.; Lo, Wen-Juo; Keiffer, Elizabeth – Measurement: Interdisciplinary Research and Perspectives, 2023

Person-fit analyses are commonly used to detect aberrant responding in self-report data. Nonparametric person fit statistics do not require fitting a parametric test theory model and have performed well compared to other person-fit statistics. However, detection of aberrant responding has primarily focused on dominance response data, thus the…

Descriptors: Goodness of Fit, Nonparametric Statistics, Error of Measurement, Comparative Analysis

Rater Connections and the Detection of Bias in Performance Assessment

Peer reviewed

Wind, Stefanie A. – Measurement: Interdisciplinary Research and Perspectives, 2022

In many performance assessments, one or two raters from the complete rater pool score each performance, resulting in a sparse rating design, where there are limited observations of each rater relative to the complete sample of students. Although sparse rating designs can be constructed to facilitate estimation of student achievement, the…

Descriptors: Evaluators, Bias, Identification, Performance Based Assessment

The Comparison of Estimation Methods for the Four-Parameter Logistic Item Response Theory Model

Peer reviewed

Kalkan, Ömür Kaya – Measurement: Interdisciplinary Research and Perspectives, 2022

The four-parameter logistic (4PL) Item Response Theory (IRT) model has recently been reconsidered in the literature due to the advances in the statistical modeling software and the recent developments in the estimation of the 4PL IRT model parameters. The current simulation study evaluated the performance of expectation-maximization (EM),…

Descriptors: Comparative Analysis, Sample Size, Test Length, Algorithms

A Tree-Based Approach to Identifying Response Styles with Anchoring Vignettes

Peer reviewed

Leventhal, Brian C.; Zigler, Christina K. – Measurement: Interdisciplinary Research and Perspectives, 2023

Survey score interpretations are often plagued by sources of construct-irrelevant variation, such as response styles. In this study, we propose the use of an IRTree Model to account for response styles by making use of self-report items and anchoring vignettes. Specifically, we investigate how the IRTree approach with anchoring vignettes compares…

Descriptors: Scores, Vignettes, Response Style (Tests), Item Response Theory

Comparison of R Packages for Automated Test Assembly with Mixed-Integer Linear Programming

Peer reviewed

Peabody, Michael R. – Measurement: Interdisciplinary Research and Perspectives, 2023

Many organizations utilize some form of automation in the test assembly process; either fully algorithmic or heuristically constructed. However, one issue with heuristic models is that when the test assembly problem changes the entire model may need to be re-conceptualized and recoded. In contrast, mixed-integer programming (MIP) is a mathematical…

Descriptors: Programming Languages, Algorithms, Heuristics, Mathematical Models

Closing "Reporting" Gaps: A Comparison of Methods for Estimating Unreported Subgroup Achievement on NAEP

Peer reviewed

David Bamat – Measurement: Interdisciplinary Research and Perspectives, 2024

The National Assessment of Educational Progress (NAEP) program only reports state-level subgroup results if it samples at least 62 students identifying with the subgroup. Since some subgroups constitute small proportions of many states' general student populations, these minority subgroups are seldom sufficiently sampled to meet this sample size…

Descriptors: Reading Achievement, Achievement Gap, Prediction, National Competency Tests

Person-Fit Assessment under the D-Scoring Method

Peer reviewed

Dimitrov, Dimiter M.; Atanasov, Dimitar V.; Luo, Yong – Measurement: Interdisciplinary Research and Perspectives, 2020

This study examines and compares four person-fit statistics (PFSs) in the framework of the "D"-scoring method (DSM): (a) van der Flier's "U3" statistic; (b) "Ud" statistic, as a modification of "U3" under the DSM; (c) "Zd" statistic, as a modification of the "Z3 (l[subscript z])"…

Descriptors: Goodness of Fit, Item Analysis, Item Response Theory, Scoring

Diagnosing Diagnostic Models: From Von Neumann's Elephant to Model Equivalencies and Network Psychometrics

Peer reviewed

von Davier, Matthias – Measurement: Interdisciplinary Research and Perspectives, 2018

This article critically reviews how diagnostic models have been conceptualized and how they compare to other approaches used in educational measurement. In particular, certain assumptions that have been taken for granted and used as defining characteristics of diagnostic models are reviewed and it is questioned whether these assumptions are the…

Descriptors: Criticism, Psychometrics, Diagnostic Tests, Educational Assessment

GDINA and CDM Packages in R

Peer reviewed

Rupp, André A.; van Rijn, Peter W. – Measurement: Interdisciplinary Research and Perspectives, 2018

We review the GDINA and CDM packages in R for fitting cognitive diagnosis/diagnostic classification models. We first provide a summary of their core capabilities and then use both simulated and real data to compare their functionalities in practice. We found that the most relevant routines in the two packages appear to be more similar than…

Descriptors: Educational Assessment, Cognitive Measurement, Measurement, Computer Software

Linking Large-Scale Reading Assessments: Measuring International Trends over 40 Years

Peer reviewed

Strietholt, Rolf; Rosén, Monica – Measurement: Interdisciplinary Research and Perspectives, 2016

Since the start of the new millennium, international comparative large-scale studies have become one of the most well-known areas in the field of education. However, the International Association for the Evaluation of Educational Achievement (IEA) has already been conducting international comparative studies for about half a century. The present…

Descriptors: Reading Tests, Comparative Analysis, Comparative Education, Trend Analysis

Why Should We Assess the Goodness-of-Fit of IRT Models?

Peer reviewed

Maydeu-Olivares, Alberto – Measurement: Interdisciplinary Research and Perspectives, 2013

In this rejoinder, Maydeu-Olivares states that, in item response theory (IRT) measurement applications, the application of goodness-of-fit (GOF) methods informs researchers of the discrepancy between the model and the data being fitted (the room for improvement). By routinely reporting the GOF of IRT models, together with the substantive results…

Descriptors: Goodness of Fit, Models, Evaluation Methods, Item Response Theory

Old Issues in a New Jacket: Power and Validation in the Context of Mixture Modeling

Peer reviewed

Lubke, Gitta – Measurement: Interdisciplinary Research and Perspectives, 2012

Von Davier et al. (this issue) describe two analyses that aim at determining whether the constructs measured with a number of observed items are categorical or continuous in nature. The issue of types versus traits has a long history and is relevant in many areas of behavioral research, including personality research, as emphasized by von Davier…

Descriptors: Models, Classification, Multivariate Analysis, Statistical Analysis

Thinking about Linking

Peer reviewed

Newton, Paul – Measurement: Interdisciplinary Research and Perspectives, 2010

Despite over a century of aligning test and examination scales, the theory of linking has received relatively little attention. Recently, though, frameworks for classifying linking relationships have proliferated, both in England and the United States. Limitations of U.S. frameworks, particularly the idea that linking relationships ought to be…

Descriptors: Foreign Countries, Models, Comparative Analysis, Evaluation
