ERIC - Search Results

Showing 1 to 15 of 32 results

Measuring the Uncertainty of Imputed Scores

Peer reviewed

Sinharay, Sandip – Journal of Educational Measurement, 2023

Technical difficulties and other unforeseen events occasionally lead to incomplete data on educational tests, which necessitates the reporting of imputed scores to some examinees. While there exist several approaches for reporting imputed scores, there is a lack of any guidance on the reporting of the uncertainty of imputed scores. In this paper,…

Descriptors: Evaluation Methods, Scores, Standardized Tests, Simulation

Studying Score Stability with a Harmonic Regression Family: A Comparison of Three Approaches to Adjustment of Examinee-Specific Demographic Data

Peer reviewed

Lee, Yi-Hsuan; Haberman, Shelby J. – Journal of Educational Measurement, 2021

For assessments that use different forms in different administrations, equating methods are applied to ensure comparability of scores over time. Ideally, a score scale is well maintained throughout the life of a testing program. In reality, instability of a score scale can result from a variety of causes, some of which are expected while others may be…

Descriptors: Scores, Regression (Statistics), Demography, Data

Detecting Differential Item Functioning Using Posterior Predictive Model Checking: A Comparison of Discrepancy Statistics

Peer reviewed

Joo, Seang-Hwane; Lee, Philseok – Journal of Educational Measurement, 2022

This study proposes a new Bayesian differential item functioning (DIF) detection method using posterior predictive model checking (PPMC). Item fit measures including infit, outfit, observed score distribution (OSD), and Q1 were considered as discrepancy statistics for the PPMC DIF methods. The performance of the PPMC DIF method was…

Descriptors: Test Items, Bayesian Statistics, Monte Carlo Methods, Prediction

Two IRT Fixed Parameter Calibration Methods for the Bifactor Model

Peer reviewed

Kim, Kyung Yong – Journal of Educational Measurement, 2020

New items are often evaluated prior to their operational use to obtain item response theory (IRT) item parameter estimates for quality control purposes. Fixed parameter calibration is one linking method that is widely used to estimate parameters for new items and place them on the desired scale. This article provides detailed descriptions of two…

Descriptors: Item Response Theory, Evaluation Methods, Test Items, Simulation

Validity Arguments Meet Artificial Intelligence in Innovative Educational Assessment

Peer reviewed

Dorsey, David W.; Michaels, Hillary R. – Journal of Educational Measurement, 2022

We have dramatically advanced our ability to create rich, complex, and effective assessments across a range of uses through technology advancement. Artificial Intelligence (AI) enabled assessments represent one such area of advancement--one that has captured our collective interest and imagination. Scientists and practitioners within the domains…

Descriptors: Validity, Ethics, Artificial Intelligence, Evaluation Methods

Gathering and Evaluating Validity Evidence: The Generalized Assessment Alignment Tool

Peer reviewed

Cizek, Gregory J.; Kosh, Audra E.; Toutkoushian, Emily K. – Journal of Educational Measurement, 2018

Alignment is an essential piece of validity evidence for both educational (K-12) and credentialing (licensure and certification) assessments. In this article, a comprehensive review of commonly used contemporary alignment procedures is provided; some key weaknesses in current alignment approaches are identified; principles for evaluating alignment…

Descriptors: Test Validity, Evidence, Evaluation Methods, Alignment (Education)

Students' Interpretation of Formative Assessment Feedback: Three Claims for Why We Know so Little about Something so Important

Peer reviewed

Leighton, Jacqueline P. – Journal of Educational Measurement, 2019

If K-12 students are to be fully integrated as active participants in their own learning, understanding how they interpret formative assessment feedback is needed. The objective of this article is to advance three claims about why teachers and assessment scholars/specialists may have little understanding of students' interpretation of formative…

Descriptors: Elementary Secondary Education, Formative Evaluation, Feedback (Response), Student Attitudes

Detection of Differential Item Functioning with Nonlinear Regression: A Non-IRT Approach Accounting for Guessing

Peer reviewed

Drabinová, Adéla; Martinková, Patrícia – Journal of Educational Measurement, 2017

In this article we present a general approach not relying on item response theory models (non-IRT) to detect differential item functioning (DIF) in dichotomous items with presence of guessing. The proposed nonlinear regression (NLR) procedure for DIF detection is an extension of method based on logistic regression. As a non-IRT approach, NLR can…

Descriptors: Test Items, Regression (Statistics), Guessing (Tests), Identification

Local Observed-Score Kernel Equating

Peer reviewed

Wiberg, Marie; van der Linden, Wim J.; von Davier, Alina A. – Journal of Educational Measurement, 2014

Three local observed-score kernel equating methods that integrate methods from the local equating and kernel equating frameworks are proposed. The new methods were compared with their earlier counterparts with respect to such measures as bias--as defined by Lord's criterion of equity--and percent relative error. The local kernel item response…

Descriptors: Measurement Techniques, Evaluation Methods, Item Response Theory, Equated Scores

A Nonparametric Approach to Estimate Classification Accuracy and Consistency

Peer reviewed

Lathrop, Quinn N.; Cheng, Ying – Journal of Educational Measurement, 2014

When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…

Descriptors: Cutting Scores, Classification, Computation, Nonparametric Statistics

Distinguishing between Net and Global DIF in Polytomous Items

Peer reviewed

Penfield, Randall D. – Journal of Educational Measurement, 2010

In this article, I address two competing conceptions of differential item functioning (DIF) in polytomously scored items. The first conception, referred to as net DIF, concerns between-group differences in the conditional expected value of the polytomous response variable. The second conception, referred to as global DIF, concerns the conditional…

Descriptors: Test Bias, Test Items, Evaluation Methods, Item Response Theory

Impact of Diagnosticity on the Adequacy of Models for Cognitive Diagnosis under a Linear Attribute Structure: A Simulation Study

Peer reviewed

de La Torre, Jimmy; Karelitz, Tzur M. – Journal of Educational Measurement, 2009

Compared to unidimensional item response models (IRMs), cognitive diagnostic models (CDMs) based on latent classes represent examinees' knowledge and item requirements using discrete structures. This study systematically examines the viability of retrofitting CDMs to IRM-based data with a linear attribute structure. The study utilizes a procedure…

Descriptors: Simulation, Item Response Theory, Psychometrics, Evaluation Methods

The Hierarchy Consistency Index: Evaluating Person Fit for Cognitive Diagnostic Assessment

Peer reviewed

Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009

In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…

Descriptors: Test Length, Simulation, Correlation, Research Methodology

A Comparative Study of IRT Fixed Parameter Calibration Methods

Peer reviewed

Kim, Seonghoon – Journal of Educational Measurement, 2006

This article provides technical descriptions of five fixed parameter calibration (FPC) methods, which were based on marginal maximum likelihood estimation via the EM algorithm, and evaluates them through simulation. The five FPC methods described are distinguished from each other by how many times they update the prior ability distribution and by…

Descriptors: Comparative Analysis, Item Response Theory, Evaluation Methods, Computation

Population Invariance in Equating and Linking: Concept and History

Peer reviewed

Kolen, Michael J. – Journal of Educational Measurement, 2004

The concept of invariance in equating and linking is traced from the 1950s to the present. A number of research studies that examined population invariance are reviewed. Theory and research suggest that linkings other than equatings are population dependent. Theory also indicates that equatings are population dependent, although when test forms…

Descriptors: Equated Scores, Evaluation Methods, Statistical Analysis
