ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	7
Since 2016 (last 10 years)	14
Since 2006 (last 20 years)	25

Descriptor

Statistical Analysis	26
Test Items	26
Models	9
Item Response Theory	7
Test Bias	7
Goodness of Fit	6
Bayesian Statistics	5
Computation	5
Cheating	4
Foreign Countries	4
Probability	4
Reaction Time	4
Scores	4
Simulation	4
Accuracy	3
Comparative Analysis	3
Equated Scores	3
Identification	3
Maximum Likelihood Statistics	3
Regression (Statistics)	3
Responses	3
Achievement Tests	2
College Entrance Examinations	2
Computer Assisted Testing	2
Difficulty Level	2
More ▼

Source

Journal of Educational and…

Publication Type

Journal Articles	26
Reports - Research	17
Reports - Descriptive	5
Reports - Evaluative	3
Opinion Papers	1

Education Level

Higher Education	5
Postsecondary Education	4
Secondary Education	3
Middle Schools	2
Elementary Education	1
Junior High Schools	1

Audience

Location

Netherlands (Amsterdam)	1
Sweden	1

Laws, Policies, & Programs

Assessments and Surveys

Program for International…	2
National Assessment of…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 26 results Save | Export

Optimizing Diagnostic Classification Models Application Considering Real-Life Constraints

Peer reviewed

Direct link

Su, Kun; Henson, Robert A. – Journal of Educational and Behavioral Statistics, 2023

This article provides a process to carefully evaluate the suitability of a content domain for which diagnostic classification models (DCMs) could be applicable and then optimized steps for constructing a test blueprint for applying DCMs and a real-life example illustrating this process. The content domains were carefully evaluated using a set of…

Descriptors: Classification, Models, Science Tests, Physics

On the Generalized S-X[superscript 2]-Test of Item Fit: Some Variants, Residuals, and a Graphical Visualization

Peer reviewed

Direct link

Ranger, Jochen; Brauer, Kay – Journal of Educational and Behavioral Statistics, 2022

The generalized S-X[superscript 2]-test is a test of item fit for items with polytomous responses format. The test is based on a comparison of the observed and expected number of responses in strata defined by the test score. In this article, we make four contributions. We demonstrate that the performance of the generalized S-X[superscript 2]-test…

Descriptors: Goodness of Fit, Test Items, Statistical Analysis, Item Response Theory

Testing Differential Item Functioning without Predefined Anchor Items Using Robust Regression

Peer reviewed

Direct link

Wang, Weimeng; Liu, Yang; Liu, Hongyun – Journal of Educational and Behavioral Statistics, 2022

Differential item functioning (DIF) occurs when the probability of endorsing an item differs across groups for individuals with the same latent trait level. The presence of DIF items may jeopardize the validity of an instrument; therefore, it is crucial to identify DIF items in routine operations of educational assessment. While DIF detection…

Descriptors: Test Bias, Test Items, Equated Scores, Regression (Statistics)

Detecting Item Preknowledge Using Revisits with Speed and Accuracy

Peer reviewed

Direct link

Demirkaya, Onur; Bezirhan, Ummugul; Zhang, Jinming – Journal of Educational and Behavioral Statistics, 2023

Examinees with item preknowledge tend to obtain inflated test scores that undermine test score validity. With the availability of process data collected in computer-based assessments, the research on detecting item preknowledge has progressed on using both item scores and response times. Item revisit patterns of examinees can also be utilized as…

Descriptors: Test Items, Prior Learning, Knowledge Level, Reaction Time

Finding the Right Grain-Size for Measurement in the Classroom

Peer reviewed

Direct link

Mark Wilson – Journal of Educational and Behavioral Statistics, 2024

This article introduces a new framework for articulating how educational assessments can be related to teacher uses in the classroom. It articulates three levels of assessment: macro (use of standardized tests), meso (externally developed items), and micro (on-the-fly in the classroom). The first level is the usual context for educational…

Descriptors: Educational Assessment, Measurement, Standardized Tests, Test Items

Mean Comparisons of Many Groups in the Presence of DIF: An Evaluation of Linking and Concurrent Scaling Approaches

Peer reviewed

Direct link

Robitzsch, Alexander; Lüdtke, Oliver – Journal of Educational and Behavioral Statistics, 2022

One of the primary goals of international large-scale assessments in education is the comparison of country means in student achievement. This article introduces a framework for discussing differential item functioning (DIF) for such mean comparisons. We compare three different linking methods: concurrent scaling based on full invariance,…

Descriptors: Test Bias, International Assessment, Scaling, Comparative Analysis

Detecting Compromised Items Using Information from Secure Items

Peer reviewed

Direct link

Wang, Xi; Liu, Yang – Journal of Educational and Behavioral Statistics, 2020

In continuous testing programs, some items are repeatedly used across test administrations, and statistical methods are often used to evaluate whether items become compromised due to examinees' preknowledge. In this study, we proposed a residual method to detect compromised items when a test can be partitioned into two subsets of items: secure…

Descriptors: Test Items, Information Security, Error of Measurement, Cheating

Testing the Within-State Distribution in Mixture Models for Responses and Response Times

Peer reviewed

Direct link

Kuijpers, Renske E.; Visser, Ingmar; Molenaar, Dylan – Journal of Educational and Behavioral Statistics, 2021

Mixture models have been developed to enable detection of within-subject differences in responses and response times to psychometric test items. To enable mixture modeling of both responses and response times, a distributional assumption is needed for the within-state response time distribution. Since violations of the assumed response time…

Descriptors: Test Items, Responses, Reaction Time, Models

Kernel Equating Using Propensity Scores for Nonequivalent Groups

Peer reviewed

Direct link

Wallin, Gabriel; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2019

When equating two test forms, the equated scores will be biased if the test groups differ in ability. To adjust for the ability imbalance between nonequivalent groups, a set of common items is often used. When no common items are available, it has been suggested to use covariates correlated with the test scores instead. In this article, we reduce…

Descriptors: Equated Scores, Test Items, Probability, College Entrance Examinations

Research on Psychometric Modeling, Analysis, and Reporting of the National Assessment of Educational Progress

Peer reviewed
PDF on ERIC

Download full text

Direct link

Oranje, Andreas; Kolstad, Andrew – Journal of Educational and Behavioral Statistics, 2019

The design and psychometric methodology of the National Assessment of Educational Progress (NAEP) is constantly evolving to meet the changing interests and demands stemming from a rapidly shifting educational landscape. NAEP has been built on strong research foundations that include conducting extensive evaluations and comparisons before new…

Descriptors: National Competency Tests, Psychometrics, Statistical Analysis, Computation

Detection of Item Preknowledge Using Likelihood Ratio Test and Score Test

Peer reviewed

Direct link

Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2017

An increasing concern of producers of educational assessments is fraudulent behavior during the assessment (van der Linden, 2009). Benefiting from item preknowledge (e.g., Eckerly, 2017; McLeod, Lewis, & Thissen, 2003) is one type of fraudulent behavior. This article suggests two new test statistics for detecting individuals who may have…

Descriptors: Test Items, Cheating, Testing Problems, Identification

Normal Theory Two-Stage ML Estimator When Data Are Missing at the Item Level

Peer reviewed

Direct link

Savalei, Victoria; Rhemtulla, Mijke – Journal of Educational and Behavioral Statistics, 2017

In many modeling contexts, the variables in the model are linear composites of the raw items measured for each participant; for instance, regression and path analysis models rely on scale scores, and structural equation models often use parcels as indicators of latent constructs. Currently, no analytic estimation method exists to appropriately…

Descriptors: Computation, Statistical Analysis, Test Items, Maximum Likelihood Statistics

Detection of Uniform and Nonuniform Differential Item Functioning by Item-Focused Trees

Peer reviewed

Direct link

Berger, Moritz; Tutz, Gerhard – Journal of Educational and Behavioral Statistics, 2016

Detection of differential item functioning (DIF) by use of the logistic modeling approach has a long tradition. One big advantage of the approach is that it can be used to investigate nonuniform (NUDIF) as well as uniform DIF (UDIF). The classical approach allows one to detect DIF by distinguishing between multiple groups. We propose an…

Descriptors: Test Bias, Regression (Statistics), Nonparametric Statistics, Statistical Analysis

Detection of Differential Item Functioning Using the Lasso Approach

Peer reviewed

Direct link

Magis, David; Tuerlinckx, Francis; De Boeck, Paul – Journal of Educational and Behavioral Statistics, 2015

This article proposes a novel approach to detect differential item functioning (DIF) among dichotomously scored items. Unlike standard DIF methods that perform an item-by-item analysis, we propose the "LR lasso DIF method": logistic regression (LR) model is formulated for all item responses. The model contains item-specific intercepts,…

Descriptors: Test Bias, Test Items, Regression (Statistics), Scores

Item Response Data Analysis Using Stata Item Response Theory Package

Peer reviewed

Direct link

Yang, Ji Seung; Zheng, Xiaying – Journal of Educational and Behavioral Statistics, 2018

The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from Programme of International Student Assessment, we review the IRT package from…

Descriptors: Item Response Theory, Item Analysis, Computer Software, Statistical Analysis

Previous Page | Next Page »

Pages: 1 | 2

Liu, Yang	2
Ranger, Jochen	2
Sinharay, Sandip	2
Wainer, Howard	2
Wang, Xiaohui	2
Wiberg, Marie	2
van der Linden, Wim J.	2
Andrich, David	1
Berger, Moritz	1
Bezirhan, Ummugul	1
Bradlow, Eric	1
Bradlow, Eric T.	1
Brauer, Kay	1
Chang, Hua-Hua	1
De Boeck, Paul	1
Demirkaya, Onur	1
Douglas, Jeffrey A.	1
Fan, Zhewen	1
Hagquist, Curt	1
Henson, Robert A.	1
Isham, Steven	1
Jeon, Minjeong	1
Kolstad, Andrew	1
Kuhn, Jörg-Tobias	1
Kuijpers, Renske E.	1
More ▼