Showing 1 to 15 of 26 results
Peer reviewed
Karl Schweizer; Andreas Gold; Dorothea Krampen; Stefan Troche – Educational and Psychological Measurement, 2024
Conceptualizing two-variable disturbances preventing good model fit in confirmatory factor analysis as item-level method effects instead of correlated residuals avoids violating the principle that residual variation is unique for each item. The possibility of representing such a disturbance by a method factor of a bifactor measurement model was…
Descriptors: Correlation, Factor Analysis, Measurement Techniques, Item Analysis
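A minimal sketch of the equivalence described above (notation mine, not necessarily the authors'): for two items i and j whose shared disturbance would otherwise be modeled as a correlated residual, a bifactor specification adds a method factor m alongside the general factor g:

x_i = \lambda_i g + \gamma_i m + \epsilon_i, \qquad x_j = \lambda_j g + \gamma_j m + \epsilon_j, \qquad \mathrm{Cov}(\epsilon_i, \epsilon_j) = 0.

The method factor implies the extra covariance \gamma_i \gamma_j \mathrm{Var}(m) between the two items, reproducing what a correlated residual would capture while each \epsilon remains unique to its item.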
Peer reviewed
Zsuzsa Bakk – Structural Equation Modeling: A Multidisciplinary Journal, 2024
A standard assumption of latent class (LC) analysis is conditional independence: the items of the LC model are independent of the covariates given the LCs. Several approaches have been proposed for identifying violations of this assumption. The recently proposed likelihood ratio approach is compared to residual statistics (bivariate residuals…
Descriptors: Goodness of Fit, Error of Measurement, Comparative Analysis, Models
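For reference, the standard LC model with covariates can be written (a generic sketch, not tied to this article's notation) as

P(\mathbf{Y} = \mathbf{y} \mid \mathbf{Z} = \mathbf{z}) = \sum_{c=1}^{C} P(X = c \mid \mathbf{Z} = \mathbf{z}) \prod_{j} P(Y_j = y_j \mid X = c),

where conditional independence means the covariates \mathbf{Z} enter only through class membership X; the assumption is violated for any item whose conditional distribution P(Y_j \mid X = c) still depends on \mathbf{Z}.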
Peer reviewed
Wind, Stefanie A.; Ge, Yuan – Measurement: Interdisciplinary Research and Perspectives, 2023
In selected-response assessments such as attitude surveys with Likert-type rating scales, examinees often select from rating scale categories to reflect their locations on a construct. Researchers have observed that some examinees exhibit "response styles," which are systematic patterns of responses in which examinees are more likely to…
Descriptors: Goodness of Fit, Responses, Likert Scales, Models
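As a rough illustration of what a response style looks like in data (a hypothetical Python simulation, not this article's method; the category probabilities are assumptions chosen for illustration), compare a baseline respondent with an extreme-response-style respondent on a 5-point scale:

import numpy as np

rng = np.random.default_rng(7)

# Assumed baseline category probabilities for a 5-point Likert item
base = np.array([0.10, 0.20, 0.40, 0.20, 0.10])
# An "extreme response style" respondent shifts mass to categories 1 and 5
ers = np.array([0.30, 0.10, 0.20, 0.10, 0.30])

n = 1000
baseline_resp = rng.choice(5, size=n, p=base) + 1
ers_resp = rng.choice(5, size=n, p=ers) + 1

# The marginal category frequencies reveal the systematic pattern
print(np.bincount(baseline_resp, minlength=6)[1:] / n)
print(np.bincount(ers_resp, minlength=6)[1:] / n)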
Peer reviewed
Gorney, Kylie; Wollack, James A.; Sinharay, Sandip; Eckerly, Carol – Journal of Educational and Behavioral Statistics, 2023
Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item…
Descriptors: Scores, Test Validity, Test Items, Prior Learning
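The generic logic behind such detection procedures can be sketched as follows (a toy screen in Python under assumed inputs, not the authors' statistic): an examinee with preknowledge should outperform their own secure-item baseline on compromised items.

import numpy as np

def preknowledge_flags(responses, compromised, z_crit=2.33):
    # Toy screen, not the authors' procedure: flag examinees whose
    # proportion correct on compromised items exceeds their proportion
    # correct on secure items by more than z_crit standard errors.
    # responses: (n_examinees, n_items) 0/1 matrix
    # compromised: boolean mask over items
    responses = np.asarray(responses, dtype=float)
    ci = responses[:, compromised]
    secure = responses[:, ~compromised]
    p_ci, p_sec = ci.mean(axis=1), secure.mean(axis=1)
    se = np.sqrt(p_ci * (1 - p_ci) / ci.shape[1]
                 + p_sec * (1 - p_sec) / secure.shape[1])
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (p_ci - p_sec) / se
    return np.nan_to_num(z) > z_crit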
Peer reviewed
E. Damiano D'Urso; Jesper Tijmstra; Jeroen K. Vermunt; Kim De Roover – Structural Equation Modeling: A Multidisciplinary Journal, 2024
Measurement invariance (MI) is required for validly comparing latent constructs measured by multiple ordinal self-report items. Non-invariances may occur when disregarding (group differences in) an acquiescence response style (ARS; an agreeing tendency regardless of item content). If non-invariance results solely from neglecting ARS, one should…
Descriptors: Error of Measurement, Structural Equation Models, Construct Validity, Measurement Techniques
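A common way to formalize ARS (a sketch under standard conventions, not necessarily this article's parameterization) is an extra factor loading equally on all items:

\mathbf{y} = \boldsymbol{\nu} + \boldsymbol{\Lambda}\boldsymbol{\eta} + \mathbf{1}a + \boldsymbol{\epsilon},

where the ARS factor a has all loadings fixed to 1 (an agreeing tendency regardless of item content). If groups differ on a but the model omits it, those differences surface as apparent non-invariance in the intercepts or loadings.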
Peer reviewed
PDF on ERIC
Wan, Siyu; Keller, Lisa A. – Practical Assessment, Research & Evaluation, 2023
Statistical process control (SPC) charts have been widely used in the field of educational measurement. The cumulative sum (CUSUM) is an established SPC method for detecting aberrant responses in educational assessments. Many studies have investigated the performance of CUSUM in different test settings. This paper describes the CUSUM…
Descriptors: Visual Aids, Educational Assessment, Evaluation Methods, Item Response Theory
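For orientation, these charts rest on the textbook one-sided CUSUM recursions (a generic Python sketch; the reference value k and decision interval h below are conventional defaults, not this paper's settings):

import numpy as np

def cusum_signals(x, target=0.0, k=0.5, h=4.0):
    # Textbook two-sided CUSUM: accumulate deviations from the in-control
    # mean beyond the reference value k; signal when either sum exceeds h.
    c_plus, c_minus = 0.0, 0.0
    signals = []
    for t, xt in enumerate(x):
        c_plus = max(0.0, c_plus + (xt - target) - k)
        c_minus = max(0.0, c_minus + (target - xt) - k)
        if c_plus > h or c_minus > h:
            signals.append(t)
            c_plus, c_minus = 0.0, 0.0  # restart after a signal
    return signals

# Example: a mean shift in the second half of the sequence triggers the chart
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 50), rng.normal(1.5, 1, 50)])
print(cusum_signals(x))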
Peer reviewed
PDF on ERIC
Koyuncu, Ilhan; Kilic, Abdullah Faruk – International Journal of Assessment Tools in Education, 2021
In exploratory factor analysis, researchers decide which items belong to which factors on the basis of statistical results, but these decisions can become subjective when items have similar factor loadings or complex factor structures. The aim of this study was to examine the validity of classifying items into…
Descriptors: Classification, Graphs, Factor Analysis, Decision Making
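One simple classification rule of the kind such decisions rest on (a hypothetical Python sketch; the function name and threshold are illustrative assumptions, not the study's procedure):

import numpy as np

def assign_items(loadings, gap=0.10):
    # Hypothetical rule of thumb: place each item on the factor with its
    # largest absolute rotated loading, and flag the item as ambiguous when
    # the runner-up loading is within `gap` of the best one.
    absl = np.abs(np.asarray(loadings))
    order = np.argsort(-absl, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(absl.shape[0])
    ambiguous = (absl[rows, best] - absl[rows, second]) < gap
    return best, ambiguous

# An item loading .52/.45 on two factors is flagged; one loading .70/.20 is not
best, amb = assign_items([[0.52, 0.45], [0.70, 0.20]])
print(best, amb)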
Xu, Jie – ProQuest LLC, 2019
Research has shown that cross-sectional mediation analysis cannot accurately reflect a true longitudinal mediated effect. To investigate longitudinal mediated effects, different longitudinal mediation models have been proposed and these models focus on different research questions related to longitudinal mediation. When fitting mediation models to…
Descriptors: Case Studies, Error of Measurement, Longitudinal Studies, Models
Peer reviewed
Ames, Allison J.; Leventhal, Brian C.; Ezike, Nnamdi C. – Measurement: Interdisciplinary Research and Perspectives, 2020
Data simulation and Monte Carlo simulation studies are important skills for researchers and practitioners of educational and psychological measurement, but there are few resources on the topic specific to item response theory. Even fewer resources exist on the statistical software techniques to implement simulation studies. This article presents…
Descriptors: Monte Carlo Methods, Item Response Theory, Simulation, Computer Software
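As a flavor of the data-generation step such simulation studies share (a minimal Python sketch; the 2PL model and parameter ranges here are generic choices, not the article's design):

import numpy as np

rng = np.random.default_rng(42)

# Minimal 2PL data-generation step of a Monte Carlo study
n_persons, n_items = 500, 20
theta = rng.normal(0, 1, n_persons)      # abilities
a = rng.lognormal(0, 0.3, n_items)       # discriminations
b = rng.normal(0, 1, n_items)            # difficulties

# P(correct) = 1 / (1 + exp(-a * (theta - b))), then draw 0/1 responses
p = 1 / (1 + np.exp(-a * (theta[:, None] - b)))
responses = rng.binomial(1, p)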
Peer reviewed
Bergner, Yoav; Choi, Ikkyu; Castellano, Katherine E. – Journal of Educational Measurement, 2019
Allowance for multiple chances to answer constructed response questions is a prevalent feature in computer-based homework and exams. We consider the use of item response theory in the estimation of item characteristics and student ability when multiple attempts are allowed but no explicit penalty is deducted for extra tries. This is common…
Descriptors: Models, Item Response Theory, Homework, Computer Assisted Instruction
Peer reviewed
Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020
Although the root mean squared deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity in detecting misfit may depend strongly on the proficiency distribution of the considered…
Descriptors: Test Items, Goodness of Fit, Probability, Accuracy
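For context, the RMSD for item i is commonly defined in large-scale assessment as

\mathrm{RMSD}_i = \sqrt{\int \left[ P_i^{\mathrm{obs}}(\theta) - P_i^{\mathrm{model}}(\theta) \right]^2 f(\theta)\, d\theta },

where f(\theta) is the group's proficiency density. Because the squared deviations are weighted by f(\theta), misfit located where a country has few examinees contributes little to the statistic, which is the sensitivity issue the paper examines.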
Peer reviewed
Suh, Youngsuk – Journal of Educational Measurement, 2016
This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…
Descriptors: Effect Size, Goodness of Fit, Statistical Analysis, Statistical Significance
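One common standardization-style form of such measures (a sketch; the exact weighting and scaling in the article may differ):

D_{\mathrm{signed}} = \sum_k w_k\,[P_{Rk} - P_{Fk}], \qquad D_{\mathrm{unsigned}} = \sqrt{\sum_k w_k\,[P_{Rk} - P_{Fk}]^2},

where P_{Rk} and P_{Fk} are the reference- and focal-group probabilities of a correct response at ability level k, and the weights w_k typically come from the focal group's ability distribution.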
Peer reviewed
Yang, Ji Seung; Zheng, Xiaying – Journal of Educational and Behavioral Statistics, 2018
The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package available since Stata v.14 (2015). Using a simulated data set and a publicly available item response data set extracted from the Programme for International Student Assessment, we review the IRT package from…
Descriptors: Item Response Theory, Item Analysis, Computer Software, Statistical Analysis
Peer reviewed
Stanley, Leanne M.; Edwards, Michael C. – Educational and Psychological Measurement, 2016
The purpose of this article is to highlight the distinction between the reliability of test scores and the fit of psychometric measurement models, reminding readers why it is important to consider both when evaluating whether test scores are valid for a proposed interpretation and/or use. It is often the case that an investigator judges both the…
Descriptors: Test Reliability, Goodness of Fit, Scores, Patients
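A concrete reason the two properties can diverge (coefficient alpha is used here as a familiar example; the article's argument is not limited to it): alpha,

\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^2_i}{\sigma^2_X}\right),

grows with the number of items k and their intercorrelations, so scores can be highly reliable even when a hypothesized measurement model for those items fits poorly (e.g., when the items are multidimensional).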
Peer reviewed
Lee, HwaYoung; Beretvas, S. Natasha – Educational and Psychological Measurement, 2014
Conventional differential item functioning (DIF) detection methods (e.g., the Mantel-Haenszel test) can be used to detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is not typically fully explained by an observed variable. True sources of DIF may include unobserved, latent variables, such as…
Descriptors: Item Analysis, Factor Structure, Bayesian Statistics, Goodness of Fit
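For reference, the conventional Mantel-Haenszel DIF statistic mentioned above: with examinees matched at total-score levels k, and A_k, B_k (C_k, D_k) the reference (focal) group's correct/incorrect counts among the N_k examinees at level k,

\hat{\alpha}_{\mathrm{MH}} = \frac{\sum_k A_k D_k / N_k}{\sum_k B_k C_k / N_k}, \qquad \text{MH D-DIF} = -2.35\,\ln\hat{\alpha}_{\mathrm{MH}},

which by construction detects DIF only across the observed grouping variable — the limitation that motivates the latent-variable approach studied here.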