ERIC - Search Results

Showing all 12 results

A Log-Linear Modeling Approach for Differential Item Functioning Detection in Polytomously Scored Items

Peer reviewed

Yesiltas, Gonca; Paek, Insu – Educational and Psychological Measurement, 2020

A log-linear model (LLM) is a well-known statistical method to examine the relationship among categorical variables. This study investigated the performance of LLM in detecting differential item functioning (DIF) for polytomously scored items via simulations where various sample sizes, ability mean differences (impact), and DIF types were…

Descriptors: Simulation, Sample Size, Item Analysis, Scores

Scale Reliability Evaluation with Heterogeneous Populations

Peer reviewed

Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2015

A latent variable modeling approach for scale reliability evaluation in heterogeneous populations is discussed. The method can be used for point and interval estimation of reliability of multicomponent measuring instruments in populations representing mixtures of an unknown number of latent classes or subpopulations. The procedure is helpful also…

Descriptors: Test Reliability, Evaluation Methods, Measurement Techniques, Computation

Examining Student Factors in Sources of Setting Accommodation DIF

Peer reviewed

Lin, Pei-Ying; Lin, Yu-Cheng – Educational and Psychological Measurement, 2014

This exploratory study investigated potential sources of setting accommodation resulting in differential item functioning (DIF) on math and reading assessments for examinees with varied learning characteristics. The examinees were those who participated in large-scale assessments and were tested in either standardized or accommodated testing…

Descriptors: Test Bias, Multivariate Analysis, Testing Accommodations, Mathematics Tests

The Evidence for a Subscore Structure in a Test of English Language Competency for English Language Learners

Peer reviewed

Reckase, Mark D.; Xu, Jing-Ru – Educational and Psychological Measurement, 2015

How to compute and report subscores for a test that was originally designed for reporting scores on a unidimensional scale has been a topic of interest in recent years. In the research reported here, we describe an application of multidimensional item response theory to identify a subscore structure in a test designed for reporting results using a…

Descriptors: English, Language Skills, English Language Learners, Scores

A Measure of Agreement for Interval or Nominal Multivariate Observations by Different Sets of Judges

Peer reviewed

Janson, Harald; Olsson, Ulf – Educational and Psychological Measurement, 2004

This article addresses the problem of accounting overall multivariate chance-corrected interobserver agreement when targets have been rated by different sets of judges (not necessarily equal in number). The proposed approach builds on Janson and Olsson's multivariate generalization of Cohen's kappa but incorporates weighting for number of judges…

Descriptors: Interrater Reliability, Multivariate Analysis, Evaluation Methods, Measurement Techniques

Understanding Parameter Invariance in Unidimensional IRT Models

Peer reviewed

Rupp, Andre A.; Zumbo, Bruno D. – Educational and Psychological Measurement, 2006

One theoretical feature that makes item response theory (IRT) models those of choice for many psychometric data analysts is parameter invariance, the equality of item and examinee parameters from different examinee populations or measurement conditions. In this article, using the well-known fact that item and examinee parameters are identical only…

Descriptors: Psychometrics, Probability, Simulation, Item Response Theory

Impact of Post Hoc Measurement Model Overspecification on Structural Parameter Integrity

Peer reviewed

Fan, Weihua; Hancock, Gregory R. – Educational and Psychological Measurement, 2006

In the common two-step structural equation modeling process, modifications are routinely made to the measurement portion of the model prior to assessing structural relations. The effect of such measurement model modifications on the structural parameter estimates, however, is not well known and is the subject of the current investigation. For a…

Descriptors: Error of Measurement, Evaluation Methods, Monte Carlo Methods, Sample Size

Why Multivariable Analyses?

Peer reviewed

Huberty, Carl J. – Educational and Psychological Measurement, 1994

Purposes of multivariate analyses are discussed, focusing on the primary purposes of prediction and structure identification and the secondary purpose of response variable ordering. The sound initial choice of response variables and the advisability of simpler analyses when feasible are discussed. (SLD)

Descriptors: Evaluation Methods, Evaluation Utilization, Measurement Techniques, Multivariate Analysis

A Note on How to Quantify and Report Whether IRT Parameter Invariance Holds: When Pearson Correlations are Not Enough

Peer reviewed

Rupp, Andre A.; Zumbo, Bruno D. – Educational and Psychological Measurement, 2004

Based on seminal work by Lord and Hambleton, Swaminathan, and Rogers, this article is an analytical, graphical, and conceptual reminder that item response theory (IRT) parameter invariance only holds for perfect model fit in multiple populations or across multiple conditions and is thus an ideal state. In practice, one attempts to quantify the…

Descriptors: Correlation, Item Response Theory, Statistical Analysis, Evaluation Methods

A Comparison of Four Methods for Detecting Differential Item Functioning in Ordered Response Items

Peer reviewed

Kristjansson, Elizabeth; Aylesworth, Richard; McDowell, Ian; Zumbo, Bruno D. – Educational and Psychological Measurement, 2005

Item bias is a major threat to measurement validity. Methods for detecting differential item functioning (DIF) are now commonly used to identify potentially biased items. DIF detection methods for dichotomous items are well developed, but those for ordinal items are less well developed. In this article, the authors compare four methods for…

Descriptors: Discriminant Analysis, Test Bias, Multivariate Analysis, Regression (Statistics)

A Comparison of the Bootstrap-F, Improved General Approximation, and Brown-Forsythe Multivariate Approaches in a Mixed Repeated Measures Design

Peer reviewed

Seco, Guillermo Vallejo; Izquierdo, Marcelino Cuesta; Garcia, M. Paula Fernandez; Diez, F. Javier Herrero – Educational and Psychological Measurement, 2006

The authors compare the operating characteristics of the bootstrap-F approach, a direct extension of the work of Berkovits, Hancock, and Nevitt, with Huynh's improved general approximation (IGA) and the Brown-Forsythe (BF) multivariate approach in a mixed repeated measures design when normality and multisample sphericity assumptions do not hold.…

Descriptors: Sample Size, Comparative Analysis, Simulation, Multivariate Analysis

A Multivariate Generalizability Model for Clinical Skills Assessments

Peer reviewed

Jarjoura, David; Early, Larry; Androulakakis, Voula – Educational and Psychological Measurement, 2004

Assessments of clinical skills of medical students rely increasingly on standardized patients demonstrating medical cases with faculty rating performance. The common finding of inconsistency of scores across cases is often referred to as case specificity. A multivariate generalizability model reveals that overall case specificity cannot explain…

Descriptors: Patients, Medical Students, Clinical Experience, Physician Patient Relationship