Ting Sun; Stella Yun Kim – Educational and Psychological Measurement, 2024
Equating is a statistical procedure used to adjust for the difference in form difficulty such that scores on those forms can be used and interpreted comparably. In practice, however, equating methods are often implemented without considering the extent to which two forms differ in difficulty. The study aims to examine the effect of the magnitude…
Descriptors: Difficulty Level, Data Interpretation, Equated Scores, High School Students
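In its simplest linear form, the equating adjustment described in this abstract maps scores from one form onto the scale of the other so that the equated scores share the target form's mean and standard deviation. The sketch below is an illustrative example of generic linear equating, not the authors' study design; all numbers are hypothetical.

```python
# Linear equating: map a form-X score onto the form-Y scale so the
# transformed scores have form Y's mean and standard deviation.

def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
    """Return the form-Y equivalent of form-X score x."""
    return sd_y / sd_x * (x - mean_x) + mean_y

# Example: form X is harder (lower mean), so X scores adjust upward.
equated = linear_equate(50.0, mean_x=48.0, sd_x=10.0, mean_y=52.0, sd_y=10.0)
print(equated)  # 54.0
```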
Trafimow, David; Wang, Cong; Wang, Tonghui – Educational and Psychological Measurement, 2020
Previous researchers have proposed the a priori procedure, whereby the researcher specifies, prior to data collection, how closely she wishes the sample means to approach corresponding population means, and the degree of confidence of meeting the specification. However, an important limitation of previous research is that researchers sometimes are…
Descriptors: Sampling, Statistical Analysis, Equations (Mathematics), Differences
Deng, Lifang; Chan, Wai – Educational and Psychological Measurement, 2017
Reliable measurements are key to social science research. Multiple measures of reliability of the total score have been developed, including coefficient alpha, coefficient omega, the greatest lower bound reliability, and others. Among these, the coefficient alpha has been most widely used, and it is reported in nearly every study involving the…
Descriptors: Reliability, Statistical Analysis, Computation, Differences
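Coefficient alpha, the reliability measure this abstract centers on, is computed from the item variances and the variance of the total score: alpha = k/(k-1) * (1 - sum of item variances / total-score variance). A minimal sketch of that standard formula (the data and names are illustrative, not from the article):

```python
# Cronbach's coefficient alpha from a persons-by-items score matrix.

def coefficient_alpha(scores):
    """scores: list of per-person lists of k item scores."""
    k = len(scores[0])

    def variance(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [variance([person[i] for person in scores]) for i in range(k)]
    total_var = variance([sum(person) for person in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

data = [[3, 4, 3], [2, 2, 1], [4, 5, 5], [3, 3, 2]]
print(round(coefficient_alpha(data), 3))  # 0.947
```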
Son, Sookyoung; Lee, Hyunjung; Jang, Yoona; Yang, Junyeong; Hong, Sehee – Educational and Psychological Measurement, 2019
The purpose of the present study is to compare nonnormal distributions (i.e., t, skew-normal, skew-t with equal skew and skew-t with unequal skew) in growth mixture models (GMMs) based on diverse conditions of a number of time points, sample sizes, and skewness for intercepts. To carry out this research, two simulation studies were conducted with…
Descriptors: Statistical Distributions, Statistical Analysis, Structural Equation Models, Comparative Analysis
Gwet, Kilem L. – Educational and Psychological Measurement, 2016
This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling…
Descriptors: Differences, Correlation, Statistical Significance, Statistical Analysis
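The agreement coefficients being compared in this abstract are chance-corrected measures such as Cohen's kappa. The sketch below shows only the underlying kappa computation for two raters; the article's actual contribution, a significance test for the *difference* between two correlated coefficients, is not reproduced here.

```python
# Cohen's kappa: observed agreement corrected for chance agreement.

def cohens_kappa(ratings):
    """ratings: list of (rater1, rater2) category labels."""
    n = len(ratings)
    p_obs = sum(r1 == r2 for r1, r2 in ratings) / n
    cats = {c for pair in ratings for c in pair}
    # Chance agreement: product of each rater's marginal proportions.
    p_exp = sum(
        (sum(r1 == c for r1, _ in ratings) / n)
        * (sum(r2 == c for _, r2 in ratings) / n)
        for c in cats
    )
    return (p_obs - p_exp) / (1 - p_exp)

pairs = [("y", "y"), ("y", "n"), ("n", "n"), ("n", "n"), ("y", "y")]
print(round(cohens_kappa(pairs), 3))  # 0.615
```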
Lee, Soo; Bulut, Okan; Suh, Youngsuk – Educational and Psychological Measurement, 2017
A number of studies have found multiple indicators multiple causes (MIMIC) models to be an effective tool in detecting uniform differential item functioning (DIF) for individual items and item bundles. A recently developed MIMIC-interaction model is capable of detecting both uniform and nonuniform DIF in the unidimensional item response theory…
Descriptors: Test Bias, Test Items, Models, Item Response Theory
Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2016
This article examines the possible dependency of composite reliability on presentation format of the elements of a multi-item measuring instrument. Using empirical data and a recent method for interval estimation of group differences in reliability, we demonstrate that the reliability of an instrument need not be the same when polarity of the…
Descriptors: Test Reliability, Test Format, Test Items, Differences
Lee, HyeSun; Geisinger, Kurt F. – Educational and Psychological Measurement, 2016
The current study investigated the impact of matching criterion purification on the accuracy of differential item functioning (DIF) detection in large-scale assessments. The three matching approaches for DIF analyses (block-level matching, pooled booklet matching, and equated pooled booklet matching) were employed with the Mantel-Haenszel…
Descriptors: Test Bias, Measurement, Accuracy, Statistical Analysis
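The Mantel-Haenszel statistic named in this abstract pools 2x2 tables (group x correct/incorrect) across matched ability strata into a common odds ratio; values far from 1 flag an item for DIF. A hypothetical sketch of that pooled odds ratio, with made-up counts (the matching-criterion purification the study evaluates is not shown):

```python
# Mantel-Haenszel common odds ratio across ability strata.

def mh_odds_ratio(strata):
    """strata: list of (a, b, c, d) 2x2 tables, where
    a = reference correct, b = reference incorrect,
    c = focal correct,     d = focal incorrect."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Two ability strata; a ratio > 1 favors the reference group.
strata = [(40, 10, 30, 20), (25, 25, 20, 30)]
print(mh_odds_ratio(strata))
```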
DeMars, Christine E.; Jurich, Daniel P. – Educational and Psychological Measurement, 2015
In educational testing, differential item functioning (DIF) statistics must be accurately estimated to ensure the appropriate items are flagged for inspection or removal. This study showed how using the Rasch model to estimate DIF may introduce considerable bias in the results when there are large group differences in ability (impact) and the data…
Descriptors: Test Bias, Guessing (Tests), Ability, Differences
Choi, In-Hee; Wilson, Mark – Educational and Psychological Measurement, 2015
An essential feature of the linear logistic test model (LLTM) is that item difficulties are explained using item design properties. By taking advantage of this explanatory aspect of the LLTM, in a mixture extension of the LLTM, the meaning of latent classes is specified by how item properties affect item difficulties within each class. To improve…
Descriptors: Classification, Test Items, Difficulty Level, Statistical Analysis
Moses, Tim; Kim, Sooyeon – Educational and Psychological Measurement, 2012
In this study, a ranking strategy was evaluated for comparing subgroups' change using identical, equated, and nonidentical measures. Four empirical data sets were evaluated, each of which contained examinees' scores on two occasions, where the two occasions' scores were obtained on a single identical measure, on two equated tests, and on two…
Descriptors: Testing, Change, Scores, Measures (Individuals)
Raykov, Tenko; Marcoulides, George A.; Lee, Chun-Lung; Chang, Chi – Educational and Psychological Measurement, 2013
This note is concerned with a latent variable modeling approach for the study of differential item functioning in a multigroup setting. A multiple-testing procedure that can be used to evaluate group differences in response probabilities on individual items is discussed. The method is readily employed when the aim is also to locate possible…
Descriptors: Test Bias, Statistical Analysis, Models, Hypothesis Testing
Attali, Yigal; Laitusis, Cara; Stone, Elizabeth – Educational and Psychological Measurement, 2016
There are many reasons to believe that open-ended (OE) and multiple-choice (MC) items elicit different cognitive demands of students. However, empirical evidence that supports this view is lacking. In this study, we investigated the reactions of test takers to an interactive assessment with immediate feedback and answer-revision opportunities for…
Descriptors: Test Items, Questioning Techniques, Differences, Student Reaction
Raykov, Tenko; Marcoulides, George A.; Li, Cheng-Hsien – Educational and Psychological Measurement, 2012
Popular measurement invariance testing procedures for latent constructs evaluated by multiple indicators in distinct populations are revisited and discussed. A frequently used test of factor loading invariance is shown to possess serious limitations that in general preclude it from accomplishing its goal of ascertaining this invariance. A process…
Descriptors: Measurement, Statistical Analysis, Models, Behavioral Science Research
Lakin, Joni M.; Elliott, Diane Cardenas; Liu, Ou Lydia – Educational and Psychological Measurement, 2012
Outcomes assessments are gaining great attention in higher education because of increased demand for accountability. These assessments are widely used by U.S. higher education institutions to measure students' college-level knowledge and skills, including students who speak English as a second language (ESL). For the past decade, the increasing…
Descriptors: College Outcomes Assessment, Achievement Tests, English Language Learners, College Students