NotesFAQContact Us
Collection
Advanced
Search Tips
Publication Date
In 20250
Since 20240
Since 2021 (last 5 years)0
Since 2016 (last 10 years)11
Since 2006 (last 20 years)30
Education Level
Secondary Education1
Audience
Researchers1
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 1 to 15 of 40 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Hoofs, Huub; van de Schoot, Rens; Jansen, Nicole W. H.; Kant, IJmert – Educational and Psychological Measurement, 2018
Bayesian confirmatory factor analysis (CFA) offers an alternative to frequentist CFA based on, for example, maximum likelihood estimation for the assessment of reliability and validity of educational and psychological measures. For increasing sample sizes, however, the applicability of current fit statistics evaluating model fit within Bayesian…
Descriptors: Goodness of Fit, Bayesian Statistics, Factor Analysis, Sample Size
Peer reviewed Peer reviewed
Direct linkDirect link
Rutkowski, Leslie; Svetina, Dubravka – Applied Measurement in Education, 2017
In spite of the challenges inherent in making dozens of comparisons across heterogeneous populations, a relatively recent interest in scale-score equivalence for non-achievement measures in an international context has emerged. Until recently, operational procedures for establishing measurement invariance using multiple-groups analyses were…
Descriptors: International Assessment, Goodness of Fit, Statistical Analysis, Teacher Surveys
Peer reviewed Peer reviewed
Direct linkDirect link
Kang, Yoonjeong; McNeish, Daniel M.; Hancock, Gregory R. – Educational and Psychological Measurement, 2016
Although differences in goodness-of-fit indices (?GOFs) have been advocated for assessing measurement invariance, studies that advanced recommended differential cutoffs for adjudicating invariance actually utilized a very limited range of values representing the quality of indicator variables (i.e., magnitude of loadings). Because quality of…
Descriptors: Measurement, Goodness of Fit, Guidelines, Models
Sinharay, Sandip – Grantee Submission, 2018
Tatsuoka (1984) suggested several extended caution indices and their standardized versions that have been used as person-fit statistics by researchers such as Drasgow, Levine, and McLaughlin (1987), Glas and Meijer (2003), and Molenaar and Hoijtink (1990). However, these indices are only defined for tests with dichotomous items. This paper extends…
Descriptors: Test Format, Goodness of Fit, Item Response Theory, Error Patterns
Peer reviewed Peer reviewed
Direct linkDirect link
Letué, Frédérique; Martinez, Marie-José; Samson, Adeline; Vilain, Anne; Vilain, Coriandre – Journal of Speech, Language, and Hearing Research, 2018
Purpose: Repeated duration data are frequently used in behavioral studies. Classical linear or log-linear mixed models are often inadequate to analyze such data, because they usually consist of nonnegative and skew-distributed variables. Therefore, we recommend use of a statistical methodology specific to duration data. Method: We propose a…
Descriptors: Behavioral Science Research, Research Methodology, Statistical Analysis, Repetition
Peer reviewed Peer reviewed
Direct linkDirect link
Zeller, Florian; Krampen, Dorothea; Reiß, Siegbert; Schweizer, Karl – Educational and Psychological Measurement, 2017
The item-position effect describes how an item's position within a test, that is, the number of previous completed items, affects the response to this item. Previously, this effect was represented by constraints reflecting simple courses, for example, a linear increase. Due to the inflexibility of these representations our aim was to examine…
Descriptors: Goodness of Fit, Simulation, Factor Analysis, Intelligence Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Lamprianou, Iasonas – Educational and Psychological Measurement, 2018
It is common practice for assessment programs to organize qualifying sessions during which the raters (often known as "markers" or "judges") demonstrate their consistency before operational rating commences. Because of the high-stakes nature of many rating activities, the research community tends to continuously explore new…
Descriptors: Social Networks, Network Analysis, Comparative Analysis, Innovation
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2016
Meijer and van Krimpen-Stoop noted that the number of person-fit statistics (PFSs) that have been designed for computerized adaptive tests (CATs) is relatively modest. This article partially addresses that concern by suggesting three new PFSs for CATs. The statistics are based on tests for a change point and can be used to detect an abrupt change…
Descriptors: Computer Assisted Testing, Adaptive Testing, Item Response Theory, Goodness of Fit
Peer reviewed Peer reviewed
Direct linkDirect link
Suh, Youngsuk – Journal of Educational Measurement, 2016
This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…
Descriptors: Effect Size, Goodness of Fit, Statistical Analysis, Statistical Significance
Peer reviewed Peer reviewed
Direct linkDirect link
Yang, Ji Seung; Zheng, Xiaying – Journal of Educational and Behavioral Statistics, 2018
The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from Programme of International Student Assessment, we review the IRT package from…
Descriptors: Item Response Theory, Item Analysis, Computer Software, Statistical Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Liu, Min; Hancock, Gregory R. – Educational and Psychological Measurement, 2014
Growth mixture modeling has gained much attention in applied and methodological social science research recently, but the selection of the number of latent classes for such models remains a challenging issue, especially when the assumption of proper model specification is violated. The current simulation study compared the performance of a linear…
Descriptors: Models, Classification, Simulation, Comparative Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Liang, Longjuan; Browne, Michael W. – Journal of Educational and Behavioral Statistics, 2015
If standard two-parameter item response functions are employed in the analysis of a test with some newly constructed items, it can be expected that, for some items, the item response function (IRF) will not fit the data well. This lack of fit can also occur when standard IRFs are fitted to personality or psychopathology items. When investigating…
Descriptors: Item Response Theory, Statistical Analysis, Goodness of Fit, Bayesian Statistics
Peer reviewed Peer reviewed
Direct linkDirect link
Ranger, Jochen; Kuhn, Jörg-Tobias – Journal of Educational and Behavioral Statistics, 2015
In this article, a latent trait model is proposed for the response times in psychological tests. The latent trait model is based on the linear transformation model and subsumes popular models from survival analysis, like the proportional hazards model and the proportional odds model. Core of the model is the assumption that an unspecified monotone…
Descriptors: Psychological Testing, Reaction Time, Statistical Analysis, Models
Crawford, Aaron – ProQuest LLC, 2014
This simulation study compared the utility of various discrepancy measures within a posterior predictive model checking (PPMC) framework for detecting different types of data-model misfit in multidimensional Bayesian network (BN) models. The investigated conditions were motivated by an applied research program utilizing an operational complex…
Descriptors: Bayesian Statistics, Networks, Models, Goodness of Fit
Peer reviewed Peer reviewed
Direct linkDirect link
de la Torre, Jimmy; Lee, Young-Sun – Journal of Educational Measurement, 2013
This article used the Wald test to evaluate the item-level fit of a saturated cognitive diagnosis model (CDM) relative to the fits of the reduced models it subsumes. A simulation study was carried out to examine the Type I error and power of the Wald test in the context of the G-DINA model. Results show that when the sample size is small and a…
Descriptors: Statistical Analysis, Test Items, Goodness of Fit, Error of Measurement
Previous Page | Next Page »
Pages: 1  |  2  |  3