Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 0 |
Since 2006 (last 20 years) | 20 |
Descriptor
Evaluation Methods | 30 |
Simulation | 24 |
Item Response Theory | 21 |
Computation | 11 |
Models | 10 |
Test Items | 10 |
Comparative Analysis | 8 |
Monte Carlo Methods | 8 |
Computer Simulation | 6 |
Maximum Likelihood Statistics | 5 |
Measurement | 5 |
More ▼ |
Source
Applied Psychological… | 30 |
Author
Woods, Carol M. | 5 |
Roberts, James S. | 2 |
Armstrong, Ronald D. | 1 |
Bargmann, Jens | 1 |
Belov, Dmitry I. | 1 |
Bergeron, Jennifer M. | 1 |
Dauvier, Bruno | 1 |
DeCarlo, Lawrence T. | 1 |
Divers, Jasmin | 1 |
Dodd, Barbara | 1 |
Dorans, Neil J. | 1 |
More ▼ |
Publication Type
Journal Articles | 30 |
Reports - Research | 16 |
Reports - Evaluative | 11 |
Reports - Descriptive | 3 |
Education Level
Higher Education | 2 |
Audience
Practitioners | 2 |
Researchers | 2 |
Laws, Policies, & Programs
Assessments and Surveys
California Achievement Tests | 1 |
SAT (College Admission Test) | 1 |
What Works Clearinghouse Rating
DeCarlo, Lawrence T. – Applied Psychological Measurement, 2012
In the typical application of a cognitive diagnosis model, the Q-matrix, which reflects the theory with respect to the skills indicated by the items, is assumed to be known. However, the Q-matrix is usually determined by expert judgment, and so there can be uncertainty about some of its elements. Here it is shown that this uncertainty can be…
Descriptors: Bayesian Statistics, Item Response Theory, Simulation, Models
Smits, Iris A. M.; Timmerman, Marieke E.; Meijer, Rob R. – Applied Psychological Measurement, 2012
The assessment of the number of dimensions and the dimensionality structure of questionnaire data is important in scale evaluation. In this study, the authors evaluate two dimensionality assessment procedures in the context of Mokken scale analysis (MSA), using a so-called fixed lowerbound. The comparative simulation study, covering various…
Descriptors: Simulation, Measures (Individuals), Program Effectiveness, Item Response Theory
van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas – Applied Psychological Measurement, 2011
This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
Descriptors: Simulation, Reliability, Measurement, Psychology
Padilla, Miguel A.; Divers, Jasmin; Newton, Matthew – Applied Psychological Measurement, 2012
Three different bootstrap methods for estimating confidence intervals (CIs) for coefficient alpha were investigated. In addition, the bootstrap methods were compared with the most promising coefficient alpha CI estimation methods reported in the literature. The CI methods were assessed through a Monte Carlo simulation utilizing conditions…
Descriptors: Intervals, Monte Carlo Methods, Computation, Sampling
Hung, Lai-Fa – Applied Psychological Measurement, 2012
Rasch used a Poisson model to analyze errors and speed in reading tests. An important property of the Poisson distribution is that the mean and variance are equal. However, in social science research, it is very common for the variance to be greater than the mean (i.e., the data are overdispersed). This study embeds the Rasch model within an…
Descriptors: Social Science Research, Markov Processes, Reading Tests, Social Sciences
Seybert, Jacob; Stark, Stephen – Applied Psychological Measurement, 2012
A Monte Carlo study was conducted to examine the accuracy of differential item functioning (DIF) detection using the differential functioning of items and tests (DFIT) method. Specifically, the performance of DFIT was compared using "testwide" critical values suggested by Flowers, Oshima, and Raju, based on simulations involving large numbers of…
Descriptors: Test Bias, Monte Carlo Methods, Form Classes (Languages), Simulation
Woods, Carol M.; Grimm, Kevin J. – Applied Psychological Measurement, 2011
In extant literature, multiple indicator multiple cause (MIMIC) models have been presented for identifying items that display uniform differential item functioning (DIF) only, not nonuniform DIF. This article addresses, for apparently the first time, the use of MIMIC models for testing both uniform and nonuniform DIF with categorical indicators. A…
Descriptors: Test Bias, Testing, Interaction, Item Response Theory
Hennig, Christian; Mullensiefen, Daniel; Bargmann, Jens – Applied Psychological Measurement, 2010
The authors propose a method to compare the influence of a treatment on different properties within subjects. The properties are measured by several Likert-type-scaled items. The results show that many existing approaches, such as repeated measurement analysis of variance on sum and mean scores, a linear partial credit model, and a graded response…
Descriptors: Simulation, Pretests Posttests, Regression (Statistics), Comparative Analysis
Woods, Carol M. – Applied Psychological Measurement, 2011
Differential item functioning (DIF) occurs when an item on a test, questionnaire, or interview has different measurement properties for one group of people versus another, irrespective of true group-mean differences on the constructs being measured. This article is focused on item response theory based likelihood ratio testing for DIF (IRT-LR or…
Descriptors: Simulation, Item Response Theory, Testing, Questionnaires
Finkelman, Matthew D.; Weiss, David J.; Kim-Kang, Gyenam – Applied Psychological Measurement, 2010
Assessing individual change is an important topic in both psychological and educational measurement. An adaptive measurement of change (AMC) method had previously been shown to exhibit greater efficiency in detecting change than conventional nonadaptive methods. However, little work had been done to compare different procedures within the AMC…
Descriptors: Computer Assisted Testing, Hypothesis Testing, Measurement, Item Analysis
Woods, Carol M. – Applied Psychological Measurement, 2009
Differential item functioning (DIF) occurs when items on a test or questionnaire have different measurement properties for one group of people versus another, irrespective of group-mean differences on the construct. Methods for testing DIF require matching members of different groups on an estimate of the construct. Preferably, the estimate is…
Descriptors: Test Results, Testing, Item Response Theory, Test Bias
Dorans, Neil J.; Liu, Jinghua; Hammond, Shelby – Applied Psychological Measurement, 2008
This exploratory study was built on research spanning three decades. Petersen, Marco, and Stewart (1982) conducted a major empirical investigation of the efficacy of different equating methods. The studies reported in Dorans (1990) examined how different equating methods performed across samples selected in different ways. Recent population…
Descriptors: Test Format, Equated Scores, Sampling, Evaluation Methods
Roberts, James S. – Applied Psychological Measurement, 2008
Orlando and Thissen (2000) developed an item fit statistic for binary item response theory (IRT) models known as S-X[superscript 2]. This article generalizes their statistic to polytomous unfolding models. Four alternative formulations of S-X[superscript 2] are developed for the generalized graded unfolding model (GGUM). The GGUM is a…
Descriptors: Item Response Theory, Goodness of Fit, Test Items, Models
Woods, Carol M. – Applied Psychological Measurement, 2008
In Ramsay-curve item response theory (RC-IRT), the latent variable distribution is estimated simultaneously with the item parameters. In extant Monte Carlo evaluations of RC-IRT, the item response function (IRF) used to fit the data is the same one used to generate the data. The present simulation study examines RC-IRT when the IRF is imperfectly…
Descriptors: Simulation, Item Response Theory, Monte Carlo Methods, Comparative Analysis
Hu, Huiqin; Rogers, W. Todd; Vukmirovic, Zarko – Applied Psychological Measurement, 2008
Common items with inconsistent b-parameter estimates may have a serious impact on item response theory (IRT)--based equating results. To find a better way to deal with the outlier common items with inconsistent b-parameters, the current study investigated the comparability of 10 variations of four IRT-based equating methods (i.e., concurrent…
Descriptors: Item Response Theory, Item Analysis, Computer Simulation, Equated Scores
Previous Page | Next Page ยป
Pages: 1 | 2