ERIC - Search Results

Showing 1 to 15 of 30 results

Recognizing Uncertainty in the Q-Matrix via a Bayesian Extension of the DINA Model

Peer reviewed

DeCarlo, Lawrence T. – Applied Psychological Measurement, 2012

In the typical application of a cognitive diagnosis model, the Q-matrix, which reflects the theory with respect to the skills indicated by the items, is assumed to be known. However, the Q-matrix is usually determined by expert judgment, and so there can be uncertainty about some of its elements. Here it is shown that this uncertainty can be…

Descriptors: Bayesian Statistics, Item Response Theory, Simulation, Models

Exploratory Mokken Scale Analysis as a Dimensionality Assessment Tool: Why Scalability Does Not Imply Unidimensionality

Peer reviewed

Smits, Iris A. M.; Timmerman, Marieke E.; Meijer, Rob R. – Applied Psychological Measurement, 2012

The assessment of the number of dimensions and the dimensionality structure of questionnaire data is important in scale evaluation. In this study, the authors evaluate two dimensionality assessment procedures in the context of Mokken scale analysis (MSA), using a so-called fixed lowerbound. The comparative simulation study, covering various…

Descriptors: Simulation, Measures (Individuals), Program Effectiveness, Item Response Theory

A Latent Class Approach to Estimating Test-Score Reliability

Peer reviewed

van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas – Applied Psychological Measurement, 2011

This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…

Descriptors: Simulation, Reliability, Measurement, Psychology

Coefficient Alpha Bootstrap Confidence Interval under Nonnormality

Peer reviewed

Padilla, Miguel A.; Divers, Jasmin; Newton, Matthew – Applied Psychological Measurement, 2012

Three different bootstrap methods for estimating confidence intervals (CIs) for coefficient alpha were investigated. In addition, the bootstrap methods were compared with the most promising coefficient alpha CI estimation methods reported in the literature. The CI methods were assessed through a Monte Carlo simulation utilizing conditions…

Descriptors: Intervals, Monte Carlo Methods, Computation, Sampling

A Negative Binomial Regression Model for Accuracy Tests

Peer reviewed

Hung, Lai-Fa – Applied Psychological Measurement, 2012

Rasch used a Poisson model to analyze errors and speed in reading tests. An important property of the Poisson distribution is that the mean and variance are equal. However, in social science research, it is very common for the variance to be greater than the mean (i.e., the data are overdispersed). This study embeds the Rasch model within an…

Descriptors: Social Science Research, Markov Processes, Reading Tests, Social Sciences

Iterative Linking with the Differential Functioning of Items and Tests (DFIT) Method: Comparison of Testwide and Item Parameter Replication (IPR) Critical Values

Peer reviewed

Seybert, Jacob; Stark, Stephen – Applied Psychological Measurement, 2012

A Monte Carlo study was conducted to examine the accuracy of differential item functioning (DIF) detection using the differential functioning of items and tests (DFIT) method. Specifically, the performance of DFIT was compared using "testwide" critical values suggested by Flowers, Oshima, and Raju, based on simulations involving large numbers of…

Descriptors: Test Bias, Monte Carlo Methods, Form Classes (Languages), Simulation

Testing for Nonuniform Differential Item Functioning with Multiple Indicator Multiple Cause Models

Peer reviewed

Woods, Carol M.; Grimm, Kevin J. – Applied Psychological Measurement, 2011

In extant literature, multiple indicator multiple cause (MIMIC) models have been presented for identifying items that display uniform differential item functioning (DIF) only, not nonuniform DIF. This article addresses, for apparently the first time, the use of MIMIC models for testing both uniform and nonuniform DIF with categorical indicators. A…

Descriptors: Test Bias, Testing, Interaction, Item Response Theory

Within-Subject Comparison of Changes in a Pretest-Posttest Design

Peer reviewed

Hennig, Christian; Mullensiefen, Daniel; Bargmann, Jens – Applied Psychological Measurement, 2010

The authors propose a method to compare the influence of a treatment on different properties within subjects. The properties are measured by several Likert-type-scaled items. The results show that many existing approaches, such as repeated measurement analysis of variance on sum and mean scores, a linear partial credit model, and a graded response…

Descriptors: Simulation, Pretests Posttests, Regression (Statistics), Comparative Analysis

Ramsay-Curve Differential Item Functioning

Peer reviewed

Woods, Carol M. – Applied Psychological Measurement, 2011

Differential item functioning (DIF) occurs when an item on a test, questionnaire, or interview has different measurement properties for one group of people versus another, irrespective of true group-mean differences on the constructs being measured. This article is focused on item response theory based likelihood ratio testing for DIF (IRT-LR or…

Descriptors: Simulation, Item Response Theory, Testing, Questionnaires

Item Selection and Hypothesis Testing for the Adaptive Measurement of Change

Peer reviewed

Finkelman, Matthew D.; Weiss, David J.; Kim-Kang, Gyenam – Applied Psychological Measurement, 2010

Assessing individual change is an important topic in both psychological and educational measurement. An adaptive measurement of change (AMC) method had previously been shown to exhibit greater efficiency in detecting change than conventional nonadaptive methods. However, little work had been done to compare different procedures within the AMC…

Descriptors: Computer Assisted Testing, Hypothesis Testing, Measurement, Item Analysis

Empirical Selection of Anchors for Tests of Differential Item Functioning

Peer reviewed

Woods, Carol M. – Applied Psychological Measurement, 2009

Differential item functioning (DIF) occurs when items on a test or questionnaire have different measurement properties for one group of people versus another, irrespective of group-mean differences on the construct. Methods for testing DIF require matching members of different groups on an estimate of the construct. Preferably, the estimate is…

Descriptors: Test Results, Testing, Item Response Theory, Test Bias

Anchor Test Type and Population Invariance: An Exploration across Subpopulations and Test Administrations

Peer reviewed

Dorans, Neil J.; Liu, Jinghua; Hammond, Shelby – Applied Psychological Measurement, 2008

This exploratory study was built on research spanning three decades. Petersen, Marco, and Stewart (1982) conducted a major empirical investigation of the efficacy of different equating methods. The studies reported in Dorans (1990) examined how different equating methods performed across samples selected in different ways. Recent population…

Descriptors: Test Format, Equated Scores, Sampling, Evaluation Methods

Modified Likelihood-Based Item Fit Statistics for the Generalized Graded Unfolding Model

Peer reviewed

Roberts, James S. – Applied Psychological Measurement, 2008

Orlando and Thissen (2000) developed an item fit statistic for binary item response theory (IRT) models known as S-X². This article generalizes their statistic to polytomous unfolding models. Four alternative formulations of S-X² are developed for the generalized graded unfolding model (GGUM). The GGUM is a…

Descriptors: Item Response Theory, Goodness of Fit, Test Items, Models

Consequences of Ignoring Guessing when Estimating the Latent Density in Item Response Theory

Peer reviewed

Woods, Carol M. – Applied Psychological Measurement, 2008

In Ramsay-curve item response theory (RC-IRT), the latent variable distribution is estimated simultaneously with the item parameters. In extant Monte Carlo evaluations of RC-IRT, the item response function (IRF) used to fit the data is the same one used to generate the data. The present simulation study examines RC-IRT when the IRF is imperfectly…

Descriptors: Simulation, Item Response Theory, Monte Carlo Methods, Comparative Analysis

Investigation of IRT-Based Equating Methods in the Presence of Outlier Common Items

Peer reviewed

Hu, Huiqin; Rogers, W. Todd; Vukmirovic, Zarko – Applied Psychological Measurement, 2008

Common items with inconsistent b-parameter estimates may have a serious impact on item response theory (IRT)-based equating results. To find a better way to deal with the outlier common items with inconsistent b-parameters, the current study investigated the comparability of 10 variations of four IRT-based equating methods (i.e., concurrent…

Descriptors: Item Response Theory, Item Analysis, Computer Simulation, Equated Scores
