ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	0
Since 2006 (last 20 years)	7

Descriptor

Comparative Analysis	11
Error of Measurement	11
Item Response Theory	5
Equations (Mathematics)	4
Statistical Analysis	4
Test Items	4
Computation	3
Models	3
Monte Carlo Methods	3
Computer Simulation	2
Equated Scores	2
Mathematical Models	2
Sample Size	2
Scores	2
Simulation	2
Test Length	2
Test Theory	2
Tests	2
Bias	1
Cheating	1
Classification	1
College Entrance Examinations	1
Correlation	1
Decision Making	1
Educational Testing	1
More ▼

Source

Applied Psychological…

Author

van der Linden, Wim J.	3
Berger, Martjin P. F.	1
Cui, Zhongmin	1
Culpepper, Steven Andrew	1
De Ayala, R. J.	1
Drasgow, Fritz	1
Ferdous, Abdullah A.	1
Finch, W. Holmes	1
Kim, Doyoung	1
Kim, Seonghoon	1
Kolen, Michael J.	1
Meijer, Rob R.	1
Mellenbergh, Gideon J.	1
Nering, Michael L.	1
Sotaridona, Leonardo S.	1
Stark, Stephen	1
Zeng, Lingjia	1
More ▼

Publication Type

Journal Articles	11
Reports - Research	7
Reports - Evaluative	3
Reports - Descriptive	1

Education Level

Audience

Location

Laws, Policies, & Programs

Assessments and Surveys

Law School Admission Test

What Works Clearinghouse Rating

Showing all 11 results Save | Export

The Reliability and Precision of Total Scores and IRT Estimates as a Function of Polytomous IRT Parameters and Latent Trait Distribution

Peer reviewed

Direct link

Culpepper, Steven Andrew – Applied Psychological Measurement, 2013

A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…

Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement

The MIMIC Model as a Tool for Differential Bundle Functioning Detection

Peer reviewed

Direct link

Finch, W. Holmes – Applied Psychological Measurement, 2012

Increasingly, researchers interested in identifying potentially biased test items are encouraged to use a confirmatory, rather than exploratory, approach. One such method for confirmatory testing is rooted in differential bundle functioning (DBF), where hypotheses regarding potential differential item functioning (DIF) for sets of items (bundles)…

Descriptors: Test Bias, Test Items, Statistical Analysis, Models

The Comparative Performance of Conditional Independence Indices

Peer reviewed

Direct link

Kim, Doyoung; De Ayala, R. J.; Ferdous, Abdullah A.; Nering, Michael L. – Applied Psychological Measurement, 2011

To realize the benefits of item response theory (IRT), one must have model-data fit. One facet of a model-data fit investigation involves assessing the tenability of the conditional item independence (CII) assumption. In this Monte Carlo study, the comparative performance of 10 indices for identifying conditional item dependence is assessed. The…

Descriptors: Item Response Theory, Monte Carlo Methods, Error of Measurement, Statistical Analysis

An Extension of Least Squares Estimation of IRT Linking Coefficients for the Graded Response Model

Peer reviewed

Direct link

Kim, Seonghoon – Applied Psychological Measurement, 2010

The three types (generalized, unweighted, and weighted) of least squares methods, proposed by Ogasawara, for estimating item response theory (IRT) linking coefficients under dichotomous models are extended to the graded response model. A simulation study was conducted to confirm the accuracy of the extended formulas, and a real data study was…

Descriptors: Least Squares Statistics, Computation, Item Response Theory, Models

Comparison of Parametric and Nonparametric Bootstrap Methods for Estimating Random Error in Equipercentile Equating

Peer reviewed

Direct link

Cui, Zhongmin; Kolen, Michael J. – Applied Psychological Measurement, 2008

This article considers two methods of estimating standard errors of equipercentile equating: the parametric bootstrap method and the nonparametric bootstrap method. Using a simulation study, these two methods are compared under three sample sizes (300, 1,000, and 3,000), for two test content areas (the Iowa Tests of Basic Skills Maps and Diagrams…

Descriptors: Test Length, Test Content, Simulation, Computation

A Numerical Approach for Computing Standard Errors of Linear Equating.

Peer reviewed

Zeng, Lingjia – Applied Psychological Measurement, 1993

A numerical approach for computing standard errors (SEs) of a linear equating is described in which first partial derivatives of equating functions needed to compute SEs are derived numerically. Numerical and analytical approaches are compared using the Tucker equating method. SEs derived numerically are found indistinguishable from SEs derived…

Descriptors: Comparative Analysis, Computer Simulation, Equated Scores, Equations (Mathematics)

Equating Error in Observed-Score Equating

Peer reviewed

Direct link

van der Linden, Wim J. – Applied Psychological Measurement, 2006

Traditionally, error in equating observed scores on two versions of a test is defined as the difference between the transformations that equate the quantiles of their distributions in the sample and population of test takers. But it is argued that if the goal of equating is to adjust the scores of test takers on one version of the test to make…

Descriptors: Equated Scores, Evaluation Criteria, Models, Error of Measurement

Detecting Answer Copying Using the Kappa Statistic

Peer reviewed

Direct link

Sotaridona, Leonardo S.; van der Linden, Wim J.; Meijer, Rob R. – Applied Psychological Measurement, 2006

A statistical test for detecting answer copying on multiple-choice tests based on Cohen's kappa is proposed. The test is free of any assumptions on the response processes of the examinees suspected of copying and having served as the source, except for the usual assumption that these processes are probabilistic. Because the asymptotic null and…

Descriptors: Cheating, Test Items, Simulation, Statistical Analysis

An EM Approach to Parameter Estimation for the Zinnes and Griggs Paired Comparison IRT Model.

Peer reviewed

Stark, Stephen; Drasgow, Fritz – Applied Psychological Measurement, 2002

Describes item response and information functions for the Zinnes and Griggs paired comparison item response theory (IRT) model (1974) and presents procedures for estimating stimulus and person parameters. Monte Carlo simulations show that at least 400 ratings are required to obtain reasonably accurate estimates of the stimulus parameters and their…

Descriptors: Comparative Analysis, Computer Simulation, Error of Measurement, Item Response Theory

On the Efficiency of IRT Models When Applied to Different Sampling Designs.

Peer reviewed

Berger, Martjin P. F. – Applied Psychological Measurement, 1991

A generalized variance criterion is proposed to measure efficiency in item-response-theory (IRT) models. Heuristic arguments are given to formulate the efficiency of a design in terms of an asymptotic generalized variance criterion. Efficiencies of designs for one-, two-, and three-parameter models are compared. (SLD)

Descriptors: Comparative Analysis, Efficiency, Equations (Mathematics), Error of Measurement

The Internal and External Optimality of Decisions Based on Tests.

Peer reviewed

Mellenbergh, Gideon J.; van der Linden, Wim J. – Applied Psychological Measurement, 1979

For six tests, coefficient delta as an index for internal optimality is computed. Internal optimality is defined as the magnitude of risk of the decision procedure with respect to the true score. Results are compared with an alternative index (coefficient kappa) for assessing the consistency of decisions. (Author/JKS)

Descriptors: Classification, Comparative Analysis, Decision Making, Error of Measurement