Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 0
Since 2006 (last 20 years): 27
Descriptor
Test Length: 45
Item Response Theory: 29
Computation: 17
Simulation: 17
Sample Size: 14
Monte Carlo Methods: 12
Maximum Likelihood Statistics: 11
Adaptive Testing: 10
Error of Measurement: 9
Models: 9
Test Items: 9
Source
Applied Psychological…: 45
Author
Finch, Holmes: 3
Meijer, Rob R.: 3
de la Torre, Jimmy: 3
Chang, Hua-Hua: 2
Cheng, Ying: 2
Song, Hao: 2
Stark, Stephen: 2
Wang, Wen-Chung: 2
Woods, Carol M.: 2
Bechger, Timo M.: 1
Béland, Sébastien: 1
Publication Type
Journal Articles: 45
Reports - Evaluative: 21
Reports - Research: 20
Reports - Descriptive: 2
Collected Works - Serials: 1
Reports - General: 1
Education Level
High Schools: 1
Secondary Education: 1
Location
Netherlands: 2
Australia: 1
Michigan: 1
Taiwan: 1
Assessments and Surveys
Armed Forces Qualification…: 1
Center for Epidemiologic…: 1
Lathrop, Quinn N.; Cheng, Ying – Applied Psychological Measurement, 2013
Within the framework of item response theory (IRT), there are two recent lines of work on the estimation of classification accuracy (CA) rate. One approach estimates CA when decisions are made based on total sum scores, the other based on latent trait estimates. The former is referred to as the Lee approach, and the latter, the Rudner approach,…
Descriptors: Item Response Theory, Accuracy, Classification, Computation
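The Rudner approach mentioned above evaluates classification accuracy from latent trait estimates and their standard errors. The sketch below is my own simplified illustration of that idea, assuming a normal approximation for the ability estimate; it is not the authors' implementation, and the example values are hypothetical.

```python
import math

def normal_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def rudner_accuracy(theta_hats, ses, cut):
    """Rudner-style classification accuracy: for each examinee, the
    probability (under a normal approximation) that the true ability
    lies on the same side of the cut score as the estimate, averaged
    over examinees."""
    probs = []
    for theta, se in zip(theta_hats, ses):
        p_above = 1.0 - normal_cdf((cut - theta) / se)
        probs.append(p_above if theta >= cut else 1.0 - p_above)
    return sum(probs) / len(probs)

# Examinees far from the cut with small SEs are classified accurately;
# the examinee near the cut (0.1 with SE 0.3) drags the average down.
print(round(rudner_accuracy([-1.5, 0.1, 2.0], [0.3, 0.3, 0.3], 0.0), 3))
```

Accuracy is driven by how many examinees sit close to the cut score relative to their standard errors, which is why test length (via the SE) matters for classification decisions.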
Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi – Applied Psychological Measurement, 2013
Variable-length computerized adaptive testing (VL-CAT) allows both items and test length to be "tailored" to examinees, thereby achieving the measurement goal (e.g., scoring precision or classification) with as few items as possible. Several popular test termination rules depend on the standard error of the ability estimate, which in turn depends…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Length, Ability
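The standard-error termination rule described in this abstract can be sketched in a few lines: stop once the SE of the ability estimate, computed from accumulated Fisher information, drops below a target. This is a generic illustration with hypothetical 2PL item parameters, not the article's specific rules.

```python
import math

def info_2pl(a, b, theta):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def should_stop(items_administered, theta_hat, se_target=0.30):
    """SE-based termination: stop when the standard error of the
    ability estimate, 1/sqrt(total information), reaches the target."""
    total_info = sum(info_2pl(a, b, theta_hat) for a, b in items_administered)
    if total_info == 0.0:
        return False
    return 1.0 / math.sqrt(total_info) <= se_target

# Hypothetical (discrimination, difficulty) pairs.
items = [(1.2, 0.0), (1.0, -0.5), (1.5, 0.3), (0.9, 1.0)]
print(should_stop(items[:2], 0.0))   # False: SE still about 1.3
print(should_stop(items * 9, 0.0))   # True: 36 items push SE below 0.30
```

Because information depends on the current ability estimate, the number of items needed to hit the SE target varies across examinees, which is the point of variable-length CAT.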
Stucky, Brian D.; Thissen, David; Edelen, Maria Orlando – Applied Psychological Measurement, 2013
Test developers often need to create unidimensional scales from multidimensional data. For item analysis, "marginal trace lines" capture the relation with the general dimension while accounting for nuisance dimensions and may prove to be a useful technique for creating short-form tests. This article describes the computations needed to obtain…
Descriptors: Test Construction, Test Length, Item Analysis, Item Response Theory
Lei, Pui-Wa; Zhao, Yu – Applied Psychological Measurement, 2012
Vertical scaling is necessary to facilitate comparison of scores from test forms of different difficulty levels. It is widely used to enable the tracking of student growth in academic performance over time. Most previous studies on vertical scaling methods assume relatively long tests and large samples. Little is known about their performance when…
Descriptors: Scaling, Item Response Theory, Test Length, Sample Size
Wang, Chun; Chang, Hua-Hua; Boughton, Keith A. – Applied Psychological Measurement, 2013
Multidimensional computerized adaptive testing (MCAT) is able to provide a vector of ability estimates for each examinee, which could be used to provide a more informative profile of an examinee's performance. The current literature on MCAT focuses on the fixed-length tests, which can generate less accurate results for those examinees whose…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Length, Item Banks
Tendeiro, Jorge N.; Meijer, Rob R. – Applied Psychological Measurement, 2013
To classify an item score pattern as not fitting a nonparametric item response theory (NIRT) model, the probability of exceedance (PE) of an observed response vector x can be determined as the sum of the probabilities of all response vectors that are, at most, as likely as x, conditional on the test's total score. Vector x is to be considered…
Descriptors: Probability, Nonparametric Statistics, Goodness of Fit, Test Length
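The probability of exceedance defined in this abstract can be computed by brute force for short tests: enumerate all response vectors with the same total score as x, and sum the conditional probabilities of those at most as likely as x. The item success probabilities below are hypothetical, and the enumeration is only feasible for small item counts.

```python
from itertools import product

def pattern_prob(pattern, p):
    """Probability of a 0/1 response pattern given item success
    probabilities p (local independence assumed)."""
    prob = 1.0
    for x, pi in zip(pattern, p):
        prob *= pi if x == 1 else 1.0 - pi
    return prob

def probability_of_exceedance(x, p):
    """PE of pattern x: total probability, conditional on x's total
    score, of all patterns at most as likely as x."""
    s = sum(x)
    px = pattern_prob(x, p)
    same_score = [y for y in product((0, 1), repeat=len(p)) if sum(y) == s]
    denom = sum(pattern_prob(y, p) for y in same_score)
    numer = sum(pattern_prob(y, p) for y in same_score
                if pattern_prob(y, p) <= px + 1e-12)
    return numer / denom

p = [0.9, 0.7, 0.5, 0.3]   # hypothetical success probabilities, easy to hard
# A Guttman-consistent pattern (missing only the hardest item) has PE = 1;
print(probability_of_exceedance((1, 1, 1, 0), p))
# an aberrant pattern (only the hardest item correct) has a small PE.
print(probability_of_exceedance((0, 0, 0, 1), p))
```

A small PE flags the pattern as unlikely among all patterns with the same total score, which is the basis for classifying it as misfitting.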
Wang, Wen-Chung; Liu, Chen-Wei; Wu, Shiu-Lien – Applied Psychological Measurement, 2013
The random-threshold generalized unfolding model (RTGUM) was developed by treating the thresholds in the generalized unfolding model as random effects rather than fixed effects to account for the subjective nature of the selection of categories in Likert items. The parameters of the new model can be estimated with the JAGS (Just Another Gibbs…
Descriptors: Computer Assisted Testing, Adaptive Testing, Models, Bayesian Statistics
Yao, Lihua – Applied Psychological Measurement, 2013
Through simulated data, five multidimensional computerized adaptive testing (MCAT) selection procedures with varying test lengths are examined and compared using different stopping rules. Fixed item exposure rates are used for all the items, and the Priority Index (PI) method is used for the content constraints. Two stopping rules, standard error…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection
de la Torre, Jimmy; Song, Hao; Hong, Yuan – Applied Psychological Measurement, 2011
Lack of sufficient reliability is the primary impediment for generating and reporting subtest scores. Several current methods of subscore estimation do so either by incorporating the correlational structure among the subtest abilities or by using the examinee's performance on the overall test. This article conducted a systematic comparison of four…
Descriptors: Item Response Theory, Scoring, Methods, Comparative Analysis
Magis, David; Béland, Sébastien; Raîche, Gilles – Applied Psychological Measurement, 2011
In this study, the estimation of extremely large or extremely small proficiency levels, given the item parameters of a logistic item response model, is investigated. On one hand, the estimation of proficiency levels by maximum likelihood (ML), despite being asymptotically unbiased, may yield infinite estimates. On the other hand, with an…
Descriptors: Test Length, Computation, Item Response Theory, Maximum Likelihood Statistics
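The infinite-estimate problem this abstract refers to is easy to demonstrate: for a perfect (or zero) score, the likelihood is monotone in ability, so ML has no finite maximizer. A minimal Rasch-model illustration with hypothetical item difficulties:

```python
import math

def rasch_loglik(theta, responses, b):
    """Rasch log-likelihood of a 0/1 response pattern at ability theta."""
    ll = 0.0
    for x, bi in zip(responses, b):
        p = 1.0 / (1.0 + math.exp(-(theta - bi)))
        ll += math.log(p) if x == 1 else math.log(1.0 - p)
    return ll

b = [-1.0, 0.0, 1.0]   # hypothetical item difficulties
perfect = [1, 1, 1]    # all items answered correctly

# For a perfect score the log-likelihood keeps increasing in theta,
# so the ML proficiency estimate diverges to +infinity.
for theta in (0.0, 2.0, 4.0, 8.0):
    print(theta, round(rasch_loglik(theta, perfect, b), 4))
```

This is why alternatives such as Bayesian or weighted-likelihood estimators are attractive at extreme score levels: they keep estimates finite where ML does not.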
Finkelman, Matthew D.; Smits, Niels; Kim, Wonsuk; Riley, Barth – Applied Psychological Measurement, 2012
The Center for Epidemiologic Studies-Depression (CES-D) scale is a well-known self-report instrument that is used to measure depressive symptomatology. Respondents who take the full-length version of the CES-D are administered a total of 20 items. This article investigates the use of curtailment and stochastic curtailment (SC), two sequential…
Descriptors: Measures (Individuals), Depression (Psychology), Test Length, Computer Assisted Testing
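Of the two sequential methods named above, deterministic curtailment is the simpler: stop administering items as soon as the classification decision can no longer change. The sketch below illustrates only that deterministic idea (not the stochastic variant), using the CES-D's 20-item, 0-3 scoring format and the commonly used cutoff of 16; treat the parameter choices as illustrative assumptions.

```python
def curtailed_decision(responses_so_far, n_items, max_item_score, cutoff):
    """Deterministic curtailment: return the classification as soon as
    it is fixed regardless of how the remaining items are answered,
    or None if testing must continue."""
    current = sum(responses_so_far)
    remaining = n_items - len(responses_so_far)
    best_case = current + remaining * max_item_score
    if current >= cutoff:
        return "positive"   # already at or above the cutoff
    if best_case < cutoff:
        return "negative"   # cutoff is no longer reachable
    return None             # undetermined: administer the next item

print(curtailed_decision([3] * 6, 20, 3, 16))   # positive after 6 items
print(curtailed_decision([0] * 15, 20, 3, 16))  # negative: 15 points max remain
print(curtailed_decision([1] * 5, 20, 3, 16))   # None: keep testing
```

Stochastic curtailment extends this by also stopping when the final decision is merely highly probable, trading a small error rate for further reductions in test length.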
Culpepper, Steven Andrew – Applied Psychological Measurement, 2012
Measurement error significantly biases interaction effects and distorts researchers' inferences regarding interactive hypotheses. This article focuses on the single-indicator case and shows how to accurately estimate group slope differences by disattenuating interaction effects with errors-in-variables (EIV) regression. New analytic findings were…
Descriptors: Evidence, Test Length, Interaction, Regression (Statistics)
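The attenuation problem this abstract addresses can be simulated directly: measurement error in a predictor shrinks the OLS slope by the reliability factor, and dividing by the (known) reliability recovers it. This is a generic single-indicator disattenuation demo with simulated data, not the article's analytic results.

```python
import math
import random

def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

random.seed(0)
n, true_slope, rel = 20000, 0.8, 0.7   # rel = reliability of observed scores

x_true = [random.gauss(0, 1) for _ in range(n)]
err_sd = math.sqrt((1 - rel) / rel)    # makes var(true)/var(observed) = rel
x_obs = [t + random.gauss(0, err_sd) for t in x_true]
y = [true_slope * t + random.gauss(0, 0.5) for t in x_true]

naive = ols_slope(x_obs, y)            # attenuated toward zero, about rel * slope
corrected = naive / rel                # single-indicator EIV disattenuation
print(round(naive, 2), round(corrected, 2))
```

With group-specific reliabilities the same correction applies within each group, which is how disattenuation recovers group slope differences in interaction tests.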
Wyse, Adam E.; Hao, Shiqi – Applied Psychological Measurement, 2012
This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…
Descriptors: Item Response Theory, Classification, Accuracy, Reliability
Nandakumar, Ratna; Yu, Feng; Zhang, Yanwei – Applied Psychological Measurement, 2011
DETECT is a nonparametric methodology to identify the dimensional structure underlying test data. The associated DETECT index, "D_max," denotes the degree of multidimensionality in data. Conditional covariances (CCOV) are the building blocks of this index. In specifying population CCOVs, the latent test composite θ_TT…
Descriptors: Nonparametric Statistics, Statistical Analysis, Tests, Data
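The conditional covariances that serve as DETECT's building blocks can be estimated by grouping examinees on the rest score and averaging within-group item-pair covariances. The full DETECT procedure involves more (cluster search over item partitions, bias corrections); this sketch shows only the CCOV ingredient, on hypothetical data.

```python
import random
from collections import defaultdict

def conditional_covariance(data, i, j):
    """Average covariance of items i and j conditional on the rest
    score (total score on all other items), weighted by the number
    of examinees at each rest score."""
    groups = defaultdict(list)
    for row in data:
        rest = sum(row) - row[i] - row[j]
        groups[rest].append((row[i], row[j]))
    weighted, n = 0.0, 0
    for pairs in groups.values():
        k = len(pairs)
        if k < 2:
            continue
        mi = sum(a for a, _ in pairs) / k
        mj = sum(b for _, b in pairs) / k
        cov = sum((a - mi) * (b - mj) for a, b in pairs) / k
        weighted += k * cov
        n += k
    return weighted / n if n else 0.0

# For unidimensional (here: mutually independent) items, CCOVs hover near zero;
# systematically positive CCOVs within an item cluster signal multidimensionality.
random.seed(2)
data = [[random.randint(0, 1) for _ in range(4)] for _ in range(2000)]
print(conditional_covariance(data, 0, 1))
```

D_max aggregates signed CCOVs over the best-fitting partition of items, so the sign pattern of these pairwise values is what carries the dimensional information.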
Roberts, James S.; Thompson, Vanessa M. – Applied Psychological Measurement, 2011
A marginal maximum a posteriori (MMAP) procedure was implemented to estimate item parameters in the generalized graded unfolding model (GGUM). Estimates from the MMAP method were compared with those derived from marginal maximum likelihood (MML) and Markov chain Monte Carlo (MCMC) procedures in a recovery simulation that varied sample size,…
Descriptors: Statistical Analysis, Markov Processes, Computation, Monte Carlo Methods