Showing 1 to 15 of 110 results
Peer reviewed
Geerlings, Hanneke; van der Linden, Wim J.; Glas, Cees A. W. – Applied Psychological Measurement, 2013
Optimal test-design methods are applied to rule-based item generation. Three different cases of automated test design are presented: (a) test assembly from a pool of pregenerated, calibrated items; (b) test generation on the fly from a pool of calibrated item families; and (c) test generation on the fly directly from calibrated features defining…
Descriptors: Test Construction, Test Items, Item Banks, Automation
Peer reviewed
Yao, Lihua – Applied Psychological Measurement, 2013
Through simulated data, five multidimensional computerized adaptive testing (MCAT) selection procedures with varying test lengths are examined and compared using different stopping rules. Fixed item exposure rates are used for all the items, and the Priority Index (PI) method is used for the content constraints. Two stopping rules, standard error…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection
Peer reviewed
Yao, Lihua – Applied Psychological Measurement, 2011
The No Child Left Behind Act requires state assessments to report not only overall scores but also domain scores. To see the information on students' overall achievement, progress, and detailed strengths and weaknesses, and thereby identify areas for improvement in educational quality, students' performances across years or across forms need to be…
Descriptors: Scores, Item Response Theory, Achievement Tests, Test Items
Peer reviewed
Chen, Shu-Ying – Applied Psychological Measurement, 2010
To date, exposure control procedures that are designed to control test overlap in computerized adaptive tests (CATs) are based on the assumption of item sharing between pairs of examinees. However, in practice, examinees may obtain test information from more than one previous test taker. This larger scope of information sharing needs to be…
Descriptors: Computer Assisted Testing, Adaptive Testing, Methods, Test Items
Peer reviewed
Babcock, Ben – Applied Psychological Measurement, 2011
Relatively little research has been conducted with the noncompensatory class of multidimensional item response theory (MIRT) models. A Monte Carlo simulation study was conducted exploring the estimation of a two-parameter noncompensatory item response theory (IRT) model. The estimation method used was a Metropolis-Hastings within Gibbs algorithm…
Descriptors: Item Response Theory, Sampling, Computation, Statistical Analysis
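In the noncompensatory MIRT class Babcock studies, the response probability is the product of per-dimension 2PL terms, so strength on one dimension cannot offset a deficit on another. A minimal sketch of that product form (illustrative parameter values, not taken from the article):

```python
import math

def noncompensatory_2pl(theta, a, b):
    """Noncompensatory MIRT: product of per-dimension 2PL probabilities.
    theta, a, b are equal-length sequences over dimensions."""
    p = 1.0
    for th, ai, bi in zip(theta, a, b):
        p *= 1.0 / (1.0 + math.exp(-ai * (th - bi)))
    return p

# High ability on dimension 1 cannot compensate for low dimension 2:
p_mixed = noncompensatory_2pl([3.0, -3.0], [1.0, 1.0], [0.0, 0.0])
p_avg = noncompensatory_2pl([0.0, 0.0], [1.0, 1.0], [0.0, 0.0])  # 0.25
```

The multiplicative structure is what makes the model "noncompensatory": the joint probability is bounded above by the weakest dimension's term.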
Peer reviewed
Green, Bert F. – Applied Psychological Measurement, 2011
This article refutes a recent claim that computer-based tests produce biased scores for very proficient test takers who make mistakes on one or two initial items and that the "bias" can be reduced by using a four-parameter IRT model. Because the same effect occurs with pattern scores on nonadaptive tests, the effect results from IRT scoring, not…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Bias, Item Response Theory
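The four-parameter IRT model at issue in Green's rebuttal adds an upper asymptote d < 1 to the 3PL, so even highly proficient examinees retain some probability of missing an item. A sketch of the standard 4PL form (parameter values are illustrative):

```python
import math

def irt_4pl(theta, a, b, c=0.0, d=1.0):
    """Four-parameter logistic model: lower asymptote c (guessing),
    upper asymptote d (slipping ceiling)."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

# With d = 0.95, the model tolerates occasional misses at high theta,
# which is the mechanism claimed to reduce the "early mistake" penalty:
p_high = irt_4pl(4.0, a=1.0, b=0.0, c=0.2, d=0.95)
```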
Peer reviewed
Yen, Yung-Chin; Ho, Rong-Guey; Laio, Wen-Wei; Chen, Li-Ju; Kuo, Ching-Chin – Applied Psychological Measurement, 2012
In a selected response test, aberrant responses such as careless errors and lucky guesses might cause error in ability estimation because these responses do not actually reflect the knowledge that examinees possess. In a computerized adaptive test (CAT), these aberrant responses could further cause serious estimation error due to dynamic item…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Response Style (Tests)
Peer reviewed
Ip, Edward H. – Applied Psychological Measurement, 2010
The testlet response model is designed for handling items that are clustered, such as those embedded within the same reading passage. Although the testlet is a powerful tool for handling item clusters in educational and psychological testing, the interpretations of its item parameters, the conditional correlation between item pairs, and the…
Descriptors: Item Response Theory, Models, Test Items, Correlation
Peer reviewed
DeCarlo, Lawrence T. – Applied Psychological Measurement, 2011
Cognitive diagnostic models (CDMs) attempt to uncover latent skills or attributes that examinees must possess in order to answer test items correctly. The DINA (deterministic input, noisy "and") model is a popular CDM that has been widely used. It is shown here that a logistic version of the model can easily be fit with standard software for…
Descriptors: Bayesian Statistics, Computation, Cognitive Tests, Diagnostic Tests
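The DINA model DeCarlo discusses is conjunctive: an examinee answers correctly with probability 1 − slip when they hold every attribute the item requires, and with the guessing probability otherwise. A minimal sketch of that response rule (illustrative values, not from the article):

```python
def dina_prob(alpha, q, guess, slip):
    """DINA: P(correct) = 1 - slip if the examinee has every required
    attribute (the deterministic 'and'), else guess.
    alpha = examinee's 0/1 attribute vector; q = item's 0/1 Q-matrix row."""
    eta = all(a >= qk for a, qk in zip(alpha, q))  # conjunctive condition
    return (1.0 - slip) if eta else guess

q = [1, 0, 1]  # item requires attributes 1 and 3
p_master = dina_prob([1, 1, 1], q, guess=0.2, slip=0.1)   # 0.9
p_partial = dina_prob([1, 0, 0], q, guess=0.2, slip=0.1)  # 0.2
```

Note the "noisy and": mastery of only some required attributes confers no advantage over mastery of none, which is exactly what distinguishes DINA from compensatory diagnostic models.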
Peer reviewed
Murphy, Daniel L.; Dodd, Barbara G.; Vaughn, Brandon K. – Applied Psychological Measurement, 2010
This study examined the performance of the maximum Fisher's information, the maximum posterior weighted information, and the minimum expected posterior variance methods for selecting items in a computerized adaptive testing system when the items were grouped in testlets. A simulation study compared the efficiency of ability estimation among the…
Descriptors: Simulation, Adaptive Testing, Item Analysis, Item Response Theory
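The maximum Fisher information rule compared in this study selects, at the current ability estimate, the item whose information function is largest; under the 2PL that information is a²P(1 − P). A sketch of one selection step (the pool values are illustrative):

```python
import math

def fisher_info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_item(theta, pool):
    """Index of the pool item with maximum information at theta."""
    return max(range(len(pool)),
               key=lambda i: fisher_info_2pl(theta, *pool[i]))

pool = [(1.0, -1.0), (1.5, 0.0), (1.0, 2.0)]  # (a, b) pairs
best = select_item(0.0, pool)  # picks the high-discrimination item at b = 0
```

Information peaks where difficulty matches ability and grows with the square of discrimination, which is why the middle item wins here.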
Peer reviewed
Barrada, Juan Ramon; Olea, Julio; Ponsoda, Vicente; Abad, Francisco Jose – Applied Psychological Measurement, 2010
In a typical study comparing the relative efficiency of two item selection rules in computerized adaptive testing, the common result is that they simultaneously differ in accuracy and security, making it difficult to reach a conclusion on which is the more appropriate rule. This study proposes a strategy to conduct a global comparison of two or…
Descriptors: Test Items, Simulation, Adaptive Testing, Item Analysis
Peer reviewed
Lopez Rivas, Gabriel E.; Stark, Stephen; Chernyshenko, Oleksandr S. – Applied Psychological Measurement, 2009
The purpose of this simulation study is to investigate the effects of anchor subtest composition on the accuracy of item response theory (IRT) likelihood ratio (LR) differential item functioning (DIF) detection (Thissen, Steinberg, & Wainer, 1988). Here, the IRT LR test was implemented with a free baseline approach wherein a baseline model was…
Descriptors: Simulation, Item Response Theory, Test Bias, Test Items
Peer reviewed
van der Linden, Wim J. – Applied Psychological Measurement, 2009
An adaptive testing method is presented that controls the speededness of a test using predictions of the test takers' response times on the candidate items in the pool. Two different types of predictions are investigated: posterior predictions given the actual response times on the items already administered and posterior predictions that use the…
Descriptors: Simulation, Adaptive Testing, Vocational Aptitude, Bayesian Statistics
Peer reviewed
Waller, Niels G. – Applied Psychological Measurement, 2008
Reliability is a property of test scores from individuals who have been sampled from a well-defined population. Reliability indices, such as coefficient and related formulas for internal consistency reliability (KR-20, Hoyt's reliability), yield lower bound reliability estimates when (a) subjects have been sampled from a single population and when…
Descriptors: Test Items, Reliability, Scores, Psychometrics
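KR-20, one of the internal-consistency lower bounds Waller mentions, is (k/(k−1))·(1 − Σ pⱼqⱼ/σ²ₓ) for k dichotomous items, where pⱼ is the proportion correct on item j and σ²ₓ is the variance of total scores. A small self-contained sketch:

```python
def kr20(responses):
    """KR-20 for a persons-by-items 0/1 response matrix (list of lists).
    Uses the population variance of total scores."""
    n = len(responses)
    k = len(responses[0])
    p = [sum(row[j] for row in responses) / n for j in range(k)]  # item means
    totals = [sum(row) for row in responses]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - sum(pj * (1 - pj) for pj in p) / var_t)

r = kr20([[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]])  # 0.75
```

As the abstract notes, such indices are lower bounds on reliability under the stated sampling conditions, not the reliability itself.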
Peer reviewed
Freund, Philipp Alexander; Hofer, Stefan; Holling, Heinz – Applied Psychological Measurement, 2008
Figural matrix items are a popular task type for assessing general intelligence (Spearman's g). Items of this kind can be constructed rationally, allowing the implementation of computerized generation algorithms. In this study, the influence of different task parameters on the degree of difficulty in matrix items was investigated. A sample of N =…
Descriptors: Test Items, Psychometrics, Internet, Matrices