Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 0
Since 2006 (last 20 years): 15
Descriptor
Multiple Choice Tests: 37
Item Response Theory: 15
Test Items: 13
Comparative Analysis: 8
Evaluation Methods: 8
Equated Scores: 7
Guessing (Tests): 7
Models: 7
Test Format: 7
Computation: 6
Scoring: 6
Source
Applied Psychological Measurement: 37
Publication Type
Journal Articles: 34
Reports - Research: 17
Reports - Evaluative: 13
Reports - Descriptive: 4
Information Analyses: 1
Speeches/Meeting Papers: 1
Education Level
High Schools: 2
Higher Education: 2
Assessments and Surveys
Armed Services Vocational…: 2
Advanced Placement…: 1
California Learning…: 1
Iowa Tests of Basic Skills: 1
Kalender, Ilker – Applied Psychological Measurement, 2012
catcher is a software program designed to compute the ω index, a common statistical index for identifying collusion (cheating) among examinees taking an educational or psychological test. It requires (a) the responses, (b) the ability estimates of the individuals, and (c) the item parameters to perform its computations, and it outputs the results of…
Descriptors: Computer Software, Computation, Statistical Analysis, Cheating
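The ω index rests on comparing the observed number of answer matches between a suspected copier and a source with the number expected under an item response model. Below is a minimal sketch of a standardized match statistic of that general form; the per-item match probabilities are assumed to come from an already-fitted IRT or nominal response model, and this is not the catcher implementation itself.

```python
import numpy as np

def omega_like_index(copier_resp, source_resp, match_prob):
    """Standardized answer-match statistic.

    copier_resp, source_resp : option chosen on each item by the suspected
                               copier and the source
    match_prob : per-item probability, under a fitted model, that the copier
                 would independently choose the source's option (assumed to be
                 supplied; estimating it is the hard part in practice)
    """
    copier = np.asarray(copier_resp)
    source = np.asarray(source_resp)
    p = np.asarray(match_prob, dtype=float)

    observed = np.sum(copier == source)        # observed number of matches
    expected = p.sum()                         # expected matches under the model
    sd = np.sqrt(np.sum(p * (1.0 - p)))        # SD of the match count
    return (observed - expected) / sd

# Example with 5 hypothetical items
z = omega_like_index([1, 2, 0, 3, 1], [1, 2, 0, 3, 2], [0.4, 0.3, 0.5, 0.2, 0.35])
```

Large positive values of such a statistic indicate more matching than the model predicts for independent test takers.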
Chiu, Ting-Wei; Camilli, Gregory – Applied Psychological Measurement, 2013
Guessing behavior is a widely discussed issue in multiple-choice testing. Its primary effect is on number-correct scores for examinees at lower levels of proficiency. This is a systematic error, or bias, which increases observed test scores. Guessing can also inflate random error variance. Correction or adjustment for guessing formulas…
Descriptors: Item Response Theory, Guessing (Tests), Multiple Choice Tests, Error of Measurement
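The classical correction-for-guessing (formula-scoring) adjustment removes the expected gain from blind guessing by subtracting a fraction of the wrong answers. A small illustration of that standard textbook formula, not a method specific to this article:

```python
def corrected_score(num_right, num_wrong, num_options):
    """Classical formula score R - W/(k - 1), where k is the number of options
    per item; the expected gain from blind guessing is subtracted, and omitted
    items are neither rewarded nor penalized."""
    return num_right - num_wrong / (num_options - 1)

# 40 right and 12 wrong on 4-option items: 40 - 12/3 = 36
print(corrected_score(40, 12, 4))
```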
He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei – Applied Psychological Measurement, 2013
Common test items play an important role in equating alternate test forms under the common-item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…
Descriptors: Regression (Statistics), Item Response Theory, Test Items, Equated Scores
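One simple way to screen common items for inconsistent parameter estimates is to compare their difficulty estimates across the two forms and flag unusually large standardized differences. A rough sketch under that assumption; the article's regression-based evaluation is more involved than this.

```python
import numpy as np

def flag_inconsistent_common_items(b_old, b_new, z_cut=3.0):
    """Flag common items whose IRT difficulty estimates differ unusually
    between the old and new forms, using a robust (median/MAD) z-score.

    b_old, b_new : difficulty estimates of the same common items on each form.
    Returns the indices of suspect items.
    """
    d = np.asarray(b_new, dtype=float) - np.asarray(b_old, dtype=float)
    mad = np.median(np.abs(d - np.median(d)))
    z = (d - np.median(d)) / (1.4826 * mad)
    return np.where(np.abs(z) > z_cut)[0]
```

Items flagged this way would typically be inspected, and possibly removed from the anchor, before equating.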
Penfield, Randall D. – Applied Psychological Measurement, 2010
In 2008, Penfield showed that measurement invariance across all response options of a multiple-choice item (correct option and the "J" distractors) can be modeled using a nominal response model that included a differential distractor functioning (DDF) effect for each of the "J" distractors. This article extends this concept to consider how the…
Descriptors: Test Bias, Test Items, Models, Multiple Choice Tests
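Under a nominal response model, each response option has its own slope and intercept, and a differential distractor functioning effect can be pictured as a group-specific shift applied to a distractor's parameters. The sketch below shows the option probabilities with an optional group shift; this is an illustrative parameterization, not necessarily Penfield's exact model.

```python
import numpy as np

def nominal_option_probs(theta, slopes, intercepts, focal_shift=None):
    """Response-option probabilities under a nominal response model:
    P(option k | theta) is proportional to exp(a_k * theta + c_k).

    focal_shift : optional per-option intercept shift for the focal group,
                  a rough stand-in for a DDF effect (illustrative only).
    """
    z = np.asarray(slopes, dtype=float) * theta + np.asarray(intercepts, dtype=float)
    if focal_shift is not None:
        z = z + np.asarray(focal_shift, dtype=float)
    ez = np.exp(z - z.max())                   # numerically stable softmax
    return ez / ez.sum()

# Four options (correct option plus three distractors), hypothetical parameters
probs = nominal_option_probs(0.5, [1.0, -0.2, -0.4, -0.4], [0.8, 0.1, -0.3, -0.6])
```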
Moses, Tim; Deng, Weiling; Zhang, Yu-Li – Applied Psychological Measurement, 2011
Nonequivalent groups with anchor test (NEAT) equating functions that use a single anchor can have accuracy problems when the groups are extremely different and/or when the anchor weakly correlates with the tests being equated. Proposals have been made to address these issues by incorporating more than one anchor into NEAT equating functions. These…
Descriptors: Equated Scores, Tests, Comparative Analysis, Correlation
Lee, Jihyun; Corter, James E. – Applied Psychological Measurement, 2011
Diagnosis of misconceptions or "bugs" in procedural skills is difficult because of their unstable nature. This study addresses this problem by proposing and evaluating a probability-based approach to the diagnosis of bugs in children's multicolumn subtraction performance using Bayesian networks. This approach assumes a causal network relating…
Descriptors: Misconceptions, Probability, Children, Subtraction
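The core computation in probability-based bug diagnosis is the posterior probability of each candidate bug given a student's responses. The sketch below is deliberately simplified and assumes conditional independence of items given the bug; the article uses a full Bayesian network rather than this naive form.

```python
import numpy as np

def bug_posterior(responses, p_correct_given_bug, prior):
    """Posterior probability of each candidate subtraction bug.

    responses           : 0/1 array of item scores for one student
    p_correct_given_bug : matrix [n_bugs, n_items] of P(correct | bug),
                          hypothetical values standing in for the network's
                          conditional probability tables
    prior               : prior probability of each bug
    """
    r = np.asarray(responses)
    p = np.asarray(p_correct_given_bug, dtype=float)
    likelihood = np.prod(np.where(r == 1, p, 1.0 - p), axis=1)
    posterior = likelihood * np.asarray(prior, dtype=float)
    return posterior / posterior.sum()

# Two candidate bugs, three items
print(bug_posterior([1, 0, 0], [[0.9, 0.2, 0.3], [0.8, 0.7, 0.1]], [0.5, 0.5]))
```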
Belov, Dmitry I.; Armstrong, Ronald D. – Applied Psychological Measurement, 2010
This article presents a new method to detect copying on a standardized multiple-choice exam. The method combines two statistical approaches in successive stages. The first stage uses Kullback-Leibler divergence to identify examinees, called subjects, who have demonstrated inconsistent performance during an exam. For each subject the second stage…
Descriptors: Multiple Choice Tests, Cheating, Statistical Analysis, Monte Carlo Methods
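The first stage rests on the Kullback-Leibler divergence between distributions implied by different portions of an examinee's responses; the divergence itself is straightforward to compute. A sketch for discrete distributions follows (how the distributions are constructed in the article is not reproduced here).

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions, e.g., posterior ability
    distributions implied by two subsets of an examinee's responses."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))
```

A large divergence between the two parts of the test signals the kind of inconsistent performance the first stage is designed to catch.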
Wise, Steven L.; DeMars, Christine E. – Applied Psychological Measurement, 2009
Attali (2005) recently demonstrated that Cronbach's coefficient α estimate of reliability for number-right multiple-choice tests will tend to be deflated by speededness, rather than inflated as is commonly believed and taught. Although the methods, findings, and conclusions of Attali (2005) are correct, his article may inadvertently invite a…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Reliability, Computation
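Coefficient alpha for number-right scores is a function of the item variances and the total-score variance, so end-of-test guessing that weakens inter-item consistency pulls the estimate down. A standard computation of alpha from an examinee-by-item score matrix (generic, not specific to either article):

```python
import numpy as np

def coefficient_alpha(scores):
    """Cronbach's alpha from an examinee-by-item score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    X = np.asarray(scores, dtype=float)
    k = X.shape[1]
    item_var = X.var(axis=0, ddof=1).sum()
    total_var = X.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var / total_var)
```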
Belov, Dmitry I. – Applied Psychological Measurement, 2011
This article presents the Variable Match Index (VM-Index), a new statistic for detecting answer copying. The power of the VM-Index relies on two-dimensional conditioning as well as the structure of the test. The asymptotic distribution of the VM-Index is analyzed by reduction to Poisson trials. A computational study comparing the VM-Index with the…
Descriptors: Cheating, Journal Articles, Computation, Comparative Analysis
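The reference to Poisson trials points to the distribution of the number of matches when each item matches independently with its own probability. A sketch of that match-count distribution and its upper-tail probability is shown below; the VM-Index itself involves additional conditioning on the test structure that is not reproduced here.

```python
import numpy as np

def match_count_distribution(match_probs):
    """Distribution of the number of matches over items when item i matches
    independently with probability p_i (Poisson trials / Poisson binomial),
    built by convolving one Bernoulli distribution per item."""
    dist = np.array([1.0])
    for p in match_probs:
        dist = np.convolve(dist, [1.0 - p, p])
    return dist

def upper_tail_probability(match_probs, observed_matches):
    """P(number of matches >= observed) under the independence model."""
    return float(match_count_distribution(match_probs)[observed_matches:].sum())
```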
de la Torre, Jimmy – Applied Psychological Measurement, 2009
Cognitive or skills diagnosis models are discrete latent variable models developed specifically for the purpose of identifying the presence or absence of multiple fine-grained skills. However, applications of these models typically involve dichotomous or dichotomized data, including data from multiple-choice (MC) assessments that are scored as…
Descriptors: Cognitive Measurement, Thinking Skills, Identification, Multiple Choice Tests
Abad, Francisco J.; Olea, Julio; Ponsoda, Vicente – Applied Psychological Measurement, 2009
This article deals with some of the problems that have hindered the application of Samejima's and Thissen and Steinberg's multiple-choice models: (a) parameter estimation difficulties owing to the large number of parameters involved, (b) parameter identifiability problems in the Thissen and Steinberg model, and (c) their treatment of omitted…
Descriptors: Multiple Choice Tests, Models, Computation, Simulation
von Davier, Alina A.; Wilson, Christine – Applied Psychological Measurement, 2008
Dorans and Holland (2000) and von Davier, Holland, and Thayer (2003) introduced measures of the degree to which an observed-score equating function is sensitive to the population on which it is computed. This article extends the findings of Dorans and Holland and of von Davier et al. to item response theory (IRT) true-score equating methods that…
Descriptors: Advanced Placement, Advanced Placement Programs, Equated Scores, Calculus
Attali, Yigal – Applied Psychological Measurement, 2005
Contrary to common belief, reliability estimates of number-right multiple-choice tests are not inflated by speededness. Because examinees guess on questions when they run out of time, the responses to these questions generally show less consistency with the responses of other questions, and the reliability of the test will be decreased. The…
Descriptors: Reliability, Multiple Choice Tests
Petersen, Nancy S. – Applied Psychological Measurement, 2008
This article discusses the five studies included in this issue. Each article addressed the same topic, population invariance of equating. They all used data from major standardized testing programs, and they all used essentially the same statistics to evaluate their results, namely, the root mean square difference and root expected mean square…
Descriptors: Testing Programs, Standardized Tests, Equated Scores, Evaluation Methods
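The root mean square difference statistic summarizes how far subgroup equating results deviate from the total-population result at a given score point. Below is a simplified sketch of a weighted RMS difference of that kind; the standardization conventions used across the five studies are omitted here.

```python
import numpy as np

def rmsd_at_score_point(subgroup_equated, overall_equated, weights):
    """Weighted root mean square difference between subgroup equating results
    and the total-population result at one raw-score point.

    subgroup_equated : equated score produced by each subgroup's function
    overall_equated  : equated score from the total-population function
    weights          : subgroup proportions (should sum to 1)
    """
    d = np.asarray(subgroup_equated, dtype=float) - overall_equated
    w = np.asarray(weights, dtype=float)
    return float(np.sqrt(np.sum(w * d ** 2)))
```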

Kim, Jee-Seon; Hanson, Bradley A. – Applied Psychological Measurement, 2002
Presents a characteristic curve procedure for comparing transformations of the item response theory ability scale assuming the multiple-choice model. Illustrates the use of the method with an example equating American College Testing mathematics tests. (SLD)
Descriptors: Ability, Equated Scores, Item Response Theory, Mathematics Tests
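Characteristic curve procedures of this kind choose the linear scale transformation θ* = Aθ + B that makes the test characteristic curves computed from the two sets of parameter estimates agree as closely as possible. The sketch below states that criterion for a plain 2PL test, which is simpler than the multiple-choice model treated in the article.

```python
import numpy as np
from scipy.optimize import minimize

def tcc(theta, a, b):
    """2PL test characteristic curve: expected number-correct at each theta."""
    theta = np.asarray(theta, dtype=float)[:, None]
    return (1.0 / (1.0 + np.exp(-np.asarray(a) * (theta - np.asarray(b))))).sum(axis=1)

def linking_loss(params, theta_grid, a_new, b_new, a_old, b_old):
    """Squared difference between the old-form TCC and the new-form TCC after
    transforming the new-form parameters with theta* = A*theta + B."""
    A, B = params
    a_star = np.asarray(a_new) / A
    b_star = A * np.asarray(b_new) + B
    return float(np.sum((tcc(theta_grid, a_star, b_star)
                         - tcc(theta_grid, a_old, b_old)) ** 2))

# Hypothetical two-item example; minimize over A and B to obtain the transformation
grid = np.linspace(-3, 3, 31)
fit = minimize(linking_loss, x0=[1.0, 0.0],
               args=(grid, [1.2, 0.8], [-0.5, 0.6], [1.0, 0.9], [-0.3, 0.7]))
A_hat, B_hat = fit.x
```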