Harik, Polina; Baldwin, Peter; Clauser, Brian – Applied Psychological Measurement, 2013
Growing reliance on complex constructed response items has generated considerable interest in automated scoring solutions. Many of these solutions are described in the literature; however, relatively few studies have been published that compare automated scoring strategies. Here, comparisons are made among five strategies for…
Descriptors: Computer Assisted Testing, Automation, Scoring, Comparative Analysis
Finch, Holmes; Habing, Brian – Applied Psychological Measurement, 2007
This Monte Carlo study compares the ability of the parametric bootstrap version of DIMTEST with three goodness-of-fit tests calculated from a fitted NOHARM model to detect violations of the assumption of unidimensionality in testing data. The effectiveness of the procedures was evaluated for different numbers of items, numbers of examinees,…
Descriptors: Guessing (Tests), Testing, Statistics, Monte Carlo Methods
Kluge, Annette – Applied Psychological Measurement, 2008
The use of microworlds (MWs), or complex dynamic systems, in educational testing and personnel selection is hampered by systematic measurement errors because these new and innovative item formats are not adequately controlled for their difficulty. This empirical study introduces a way to operationalize an MW's difficulty and demonstrates the…
Descriptors: Personnel Selection, Self Efficacy, Educational Testing, Computer Uses in Education
Yang, Wen-Ling; Gao, Rui – Applied Psychological Measurement, 2008
This study investigates whether the functions linking number-correct scores to the College-Level Examination Program (CLEP) scaled scores remain invariant over gender groups, using test data on the 16 testlet-based forms of the CLEP College Algebra exam. To be consistent with the operational practice, linking of various test forms to a common…
Descriptors: Mathematics Tests, Algebra, Item Response Theory, Testing Programs
Christensen, Karl Bang; Kreiner, Svend – Applied Psychological Measurement, 2007
Many statistical tests are designed to test the different assumptions of the Rasch model, but only a few are directed at detecting multidimensionality. The Martin-Löf test is an attractive approach, the disadvantage being that its null distribution deviates strongly from the asymptotic chi-square distribution for most realistic sample sizes. A Monte…
Descriptors: Item Response Theory, Monte Carlo Methods, Testing, Models

Fleiss, Joseph L.; Cicchetti, Domenic V. – Applied Psychological Measurement, 1978
The accuracy of the large sample standard error of weighted kappa appropriate to the non-null case was studied by computer simulation for the hypothesis that two independently derived estimates of weighted kappa are equal, and for setting confidence limits around a single value of weighted kappa. (Author/CTM)
Descriptors: Correlation, Hypothesis Testing, Nonparametric Statistics, Reliability