Showing all 10 results
Peer reviewed
Harik, Polina; Baldwin, Peter; Clauser, Brian – Applied Psychological Measurement, 2013
Growing reliance on complex constructed-response items has generated considerable interest in automated scoring solutions. Many of these solutions are described in the literature; however, relatively few published studies directly compare automated scoring strategies. Here, comparisons are made among five strategies for…
Descriptors: Computer Assisted Testing, Automation, Scoring, Comparative Analysis
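The abstract above is truncated and does not name the five strategies, but comparisons of this kind typically reduce to agreement statistics between each automated strategy and a human reference. A minimal sketch with invented data and two hypothetical strategies:

```python
# Invented data: human reference ratings on a 1-5 scale and two
# hypothetical automated strategies with different error levels.
import numpy as np

rng = np.random.default_rng(0)
n = 500
human = rng.integers(1, 6, size=n)
strategy_a = np.clip(human + rng.integers(-1, 2, size=n), 1, 5)
strategy_b = np.clip(human + rng.integers(-2, 3, size=n), 1, 5)

def agreement(auto, ref):
    """Exact agreement rate and Pearson correlation with the reference."""
    return np.mean(auto == ref), np.corrcoef(auto, ref)[0, 1]

for name, scores in (("A", strategy_a), ("B", strategy_b)):
    exact, r = agreement(scores, human)
    print(f"strategy {name}: exact agreement {exact:.2f}, r = {r:.2f}")
```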
Peer reviewed
Wyse, Adam E.; Reckase, Mark D. – Applied Psychological Measurement, 2011
An essential concern in the application of any equating procedure is determining whether tests can be considered equated after the tests have been placed onto a common scale. This article clarifies one equating criterion, the first-order equity property of equating, and develops a new method for evaluating equating that is linked to this…
Descriptors: Lawyers, Licensing Examinations (Professions), Testing Programs, Graphs
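First-order equity requires that, at each ability level, the expected equated score on the new form match the expected score on the reference form. A minimal Monte Carlo check under a Rasch model; the item difficulties and the linear equating function below are invented for illustration:

```python
# First-order equity check under a Rasch model: at each theta, compare
# the mean equated new-form score with the expected reference-form
# score. Difficulties and the equating line are invented.
import numpy as np

rng = np.random.default_rng(1)
b_new = rng.normal(0.2, 1.0, size=40)   # new-form item difficulties
b_ref = rng.normal(0.0, 1.0, size=40)   # reference-form item difficulties

def equate(x):
    return 1.02 * x - 0.8               # toy linear equating function

for theta in (-1.0, 0.0, 1.0):
    # Simulate new-form number-correct scores at this theta, equate them,
    # and compare their mean with the reference-form true score.
    p = 1.0 / (1.0 + np.exp(-(theta - b_new)))
    x = (rng.random((20000, b_new.size)) < p).sum(axis=1)
    lhs = equate(x).mean()                                 # E[eq(X) | theta]
    rhs = np.sum(1.0 / (1.0 + np.exp(-(theta - b_ref))))   # E[Y | theta]
    print(f"theta={theta:+.1f}: E[eq(X)]={lhs:6.2f}  E[Y]={rhs:6.2f}")
```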
Peer reviewed
Raymond, Mark R.; Harik, Polina; Clauser, Brian E. – Applied Psychological Measurement, 2011
Prior research indicates that the overall reliability of performance ratings can be improved by using ordinary least squares (OLS) regression to adjust for rater effects. The present investigation extends previous work by evaluating the impact of OLS adjustment on standard errors of measurement ("SEM") at specific score levels. In…
Descriptors: Performance Based Assessment, Licensing Examinations (Professions), Least Squares Statistics, Item Response Theory
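A minimal sketch of the OLS adjustment idea: fit a two-way main-effects model (examinee plus rater) by least squares and subtract the estimated rater effects from the observed ratings. The rating design and parameter values are invented, and the study's conditional-SEM analysis is not reproduced here:

```python
# OLS adjustment for rater effects on invented data: dummy-coded
# examinee + rater model, fitted rater effects removed from ratings.
import numpy as np

rng = np.random.default_rng(2)
n_examinees, n_raters = 200, 8
true_ability = rng.normal(0, 1, n_examinees)
rater_severity = rng.normal(0, 0.5, n_raters)

# Each examinee is scored by two randomly assigned raters.
rows = []
for i in range(n_examinees):
    for j in rng.choice(n_raters, size=2, replace=False):
        y = true_ability[i] - rater_severity[j] + rng.normal(0, 0.3)
        rows.append((i, j, y))
examinee, rater, rating = map(np.array, zip(*rows))

# Design matrix: intercept + examinee dummies + rater dummies, with the
# first level of each factor dropped to keep the design full rank.
X = np.zeros((len(rating), 1 + (n_examinees - 1) + (n_raters - 1)))
X[:, 0] = 1.0
for k, (i, j) in enumerate(zip(examinee, rater)):
    if i > 0:
        X[k, i] = 1.0
    if j > 0:
        X[k, n_examinees + j - 1] = 1.0

beta, *_ = np.linalg.lstsq(X, rating, rcond=None)
rater_fx = np.concatenate([[0.0], beta[n_examinees:]])

# Adjusted rating = observed rating minus the fitted rater effect.
adjusted = rating - rater_fx[rater]
print("correlation of estimated and true rater severity:",
      np.corrcoef(-rater_fx, rater_severity)[0, 1].round(3))
```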
Peer reviewed
Gilmer, Jerry S. – Applied Psychological Measurement, 1989
The effects of test item disclosure on examinees' equated scores and population passing rates were studied for 5,000 examinees taking a professional licensing examination. Results suggest that the effects of disclosure depended on the nature of the released items. Specific effects on particular examinees are also discussed. (SLD)
Descriptors: Disclosure, Equated Scores, Licensing Examinations (Professions), Professional Education
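A toy simulation in the spirit of the disclosure question: if examinees gain an advantage on released items, how much does the passing rate move? Every quantity below (gain size, cut score, test length) is invented:

```python
# Invented disclosure scenario: boost success rates on released items
# and compare simulated passing rates with and without the boost.
import numpy as np

rng = np.random.default_rng(3)
n, items, disclosed, cut = 5000, 100, 15, 60
p_base = rng.uniform(0.4, 0.9, size=items)   # per-item success rates

def pass_rate(p_item):
    scores = (rng.random((n, items)) < p_item).sum(axis=1)
    return np.mean(scores >= cut)

p_boost = p_base.copy()
p_boost[:disclosed] = np.minimum(p_boost[:disclosed] + 0.10, 1.0)

print("pass rate, no disclosure:  ", round(pass_rate(p_base), 3))
print("pass rate, with disclosure:", round(pass_rate(p_boost), 3))
```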
Peer reviewed
Sykes, Robert C.; Ito, Kyoko – Applied Psychological Measurement, 1997
Evaluated the equivalence of scores and one-parameter logistic model item difficulty estimates obtained from computer-based and paper-and-pencil forms of a licensure examination taken by 418 examinees. Neither order nor mode of administration affected the equivalence. (SLD)
Descriptors: Computer Assisted Testing, Estimation (Mathematics), Health Personnel, Item Response Theory
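A sketch of the equivalence check implied by the abstract: correlate the difficulty estimates from the two administration modes and test the mean paired difference. The estimates below are simulated, not the study's data:

```python
# Simulated 1PL difficulty estimates from two administration modes,
# compared by correlation and a paired t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
true_b = rng.normal(0, 1, size=60)
b_cbt = true_b + rng.normal(0, 0.15, size=60)     # computer-based
b_paper = true_b + rng.normal(0, 0.15, size=60)   # paper-and-pencil

r = np.corrcoef(b_cbt, b_paper)[0, 1]
t, p = stats.ttest_rel(b_cbt, b_paper)
print(f"r = {r:.3f}, mean diff = {np.mean(b_cbt - b_paper):+.3f}, p = {p:.3f}")
```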
Peer reviewed
Luecht, Richard M. – Applied Psychological Measurement, 1996
The example of a medical licensure test is used to demonstrate situations in which complex, integrated content must be balanced at the total test level for validity reasons, but items assigned to reportable subscore categories may be used under a multidimensional item response theory adaptive paradigm to improve subscore reliability. (SLD)
Descriptors: Adaptive Testing, Certification, Computer Assisted Testing, Licensing Examinations (Professions)
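A sketch of subscore-targeted adaptive selection under a compensatory multidimensional 2PL: administer the item that adds the most Fisher information on the dimension measured least precisely so far. The item parameters and greedy rule are illustrative, not Luecht's operational design:

```python
# Greedy MIRT item selection aimed at balancing subscore precision.
# Item parameters, test length, and the selection rule are invented.
import numpy as np

rng = np.random.default_rng(5)
n_items, n_dims = 120, 3
A = np.abs(rng.normal(0.8, 0.3, size=(n_items, n_dims))) * \
    (rng.random((n_items, n_dims)) < 0.5)      # sparse discrimination loadings
d = rng.normal(0, 1, size=n_items)

def p_correct(theta, a, d_j):
    return 1.0 / (1.0 + np.exp(-(a @ theta - d_j)))

theta_hat = np.zeros(n_dims)
info = np.full(n_dims, 1e-6)       # accumulated information per dimension
administered = set()

for _ in range(12):
    target = int(np.argmin(info))  # weakest subscore dimension so far
    best, best_gain = None, -1.0
    for j in range(n_items):
        if j in administered:
            continue
        p = p_correct(theta_hat, A[j], d[j])
        gain = p * (1 - p) * A[j, target] ** 2
        if gain > best_gain:
            best, best_gain = j, gain
    administered.add(best)
    p = p_correct(theta_hat, A[best], d[best])
    info += p * (1 - p) * A[best] ** 2   # diagonal information update

print("items given:", sorted(administered))
print("per-dimension information:", info.round(2))
```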
Peer reviewed
Sheehan, Kathleen; Lewis, Charles – Applied Psychological Measurement, 1992
A procedure is introduced for determining the effect of testlet nonequivalence on operating characteristics of a testlet-based computerized mastery test (CMT). The procedure, which involves estimating the CMT decision rule twice with testlet likelihoods treated as equivalent or nonequivalent, is demonstrated with testlet pools from the Architect…
Descriptors: Bayesian Statistics, Computer Assisted Testing, Computer Simulation, Equations (Mathematics)
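A minimal two-state mastery-test sketch: update the posterior probability of mastery after each testlet from binomial likelihoods, then apply pass/fail/continue thresholds. Treating testlets as equivalent means sharing one likelihood pair across testlets; the nonequivalent case gives each testlet its own pair. All numbers are invented:

```python
# Two-state CMT with binomial testlet likelihoods; the "equivalent"
# run shares one (p_master, p_nonmaster) pair across testlets.
from math import comb

def binom_lik(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def run_cmt(testlet_scores, params, prior=0.5, lo=0.05, hi=0.95):
    """testlet_scores: list of (correct, n_items); params: one
    (p_master, p_nonmaster) pair per testlet."""
    post = prior
    for (k, n), (pm, pn) in zip(testlet_scores, params):
        num = post * binom_lik(k, n, pm)
        post = num / (num + (1 - post) * binom_lik(k, n, pn))
        if post >= hi:
            return "pass", post
        if post <= lo:
            return "fail", post
    return "continue", post

scores = [(8, 10), (7, 10), (9, 10)]
equivalent = [(0.80, 0.60)] * 3
nonequivalent = [(0.82, 0.62), (0.75, 0.55), (0.85, 0.65)]
print("equivalent:   ", run_cmt(scores, equivalent))
print("nonequivalent:", run_cmt(scores, nonequivalent))
```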
Peer reviewed
Woodruff, David J.; Sawyer, Richard L. – Applied Psychological Measurement, 1989
Two methods, one non-distributional and one based on the normal distribution, are derived for estimating measures of pass-fail reliability. Both are based on the Spearman-Brown formula and require only a single test administration. Results from a simulation (n=20,000 examinees) and a licensure examination (n=4,828 examinees) illustrate these methods. (SLD)
Descriptors: Equations (Mathematics), Estimation (Mathematics), Licensing Examinations (Professions), Measures (Individuals)
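The Spearman-Brown step-up at the heart of single-administration reliability estimation is rho_full = 2r/(1 + r) for a half-test correlation r. The simulation below illustrates only this basic step, not the paper's pass-fail derivations:

```python
# Split-half reliability with the Spearman-Brown step-up on simulated
# Rasch data; the examinee count mirrors the licensure example.
import numpy as np

rng = np.random.default_rng(6)
n_examinees, n_items = 4828, 50
ability = rng.normal(0, 1, n_examinees)
b = rng.normal(0, 1, n_items)
p = 1 / (1 + np.exp(-(ability[:, None] - b[None, :])))
responses = (rng.random((n_examinees, n_items)) < p).astype(int)

half1 = responses[:, 0::2].sum(axis=1)   # odd-numbered items
half2 = responses[:, 1::2].sum(axis=1)   # even-numbered items

r_half = np.corrcoef(half1, half2)[0, 1]
rho_full = 2 * r_half / (1 + r_half)     # Spearman-Brown step-up
print(f"half-test r = {r_half:.3f}, stepped-up reliability = {rho_full:.3f}")
```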
Peer reviewed
Norcini, John; And Others – Applied Psychological Measurement, 1991
Effects of the number of experts and the number of common items on the scaling of cutting scores from expert judgments were studied for 11,917 physicians taking two forms of a medical specialty examination. Increasing either number reduced error; beyond 5 experts and 25 common items, further reductions were small. (SLD)
Descriptors: Comparative Testing, Cutting Scores, Equated Scores, Estimation (Mathematics)
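A toy Monte Carlo of the number-of-experts effect: each expert's recommended cut is the true cut plus noise, the panel cut is their mean, and its standard error shrinks roughly as one over the square root of the panel size, with small gains beyond about 5 experts. All numbers are invented:

```python
# Standard error of a mean panel cut score as experts are added.
import numpy as np

rng = np.random.default_rng(7)
true_cut, judge_sd, reps = 70.0, 4.0, 20000
for n_experts in (2, 5, 10, 25):
    cuts = true_cut + rng.normal(0, judge_sd, size=(reps, n_experts))
    se = cuts.mean(axis=1).std()
    print(f"{n_experts:2d} experts: SE of panel cut = {se:.2f}")
```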
Peer reviewed
Sireci, Stephen G.; Geisinger, Kurt F. – Applied Psychological Measurement, 1995
An expanded version of the method of content evaluation proposed by S. G. Sireci and K. F. Geisinger (1992) was evaluated with respect to a national licensure examination and a nationally standardized social studies achievement test. Two groups of 15 subject-matter experts rated the similarity and content relevance of the items. (SLD)
Descriptors: Achievement Tests, Cluster Analysis, Construct Validity, Content Validity
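A sketch of the content-structure analysis this method implies: average the experts' item-similarity ratings, convert them to distances, and cluster to see whether items group by their intended content areas. The ratings below are simulated, not the study's data:

```python
# Hierarchical clustering of items from simulated expert similarity
# ratings; recovered clusters are compared with intended content areas.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(8)
n_items = 12
intended = np.repeat([0, 1, 2], 4)   # three content areas, 4 items each

# Simulated mean similarity ratings on a 1-5 scale: higher within areas.
sim = np.where(intended[:, None] == intended[None, :], 4.0, 2.0)
sim = sim + rng.normal(0, 0.3, sim.shape)
sim = (sim + sim.T) / 2              # symmetrize
np.fill_diagonal(sim, 5.0)

dist = np.clip(5.0 - sim, 0.0, None)   # similarity -> distance
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="average")
recovered = fcluster(Z, t=3, criterion="maxclust")
print("intended: ", intended)
print("recovered:", recovered)
```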