Showing all 11 results
Peer reviewed
Keller, Lisa A.; Clauser, Brian E.; Swanson, David B. – Advances in Health Sciences Education, 2010
In recent years, demand for performance assessments has continued to grow. However, performance assessments are notorious for low reliability, in particular low reliability resulting from task specificity. Since reliability analyses typically treat the performance tasks as randomly sampled from an infinite universe of tasks, these estimates…
Descriptors: Generalizability Theory, Test Reliability, Performance Based Assessment, Error of Measurement
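The task-specificity issue in the abstract above invites a small numeric illustration. The following sketch is not the authors' analysis: it generates invented ratings for a person × task design, estimates the variance components by expected mean squares, and shows how the generalizability coefficient grows with the number of tasks. All sample sizes, effect magnitudes, and variable names are assumptions made for the example.

```python
# Minimal sketch (not the authors' code): variance components for a
# person x task (p x t) generalizability study, the single-facet design
# implied when tasks are treated as randomly sampled from a universe.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings: 50 examinees x 8 performance tasks.
n_p, n_t = 50, 8
true_ability = rng.normal(0, 1.0, size=(n_p, 1))       # person effect
task_difficulty = rng.normal(0, 0.5, size=(1, n_t))    # task effect
noise = rng.normal(0, 1.2, size=(n_p, n_t))            # p x t interaction + error
X = true_ability + task_difficulty + noise

grand = X.mean()
ms_p = n_t * np.sum((X.mean(axis=1) - grand) ** 2) / (n_p - 1)
ms_t = n_p * np.sum((X.mean(axis=0) - grand) ** 2) / (n_t - 1)
resid = X - X.mean(axis=1, keepdims=True) - X.mean(axis=0, keepdims=True) + grand
ms_pt = np.sum(resid ** 2) / ((n_p - 1) * (n_t - 1))

# Expected-mean-squares estimates of the variance components.
var_pt = ms_pt                      # person x task interaction (confounded with error)
var_p = max((ms_p - ms_pt) / n_t, 0.0)
var_t = max((ms_t - ms_pt) / n_p, 0.0)

# Large var_pt relative to var_p is the "task specificity" problem:
# the generalizability coefficient stays low unless many tasks are used.
for k in (4, 8, 16):
    g = var_p / (var_p + var_pt / k)
    print(f"tasks={k:2d}  E(rho^2)={g:.2f}")
```

With these invented components, the person × task interaction dominates, so the coefficient stays modest until the task sample is fairly large, which is the practical face of the task-specificity problem the abstract describes.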
Peer reviewed
Raymond, Mark R.; Harik, Polina; Clauser, Brian E. – Applied Psychological Measurement, 2011
Prior research indicates that the overall reliability of performance ratings can be improved by using ordinary least squares (OLS) regression to adjust for rater effects. The present investigation extends previous work by evaluating the impact of OLS adjustment on standard errors of measurement ("SEM") at specific score levels. In…
Descriptors: Performance Based Assessment, Licensing Examinations (Professions), Least Squares Statistics, Item Response Theory
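As a companion to the abstract above, here is a minimal sketch of OLS adjustment for rater effects: ratings are regressed on examinee and rater indicator variables, and each rater's estimated stringency is subtracted from the scores that rater assigned. The design (three raters per examinee), the effect sizes, and every variable name are invented for illustration; this is not the authors' procedure.

```python
# Minimal sketch (assumptions, not the published procedure): adjust ratings
# for rater stringency by fitting rating ~ examinee + rater effects with
# ordinary least squares, then removing the estimated rater effects.
import numpy as np

rng = np.random.default_rng(1)
n_examinees, n_raters = 200, 10

# Hypothetical sparse design: each examinee is scored by 3 random raters.
rows = []
ability = rng.normal(0, 1, n_examinees)
stringency = rng.normal(0, 0.5, n_raters)
for e in range(n_examinees):
    for r in rng.choice(n_raters, size=3, replace=False):
        rows.append((e, r, ability[e] - stringency[r] + rng.normal(0, 0.7)))
e_idx, r_idx, y = map(np.array, zip(*rows))

# Design matrix: examinee dummies plus rater dummies (last rater as baseline).
Z = np.zeros((len(y), n_examinees + n_raters - 1))
Z[np.arange(len(y)), e_idx] = 1.0
mask = r_idx < n_raters - 1
Z[np.where(mask)[0], n_examinees + r_idx[mask]] = 1.0

beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
rater_effect = np.append(beta[n_examinees:], 0.0)  # baseline rater = 0

# Adjusted rating: observed rating minus the rater's estimated stringency.
y_adj = y - rater_effect[r_idx]
print("corr(raw mean, ability):     ",
      np.corrcoef([y[e_idx == e].mean() for e in range(n_examinees)], ability)[0, 1])
print("corr(adjusted mean, ability):",
      np.corrcoef([y_adj[e_idx == e].mean() for e in range(n_examinees)], ability)[0, 1])
```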
Peer reviewed
Harik, Polina; Clauser, Brian E.; Grabovsky, Irina; Nungester, Ronald J.; Swanson, Dave; Nandakumar, Ratna – Journal of Educational Measurement, 2009
The present study examined the long-term usefulness of estimated parameters used to adjust the scores from a performance assessment to account for differences in rater stringency. Ratings from four components of the USMLE® Step 2 Clinical Skills Examination were analyzed. A generalizability-theory framework was used to examine the extent to…
Descriptors: Generalizability Theory, Performance Based Assessment, Performance Tests, Clinical Experience
Peer reviewed
Clauser, Brian E. – Applied Psychological Measurement, 2000
Provides a conceptual framework for the development of scoring procedures for performance assessments. The framework considers: (1) aspects of the performance to be scored; (2) criteria to evaluate aspects of the performance; (3) development of scoring criteria; and (4) application of scoring criteria. (SLD)
Descriptors: Criteria, Models, Performance Based Assessment, Scoring
Peer reviewed
Clauser, Brian E.; Kane, Michael T.; Swanson, David B. – Applied Measurement in Education, 2002
Attempts to place the issues associated with computer-automated scoring within the context of current validity theory and presents a taxonomy of automated scoring procedures as a framework for discussing threats to validity that may take on increased importance for specific approaches to automated scoring. (SLD)
Descriptors: Classification, Computer Uses in Education, Performance Based Assessment, Test Construction
Peer reviewed
Clauser, Brian E.; Clyman, Stephen G.; Swanson, David B. – Journal of Educational Measurement, 1999
Two studies focused on aspects of the rating process in performance assessment. The first, which involved 15 raters and about 400 medical students, made the "committee" facet of raters working in groups explicit, and the second, which involved about 200 medical students and four raters, made the "rating-occasion" facet…
Descriptors: Error Patterns, Evaluation Methods, Evaluators, Higher Education
Peer reviewed
Clauser, Brian E.; Harik, Polina; Clyman, Stephen G. – Journal of Educational Measurement, 2000
Used generalizability theory to assess the impact of using independent, randomly equivalent groups of experts to develop scoring algorithms for computer simulation tasks designed to measure physicians' patient management skills. Results with three groups of four medical school faculty members each suggest that the impact of the expert group may be…
Descriptors: Computer Simulation, Generalizability Theory, Performance Based Assessment, Physicians
Peer reviewed
Clauser, Brian E.; Margolis, Melissa J.; Clyman, Stephen G.; Ross, Linette P. – Journal of Educational Measurement, 1997
Research on automated scoring is extended by comparing alternative automated systems for scoring a computer simulation of physicians' patient management skills. A regression-based system is more highly correlated with experts' evaluations than a system that uses complex rules to map performances into score levels, but both approaches are feasible.…
Descriptors: Algorithms, Automation, Comparative Analysis, Computer Assisted Testing
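A toy version of the comparison described above might look like the sketch below: a regression-based scorer whose weights are fit to expert ratings versus a rule-based scorer that maps performances onto discrete score levels. The features, thresholds, and weights are hypothetical, not the published scoring systems.

```python
# Minimal sketch (hypothetical features and rules): contrast a regression-based
# scorer, fit to expert ratings, with a rule-based scorer that maps each
# performance onto one of four discrete score levels.
import numpy as np

rng = np.random.default_rng(2)
n = 300

# Hypothetical performance features: counts of beneficial and risky actions.
beneficial = rng.poisson(6, n).astype(float)
risky = rng.poisson(2, n).astype(float)
expert = 2.0 + 0.8 * beneficial - 1.1 * risky + rng.normal(0, 1.0, n)

train, test = np.arange(200), np.arange(200, n)
X = np.column_stack([np.ones(n), beneficial, risky])

# Regression-based scoring: least-squares weights from the training sample.
w, *_ = np.linalg.lstsq(X[train], expert[train], rcond=None)
reg_scores = X[test] @ w

# Rule-based scoring: hand-set thresholds mapping performances to 4 levels.
def rule_score(b, r):
    if r >= 4:             return 1  # too many risky actions
    if b >= 8 and r <= 1:  return 4
    if b >= 5:             return 3
    return 2

rule_scores = np.array([rule_score(b, r) for b, r in zip(beneficial[test], risky[test])])

print("regression vs expert:", np.corrcoef(reg_scores, expert[test])[0, 1])
print("rules      vs expert:", np.corrcoef(rule_scores, expert[test])[0, 1])
```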
Peer reviewed
Clauser, Brian E.; Harik, Polina; Margolis, Melissa J. – Journal of Educational Measurement, 2006
Although multivariate generalizability theory was developed more than 30 years ago, little published research utilizing this framework exists and most of what does exist examines tests built from tables of specifications. In this context, it is assumed that the universe scores from levels of the fixed multivariate facet will be correlated, but the…
Descriptors: Multivariate Analysis, Job Skills, Correlation, Test Items
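One quantity a multivariate generalizability analysis can yield is the correlation between universe scores on two components of the fixed facet, the correlation the abstract above says is usually assumed. The sketch below estimates it from invented data under the assumption of independent errors across components; it illustrates the framework rather than reproducing the authors' analysis.

```python
# Minimal sketch (invented data): a multivariate generalizability analysis for
# two score components observed on the same persons (p x t per component),
# estimating the correlation between universe scores on the two components.
import numpy as np

rng = np.random.default_rng(4)
n_p, n_t = 60, 6

# Hypothetical correlated true scores for the two components of a fixed facet.
cov = np.array([[1.0, 0.6], [0.6, 1.0]])
true = rng.multivariate_normal([0, 0], cov, size=n_p)     # universe scores
X1 = true[:, [0]] + rng.normal(0, 1.0, (n_p, n_t))        # component 1 ratings
X2 = true[:, [1]] + rng.normal(0, 1.0, (n_p, n_t))        # component 2 ratings

def person_variance(X):
    """EMS estimate of universe-score (person) variance in a p x t design."""
    n_p, n_t = X.shape
    grand = X.mean()
    ms_p = n_t * np.sum((X.mean(axis=1) - grand) ** 2) / (n_p - 1)
    resid = X - X.mean(axis=1, keepdims=True) - X.mean(axis=0, keepdims=True) + grand
    ms_pt = np.sum(resid ** 2) / ((n_p - 1) * (n_t - 1))
    return max((ms_p - ms_pt) / n_t, 0.0)

v1, v2 = person_variance(X1), person_variance(X2)

# With independent errors across components, the covariance of person mean
# scores estimates the universe-score covariance directly.
c12 = np.cov(X1.mean(axis=1), X2.mean(axis=1))[0, 1]
print("estimated universe-score correlation:", c12 / np.sqrt(v1 * v2))
```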
Peer reviewed
Clauser, Brian E.; Ross, Linette P.; Clyman, Stephen G.; Rose, Kathie M.; Margolis, Melissa J.; Nungester, Ronald J.; Piemme, Thomas E.; Chang, Lucy; El-Bayoumi, Gigi; Malakoff, Gary L.; Pincetl, Pierre S. – Applied Measurement in Education, 1997
Describes an automated scoring algorithm for a computer-based simulation examination of physicians' patient-management skills. Results with 280 medical students show that scores produced using this algorithm are highly correlated with actual clinician ratings. Scores were also effective in discriminating between case performance judged passing or…
Descriptors: Algorithms, Computer Assisted Testing, Computer Simulation, Evaluators
Peer reviewed
Clauser, Brian E.; And Others – Journal of Educational Measurement, 1995
A scoring algorithm for performance assessments is described that is based on expert judgments but requires the rating of only a sample of performances. A regression-based policy capturing procedure was implemented for clinicians evaluating skills of 280 medical students. Results demonstrate the usefulness of the algorithm. (SLD)
Descriptors: Algorithms, Clinical Diagnosis, Computer Simulation, Educational Assessment
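The policy-capturing idea described above (experts rate only a sample of performances, a regression captures their weighting policy, and the fitted weights score the rest) can be sketched as follows. The features, the latent policy, and the rated-sample size are invented; only the 280-performance total echoes the abstract.

```python
# Minimal sketch (invented data): regression-based policy capturing in which
# experts rate only a sample of performances; the fitted weights then score
# the remaining, unrated performances automatically.
import numpy as np

rng = np.random.default_rng(3)
n_total, n_rated = 280, 60   # experts rate 60 of 280 performances

features = rng.normal(size=(n_total, 3))          # hypothetical process measures
policy = np.array([1.2, -0.6, 0.4])               # latent expert weighting policy
ratings = features @ policy + rng.normal(0, 0.5, n_total)

rated = rng.choice(n_total, size=n_rated, replace=False)
unrated = np.setdiff1d(np.arange(n_total), rated)

# Capture the experts' policy from the rated sample alone.
A = np.column_stack([np.ones(n_rated), features[rated]])
w, *_ = np.linalg.lstsq(A, ratings[rated], rcond=None)

# Apply the captured policy to performances the experts never rated.
pred = np.column_stack([np.ones(len(unrated)), features[unrated]]) @ w
print("captured weights:", np.round(w[1:], 2))
print("corr with held-out expert ratings:", np.corrcoef(pred, ratings[unrated])[0, 1])
```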