Showing all 11 results
Peer reviewed
Keller, Lisa A.; Clauser, Brian E.; Swanson, David B. – Advances in Health Sciences Education, 2010
In recent years, demand for performance assessments has continued to grow. However, performance assessments are notorious for low reliability, in particular low reliability resulting from task specificity. Since reliability analyses typically treat the performance tasks as randomly sampled from an infinite universe of tasks, these estimates…
Descriptors: Generalizability Theory, Test Reliability, Performance Based Assessment, Error of Measurement
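The task-specificity issue in the abstract above invites a small numeric illustration. The following sketch is not the authors' analysis: it generates invented ratings for a person × task design, estimates the variance components by expected mean squares, and shows how the generalizability coefficient grows with the number of tasks. All sample sizes, effect magnitudes, and variable names are assumptions made for the example.

```python
# Minimal sketch (not the authors' code): variance components for a
# person x task (p x t) generalizability study, the single-facet design
# implied when tasks are treated as randomly sampled from a universe.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings: 50 examinees x 8 performance tasks.
n_p, n_t = 50, 8
true_ability = rng.normal(0, 1.0, size=(n_p, 1))       # person effect
task_difficulty = rng.normal(0, 0.5, size=(1, n_t))    # task effect
noise = rng.normal(0, 1.2, size=(n_p, n_t))            # p x t interaction + error
X = true_ability + task_difficulty + noise

grand = X.mean()
ms_p = n_t * np.sum((X.mean(axis=1) - grand) ** 2) / (n_p - 1)
ms_t = n_p * np.sum((X.mean(axis=0) - grand) ** 2) / (n_t - 1)
resid = X - X.mean(axis=1, keepdims=True) - X.mean(axis=0, keepdims=True) + grand
ms_pt = np.sum(resid ** 2) / ((n_p - 1) * (n_t - 1))

# Expected-mean-squares estimates of the variance components.
var_pt = ms_pt                      # person x task interaction (confounded with error)
var_p = max((ms_p - ms_pt) / n_t, 0.0)
var_t = max((ms_t - ms_pt) / n_p, 0.0)

# Large var_pt relative to var_p is the "task specificity" problem:
# the generalizability coefficient stays low unless many tasks are used.
for k in (4, 8, 16):
    g = var_p / (var_p + var_pt / k)
    print(f"tasks={k:2d}  E(rho^2)={g:.2f}")
```

With these invented components, the person × task interaction dominates, so the coefficient stays modest until the task sample is fairly large, which is the practical face of the task-specificity problem the abstract describes.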
Peer reviewed
Raymond, Mark R.; Harik, Polina; Clauser, Brian E. – Applied Psychological Measurement, 2011
Prior research indicates that the overall reliability of performance ratings can be improved by using ordinary least squares (OLS) regression to adjust for rater effects. The present investigation extends previous work by evaluating the impact of OLS adjustment on standard errors of measurement ("SEM") at specific score levels. In…
Descriptors: Performance Based Assessment, Licensing Examinations (Professions), Least Squares Statistics, Item Response Theory
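As a companion to the abstract above, here is a minimal sketch of OLS adjustment for rater effects: ratings are regressed on examinee and rater indicator variables, and each rater's estimated stringency is subtracted from the scores that rater assigned. The design (three raters per examinee), the effect sizes, and every variable name are invented for illustration; this is not the authors' procedure.

```python
# Minimal sketch (assumptions, not the published procedure): adjust ratings
# for rater stringency by fitting rating ~ examinee + rater effects with
# ordinary least squares, then removing the estimated rater effects.
import numpy as np

rng = np.random.default_rng(1)
n_examinees, n_raters = 200, 10

# Hypothetical sparse design: each examinee is scored by 3 random raters.
rows = []
ability = rng.normal(0, 1, n_examinees)
stringency = rng.normal(0, 0.5, n_raters)
for e in range(n_examinees):
    for r in rng.choice(n_raters, size=3, replace=False):
        rows.append((e, r, ability[e] - stringency[r] + rng.normal(0, 0.7)))
e_idx, r_idx, y = map(np.array, zip(*rows))

# Design matrix: examinee dummies plus rater dummies (last rater as baseline).
Z = np.zeros((len(y), n_examinees + n_raters - 1))
Z[np.arange(len(y)), e_idx] = 1.0
mask = r_idx < n_raters - 1
Z[np.where(mask)[0], n_examinees + r_idx[mask]] = 1.0

beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
rater_effect = np.append(beta[n_examinees:], 0.0)  # baseline rater = 0

# Adjusted rating: observed rating minus the rater's estimated stringency.
y_adj = y - rater_effect[r_idx]
print("corr(raw mean, ability):     ",
      np.corrcoef([y[e_idx == e].mean() for e in range(n_examinees)], ability)[0, 1])
print("corr(adjusted mean, ability):",
      np.corrcoef([y_adj[e_idx == e].mean() for e in range(n_examinees)], ability)[0, 1])
```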
Peer reviewed
Harik, Polina; Clauser, Brian E.; Grabovsky, Irina; Nungester, Ronald J.; Swanson, Dave; Nandakumar, Ratna – Journal of Educational Measurement, 2009
The present study examined the long-term usefulness of estimated parameters used to adjust the scores from a performance assessment to account for differences in rater stringency. Ratings from four components of the USMLE® Step 2 Clinical Skills Examination were analyzed. A generalizability-theory framework was used to examine the extent to…
Descriptors: Generalizability Theory, Performance Based Assessment, Performance Tests, Clinical Experience
Peer reviewed
Clauser, Brian E. – Applied Psychological Measurement, 2000
Provides a conceptual framework for the development of scoring procedures for performance assessments. The framework considers: (1) aspects of the performance to be scored; (2) criteria to evaluate aspects of the performance; (3) development of scoring criteria; and (4) application of scoring criteria. (SLD)
Descriptors: Criteria, Models, Performance Based Assessment, Scoring
Peer reviewed
Clauser, Brian E.; Kane, Michael T.; Swanson, David B. – Applied Measurement in Education, 2002
Attempts to place the issues associated with computer-automated scoring within the context of current validity theory and presents a taxonomy of automated scoring procedures as a framework for discussing threats to validity that may take on increased importance for specific approaches to automated scoring. (SLD)
Descriptors: Classification, Computer Uses in Education, Performance Based Assessment, Test Construction
Peer reviewed
Clauser, Brian E.; Clyman, Stephen G.; Swanson, David B. – Journal of Educational Measurement, 1999
Two studies focused on aspects of the rating process in performance assessment. The first, which involved 15 raters and about 400 medical students, made the "committee" facet of raters working in groups explicit, and the second, which involved about 200 medical students and four raters, made the "rating-occasion" facet…
Descriptors: Error Patterns, Evaluation Methods, Evaluators, Higher Education
Peer reviewed
Clauser, Brian E.; Harik, Polina; Clyman, Stephen G. – Journal of Educational Measurement, 2000
Used generalizability theory to assess the impact of using independent, randomly equivalent groups of experts to develop scoring algorithms for computer simulation tasks designed to measure physicians' patient management skills. Results with three groups of four medical school faculty members each suggest that the impact of the expert group may be…
Descriptors: Computer Simulation, Generalizability Theory, Performance Based Assessment, Physicians
Peer reviewed
Clauser, Brian E.; Margolis, Melissa J.; Clyman, Stephen G.; Ross, Linette P. – Journal of Educational Measurement, 1997
Research on automated scoring is extended by comparing alternative automated systems for scoring a computer simulation of physicians' patient management skills. A regression-based system is more highly correlated with experts' evaluations than a system that uses complex rules to map performances into score levels, but both approaches are feasible.…
Descriptors: Algorithms, Automation, Comparative Analysis, Computer Assisted Testing
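A toy version of the comparison described above might look like the sketch below: a regression-based scorer whose weights are fit to expert ratings versus a rule-based scorer that maps performances onto discrete score levels. The features, thresholds, and weights are hypothetical, not the published scoring systems.

```python
# Minimal sketch (hypothetical features and rules): contrast a regression-based
# scorer, fit to expert ratings, with a rule-based scorer that maps each
# performance onto one of four discrete score levels.
import numpy as np

rng = np.random.default_rng(2)
n = 300

# Hypothetical performance features: counts of beneficial and risky actions.
beneficial = rng.poisson(6, n).astype(float)
risky = rng.poisson(2, n).astype(float)
expert = 2.0 + 0.8 * beneficial - 1.1 * risky + rng.normal(0, 1.0, n)

train, test = np.arange(200), np.arange(200, n)
X = np.column_stack([np.ones(n), beneficial, risky])

# Regression-based scoring: least-squares weights from the training sample.
w, *_ = np.linalg.lstsq(X[train], expert[train], rcond=None)
reg_scores = X[test] @ w

# Rule-based scoring: hand-set thresholds mapping performances to 4 levels.
def rule_score(b, r):
    if r >= 4:             return 1  # too many risky actions
    if b >= 8 and r <= 1:  return 4
    if b >= 5:             return 3
    return 2

rule_scores = np.array([rule_score(b, r) for b, r in zip(beneficial[test], risky[test])])

print("regression vs expert:", np.corrcoef(reg_scores, expert[test])[0, 1])
print("rules      vs expert:", np.corrcoef(rule_scores, expert[test])[0, 1])
```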
Peer reviewed
Clauser, Brian E.; Harik, Polina; Margolis, Melissa J. – Journal of Educational Measurement, 2006
Although multivariate generalizability theory was developed more than 30 years ago, little published research utilizing this framework exists and most of what does exist examines tests built from tables of specifications. In this context, it is assumed that the universe scores from levels of the fixed multivariate facet will be correlated, but the…
Descriptors: Multivariate Analysis, Job Skills, Correlation, Test Items
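One quantity a multivariate generalizability analysis can yield is the correlation between universe scores on two components of the fixed facet, the correlation the abstract above says is usually assumed. The sketch below estimates it from invented data under the assumption of independent errors across components; it illustrates the framework rather than reproducing the authors' analysis.

```python
# Minimal sketch (invented data): a multivariate generalizability analysis for
# two score components observed on the same persons (p x t per component),
# estimating the correlation between universe scores on the two components.
import numpy as np

rng = np.random.default_rng(4)
n_p, n_t = 60, 6

# Hypothetical correlated true scores for the two components of a fixed facet.
cov = np.array([[1.0, 0.6], [0.6, 1.0]])
true = rng.multivariate_normal([0, 0], cov, size=n_p)     # universe scores
X1 = true[:, [0]] + rng.normal(0, 1.0, (n_p, n_t))        # component 1 ratings
X2 = true[:, [1]] + rng.normal(0, 1.0, (n_p, n_t))        # component 2 ratings

def person_variance(X):
    """EMS estimate of universe-score (person) variance in a p x t design."""
    n_p, n_t = X.shape
    grand = X.mean()
    ms_p = n_t * np.sum((X.mean(axis=1) - grand) ** 2) / (n_p - 1)
    resid = X - X.mean(axis=1, keepdims=True) - X.mean(axis=0, keepdims=True) + grand
    ms_pt = np.sum(resid ** 2) / ((n_p - 1) * (n_t - 1))
    return max((ms_p - ms_pt) / n_t, 0.0)

v1, v2 = person_variance(X1), person_variance(X2)

# With independent errors across components, the covariance of person mean
# scores estimates the universe-score covariance directly.
c12 = np.cov(X1.mean(axis=1), X2.mean(axis=1))[0, 1]
print("estimated universe-score correlation:", c12 / np.sqrt(v1 * v2))
```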
Peer reviewed
Clauser, Brian E.; Ross, Linette P.; Clyman, Stephen G.; Rose, Kathie M.; Margolis, Melissa J.; Nungester, Ronald J.; Piemme, Thomas E.; Chang, Lucy; El-Bayoumi, Gigi; Malakoff, Gary L.; Pincetl, Pierre S. – Applied Measurement in Education, 1997
Describes an automated scoring algorithm for a computer-based simulation examination of physicians' patient-management skills. Results with 280 medical students show that scores produced using this algorithm are highly correlated with actual clinician ratings. Scores were also effective in discriminating between case performance judged passing or…
Descriptors: Algorithms, Computer Assisted Testing, Computer Simulation, Evaluators
Peer reviewed
Clauser, Brian E.; And Others – Journal of Educational Measurement, 1995
A scoring algorithm for performance assessments is described that is based on expert judgments but requires the rating of only a sample of performances. A regression-based policy capturing procedure was implemented for clinicians evaluating skills of 280 medical students. Results demonstrate the usefulness of the algorithm. (SLD)
Descriptors: Algorithms, Clinical Diagnosis, Computer Simulation, Educational Assessment
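The policy-capturing idea described above (experts rate only a sample of performances, a regression captures their weighting policy, and the fitted weights score the rest) can be sketched as follows. The features, the latent policy, and the rated-sample size are invented; only the 280-performance total echoes the abstract.

```python
# Minimal sketch (invented data): regression-based policy capturing in which
# experts rate only a sample of performances; the fitted weights then score
# the remaining, unrated performances automatically.
import numpy as np

rng = np.random.default_rng(3)
n_total, n_rated = 280, 60   # experts rate 60 of 280 performances

features = rng.normal(size=(n_total, 3))          # hypothetical process measures
policy = np.array([1.2, -0.6, 0.4])               # latent expert weighting policy
ratings = features @ policy + rng.normal(0, 0.5, n_total)

rated = rng.choice(n_total, size=n_rated, replace=False)
unrated = np.setdiff1d(np.arange(n_total), rated)

# Capture the experts' policy from the rated sample alone.
A = np.column_stack([np.ones(n_rated), features[rated]])
w, *_ = np.linalg.lstsq(A, ratings[rated], rcond=None)

# Apply the captured policy to performances the experts never rated.
pred = np.column_stack([np.ones(len(unrated)), features[unrated]]) @ w
print("captured weights:", np.round(w[1:], 2))
print("corr with held-out expert ratings:", np.corrcoef(pred, ratings[unrated])[0, 1])
```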