Harik, Polina; Clauser, Brian E.; Grabovsky, Irina; Baldwin, Peter; Margolis, Melissa J.; Bucak, Deniz; Jodoin, Michael; Walsh, William; Haist, Steven – Journal of Educational Measurement, 2018
Test administrators are appropriately concerned about the potential for time constraints to impact the validity of score interpretations; psychometric efforts to evaluate the impact of speededness date back more than half a century. The widespread move to computerized test delivery has led to the development of new approaches to evaluating how…
Descriptors: Comparative Analysis, Observation, Medical Education, Licensing Examinations (Professions)
Margolis, Melissa J.; Clauser, Brian E. – Educational Measurement: Issues and Practice, 2014
This research evaluated the impact of a common modification to Angoff standard-setting exercises: the provision of examinee performance data. Data from 18 independent standard-setting panels across three different medical licensing examinations were examined to investigate whether and how the provision of performance information impacted judgments…
Descriptors: Cutting Scores, Standard Setting (Scoring), Data, Licensing Examinations (Professions)
Clauser, Jerome C.; Clauser, Brian E.; Hambleton, Ronald K. – Applied Measurement in Education, 2014
The purpose of the present study was to extend past work with the Angoff method for setting standards by examining judgments at the judge level rather than the panel level. The focus was on investigating the relationship between observed Angoff standard setting judgments and empirical conditional probabilities. This relationship has been used as a…
Descriptors: Standard Setting (Scoring), Validity, Reliability, Correlation
Clauser, Brian E.; Mee, Janet; Margolis, Melissa J. – International Journal of Testing, 2013
This study investigated the extent to which the performance data format impacted data use in Angoff standard setting exercises. Judges from two standard settings (a total of five panels) were randomly assigned to one of two groups. The full-data group received two types of data: (1) the proportion of examinees selecting each option and (2) plots…
Descriptors: Standard Setting (Scoring), Cutting Scores, Validity, Reliability
Raymond, Mark R.; Harik, Polina; Clauser, Brian E. – Applied Psychological Measurement, 2011
Prior research indicates that the overall reliability of performance ratings can be improved by using ordinary least squares (OLS) regression to adjust for rater effects. The present investigation extends previous work by evaluating the impact of OLS adjustment on standard errors of measurement ("SEM") at specific score levels. In…
Descriptors: Performance Based Assessment, Licensing Examinations (Professions), Least Squares Statistics, Item Response Theory
Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009
Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…
Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel
Clauser, Brian E.; Harik, Polina; Margolis, Melissa J. – Journal of Educational Measurement, 2006
Although multivariate generalizability theory was developed more than 30 years ago, little published research utilizing this framework exists and most of what does exist examines tests built from tables of specifications. In this context, it is assumed that the universe scores from levels of the fixed multivariate facet will be correlated, but the…
Descriptors: Multivariate Analysis, Job Skills, Correlation, Test Items