Feinberg, Richard A.; Raymond, Mark R.; Haist, Steven A. – Educational Measurement: Issues and Practice, 2015
To mitigate security concerns and unfair score gains, credentialing programs routinely administer new test material to examinees retesting after an initial failing attempt. Counterintuitively, a small but growing body of recent research suggests that repeating the identical form does not create an unfair advantage. This study builds upon and…
Descriptors: Licensing Examinations (Professions), Repetition, Testing, Responses
Raymond, Mark R.; Swygert, Kimberly A.; Kahraman, Nilufer – Journal of Educational Measurement, 2012
Although a few studies report sizable score gains for examinees who repeat performance-based assessments, research has not yet addressed the reliability and validity of inferences based on ratings of repeat examinees on such tests. This study analyzed scores for 8,457 single-take examinees and 4,030 repeat examinees who completed a 6-hour clinical…
Descriptors: Physicians, Licensing Examinations (Professions), Performance Based Assessment, Repetition
Stoffel, Heather; Raymond, Mark R.; Bucak, S. Deniz; Haist, Steven A. – Practical Assessment, Research & Evaluation, 2014
Previous research on the impact of text and formatting changes on test-item performance has produced mixed results. This matter is important because it is generally acknowledged that "any" change to an item requires that it be recalibrated. The present study investigated the effects of seven classes of stylistic changes on item…
Descriptors: Test Construction, Test Items, Standardized Tests, Physicians
Raymond, Mark R.; Swygert, Kimberly A.; Kahraman, Nilufer – Advances in Health Sciences Education, 2012
Examinees who initially fail and later repeat an SP-based clinical skills exam typically exhibit large score gains on their second attempt, suggesting the possibility that examinees were not well measured on one of those attempts. This study evaluates score precision for examinees who repeated an SP-based clinical skills test administered as part…
Descriptors: Evidence, Generalizability Theory, Error of Measurement, Clinical Experience
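As a rough illustration of the generalizability-theory machinery referred to in the abstract above, the sketch below estimates variance components for a hypothetical persons-by-stations design and computes a relative SEM and generalizability coefficient. The data and scale are invented for illustration; this is not the study's analysis.

# Illustrative persons-by-stations G-study (invented data; not the study's
# analysis). Variance components are estimated from ANOVA mean squares for a
# crossed p x i design, then used to compute a relative SEM and G coefficient.
import numpy as np

scores = np.array([            # rows = examinees, columns = stations
    [6.0, 7.0, 5.5, 6.5],
    [8.0, 7.5, 8.5, 8.0],
    [5.0, 6.0, 5.5, 4.5],
    [7.0, 7.0, 6.5, 7.5],
])
n_p, n_i = scores.shape
grand = scores.mean()

ms_p = n_i * np.sum((scores.mean(axis=1) - grand) ** 2) / (n_p - 1)
resid = scores - scores.mean(axis=1, keepdims=True) - scores.mean(axis=0) + grand
ms_res = np.sum(resid ** 2) / ((n_p - 1) * (n_i - 1))

var_res = ms_res                       # sigma^2(pi,e): interaction + error
var_p = (ms_p - ms_res) / n_i          # sigma^2(p): true person variance
rel_sem = np.sqrt(var_res / n_i)       # relative SEM for an n_i-station score
g_coef = var_p / (var_p + var_res / n_i)
print(round(rel_sem, 3), round(g_coef, 3))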
Raymond, Mark R.; Harik, Polina; Clauser, Brian E. – Applied Psychological Measurement, 2011
Prior research indicates that the overall reliability of performance ratings can be improved by using ordinary least squares (OLS) regression to adjust for rater effects. The present investigation extends previous work by evaluating the impact of OLS adjustment on standard errors of measurement ("SEM") at specific score levels. In…
Descriptors: Performance Based Assessment, Licensing Examinations (Professions), Least Squares Statistics, Item Response Theory
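The abstract above centers on OLS adjustment for rater effects. The following minimal sketch shows the basic idea under simple assumptions (dummy-coded examinee and rater effects, a handful of invented ratings); it is not the authors' implementation or data.

# Minimal sketch of OLS adjustment for rater effects (illustrative only).
# Each tuple is (examinee_id, rater_id, score); the model
#   score = examinee effect + rater effect + error
# is fit by ordinary least squares, and the estimated rater effect
# (severity or leniency) is subtracted from each observed score.
import numpy as np

ratings = [  # hypothetical ratings on a 1-9 scale
    (0, 0, 7.0), (0, 1, 5.5),
    (1, 0, 8.0), (1, 1, 6.0),
    (2, 1, 4.5), (2, 2, 6.5),
    (3, 0, 6.5), (3, 2, 7.5),
]
n_examinees = 1 + max(ex for ex, _, _ in ratings)
n_raters = 1 + max(ra for _, ra, _ in ratings)

# Dummy-coded design matrix: intercept, examinee dummies, rater dummies
# (the first examinee and first rater are the reference categories).
X = np.zeros((len(ratings), 1 + (n_examinees - 1) + (n_raters - 1)))
y = np.zeros(len(ratings))
for row, (ex, ra, score) in enumerate(ratings):
    X[row, 0] = 1.0
    if ex > 0:
        X[row, ex] = 1.0
    if ra > 0:
        X[row, n_examinees - 1 + ra] = 1.0
    y[row] = score

beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Rater effects relative to the reference rater, centered so the adjustment
# reflects severity/leniency relative to the average rater.
rater_fx = np.concatenate(([0.0], beta[n_examinees:]))
rater_fx -= rater_fx.mean()

adjusted = [(ex, ra, score - rater_fx[ra]) for ex, ra, score in ratings]
print(adjusted)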

Raymond, Mark R. – Educational Measurement: Issues and Practice, 2002
Offers recommendations for the conduct of practice analysis (i.e., job analysis) concerning these issues: (1) selecting a method of practice analysis; (2) developing rating scales; (3) determining the content of test plans; (4) using multivariate procedures for structuring test plans; and (5) determining topic weights for test plans.
Descriptors: Certification, Credentials, Evaluation Methods, Job Analysis

Raymond, Mark R. – Applied Measurement in Education, 2001
Reviews general approaches to job analysis and considers methodological issues related to sampling and the development of rating scales used to measure and describe a profession or occupation. Evaluates the usefulness of different types of test plans and describes judgmental and empirical methods for using practice analysis data to help develop…
Descriptors: Certification, Job Analysis, Licensing Examinations (Professions), Rating Scales
Raymond, Mark R. – Educational Measurement: Issues and Practice, 2005
The purpose of a credentialing examination is to assure the public that individuals who work in an occupation or profession have met certain standards. To be consistent with this purpose, credentialing examinations must be job related, and this requirement is typically met by developing test plans based on an empirical job or practice analysis.…
Descriptors: Questionnaires, Guidelines, Task Analysis, Licensing Examinations (Professions)
Raymond, Mark R.; Viswesvaran, Chockalingam – 1991
This study illustrates the use of three least-squares models to control for rater effects in performance evaluation: (1) ordinary least squares (OLS); (2) weighted least squares (WLS); and (3) OLS subsequent to applying a logistic transformation to observed ratings (LOG-OLS). The three models were applied to ratings obtained from four…
Descriptors: Evaluators, Higher Education, Interrater Reliability, Least Squares Statistics
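To make the logistic-transformation variant (LOG-OLS) and the weighted variant (WLS) described above concrete, here is a minimal sketch assuming a bounded 1-9 rating scale. The helper names, the eps shrinkage, and the weighting scheme are illustrative assumptions, not the study's procedure: ratings are mapped to logits, the rater-effect model is fit on that scale (or with observation weights), and adjusted values are mapped back.

# Minimal sketch of LOG-OLS and WLS building blocks (illustrative only).
import numpy as np

def to_logit(score, lo=1.0, hi=9.0, eps=0.01):
    """Map a bounded rating to the logit scale, shrinking away from the ends."""
    p = (score - lo + eps) / (hi - lo + 2 * eps)
    return np.log(p / (1 - p))

def from_logit(z, lo=1.0, hi=9.0, eps=0.01):
    """Inverse of to_logit: map a logit back to the original rating scale."""
    p = 1.0 / (1.0 + np.exp(-z))
    return p * (hi - lo + 2 * eps) + lo - eps

def wls(X, y, w):
    """Weighted least squares via OLS on rows scaled by sqrt(weight)."""
    sw = np.sqrt(np.asarray(w, dtype=float))
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Round-trip check for the logistic transform:
z = to_logit(np.array([2.0, 5.0, 8.5]))
print(from_logit(z))   # recovers [2.0, 5.0, 8.5]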
Fabrey, Lawrence J.; Raymond, Mark R. – 1987
The American Nurses' Association certification provides professional recognition beyond licensure to nurses who pass an examination. To determine the passing score as it would be set by a representative peer group, a survey was mailed to a random sample of 200 recently certified nurses. Three questions were asked: (1) what percentage of examinees…
Descriptors: Adults, Certification, Cutting Scores, Judgment Analysis Technique
Raymond, Mark R. – 1995
This paper reviews and evaluates methods for conducting job analysis. The paper begins with a discussion of the purpose of licensure and certification and the dimensions of job analysis. The following four dimensions along which job analysis methods vary are discussed: types of job descriptors; sources of information; data collection methods; and…
Descriptors: Certification, Credentials, Data Analysis, Data Collection

Raymond, Mark R.; Viswesvaran, Chockalingam – Journal of Educational Measurement, 1993
Three variations of a least squares regression model are presented that are suitable for determining and correcting for rating error in designs in which examinees are evaluated by a subset of possible raters. Models are applied to ratings from 4 administrations of a medical certification examination in which 40 raters and approximately 115…
Descriptors: Error of Measurement, Evaluation Methods, Higher Education, Interrater Reliability