ERIC - Search Results

Showing all 12 results

The Effect of Student Examiner Errors on WAIS-IV and WISC-V Composite Scores

Atehortua, Laura – ProQuest LLC, 2022

Intelligence tests are used in a variety of settings such as schools, clinics, and courts to assess the intellectual capacity of individuals of all ages. Intelligence tests are used to make high-stakes decisions such as special education placement, employment, eligibility for social security services, and determination of the death penalty.…

Descriptors: Adults, Intelligence Tests, Children, Error of Measurement

Resolving and Re-Scoring Constructed Response Items in Mixed-Format Assessments: An Exploration of Three Approaches

Peer reviewed

Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024

We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…

Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners

Re-Conceptualising and Accounting for Examiner (Cut-Score) Stringency in a 'High Frequency, Small Cohort' Performance Test

Peer reviewed

Homer, Matt – Advances in Health Sciences Education, 2021

Variation in examiner stringency is an ongoing problem in many performance settings such as in OSCEs, and usually is conceptualised and measured based on scores/grades examiners award. Under borderline regression, the standard within a station is set using checklist/domain scores and global grades acting in combination. This complexity requires a…

Descriptors: Examiners, Experimenter Characteristics, Cutting Scores, Performance Based Assessment

On Rater Agreement and Rater Training

Peer reviewed

Wang, Binhong – English Language Teaching, 2010

This paper first analyzed two studies on rater factors and rating criteria to raise the problem of rater agreement. After that the author reveals the causes of discrepancies in rating administration by discussing rater variability and rater bias. The author argues that rater bias cannot be eliminated completely; we can only reduce the error to a…

Descriptors: Interrater Reliability, Examiners, Training, Bias

Undesired Variance Due to Examiner Stringency/Leniency Effect in Communication Skill Scores Assessed in OSCEs

Peer reviewed

Harasym, Peter H.; Woloschuk, Wayne; Cunning, Leslie – Advances in Health Sciences Education, 2008

Physician-patient communication is a clinical skill that can be learned and has a positive impact on patient satisfaction and health outcomes. A concerted effort at all medical schools is now directed at teaching and evaluating this core skill. Student communication skills are often assessed by an Objective Structured Clinical Examination (OSCE).…

Descriptors: Medical Schools, Family Practice (Medicine), Examiners, Error of Measurement

Practitioners' Administration and Scoring of the WISC-R: Evidence That We Do Err.

Peer reviewed

Slate, John R.; And Others – Journal of School Psychology, 1992

Analyzed 56 Wechsler Intelligence Scale for Children-Revised protocols completed by 1 certified and 8 licensed practitioners to examine administration and scoring mistakes. Observed numerous mistakes (failure to record examinee responses, assigning too few or too many points to answers, inappropriate questioning, and failure to obtain correct…

Descriptors: Error of Measurement, Error Patterns, Examiners, Intelligence Tests

Common WISC-III Examiner Errors: Evidence from Graduate Students in Training.

Peer reviewed

Alfonso, Vincent C.; Johnson, Annemarie; Patinella, Lilia; Rader, Damon E. – Psychology in the Schools, 1998

Examined 60 Wechsler Intelligence Scale for Children-Third Edition (WISC-III) protocols administered by graduate students in training to obtain preliminary data on the frequency and types of administration and scoring errors that examiners commit. The five most frequent errors included failure to query, failure to record response verbatim,…

Descriptors: Counselor Training, Error of Measurement, Examiners, Females

Examiner Errors on the WRAT-R.

Peer reviewed

Peterson, Daniel; And Others – Psychology in the Schools, 1991

Analyzed for examiner errors 55 Wide Range Achievement Test-Revised (WRAT-R) protocols completed by 9 practitioners for a metropolitan school district. All practitioners made errors, which occurred on 95 percent of protocols and averaged 3.0 errors per protocol. Most frequent errors included failures to obtain correct ceiling or basal, and failures…

Descriptors: Achievement Tests, Educational Diagnosis, Elementary Secondary Education, Error of Measurement

Five Common Misuses of Tests. ERIC Digest No. 108.

Gardner, Eric – 1989

Five of the common misuses of tests are reviewed: (1) acceptance of the test title as an accurate and complete description of the variable being measured (failure to examine the manual and the items carefully to know the specific aspects to be tested can result in misuse through selection of an inappropriate test for a particular purpose or…

Descriptors: Error of Measurement, Evaluation Problems, Examiners, Scoring

WISC-R Examiner Errors: Cause for Concern.

Peer reviewed

Slate, John R.; Chick, David – Psychology in the Schools, 1989

Clinical psychology graduate students (N=14) administered Wechsler Intelligence Scale for Children-Revised. Found numerous scoring and mechanical errors that influenced full-scale intelligence quotient scores on two-thirds of protocols. Particularly prone to error were Verbal subtests of Vocabulary, Comprehension, and Similarities. Noted specific…

Descriptors: Clinical Psychology, Error of Measurement, Examiners, Graduate Students

The Effects of Experience and Structured Feedback on WISC-R Error Rates Made by Student-Examiners.

Peer reviewed

Conner, Robert; Woodall, Fred E. – Psychology in the Schools, 1983

Studied the effects of experience in administration and scoring of the Wechsler Intelligence Scale for Children (Revised) on types of examiner errors. Results showed total score and administrative error rates dropped significantly with experience and feedback, but response scoring errors, mathematical errors, and IQ errors were not reduced…

Descriptors: Error of Measurement, Examiners, Experience, Experimenter Characteristics

Interrater Reliability Reconsidered: Performance Assessment Using One Examiner per Candidate.

Peer reviewed

Gross, Leon J. – Evaluation and the Health Professions, 1994

Whether adequate levels of interrater reliability could be obtained on a national, standardized examination using one examiner per observation was studied with 101 paired candidate observations on an examination for optometry. Results indicate that psychometrically sound judgments can be obtained with one examiner. (SLD)

Descriptors: Educational Assessment, Error of Measurement, Evaluation Methods, Evaluators
