ERIC - Search Results

Showing all 7 results

Validation of Sub-Constructs in Reading Comprehension Tests Using Teachers' Classification of Cognitive Targets

Peer reviewed

Tengberg, Michael – Language Assessment Quarterly, 2018

Reading comprehension is often treated as a multidimensional construct. In many reading tests, items are distributed over reading process categories to represent the subskills expected to constitute comprehension. This study explores (a) the extent to which specified subskills of reading comprehension tests are conceptually conceivable to…

Descriptors: Reading Tests, Reading Comprehension, Scores, Test Results

Measuring the Impact of Judge Severity on Examination Scores.

Peer reviewed

Lunz, Mary E.; And Others – Applied Measurement in Education, 1990

An extension of the Rasch model is used to obtain objective measurements for examinations graded by judges. The model calibrates elements of each facet of the examination on a common log-linear scale. Real examination data illustrate the way correcting for judge severity improves fairness of examinee measures. (SLD)

Descriptors: Certification, Difficulty Level, Interrater Reliability, Judges

Essay Reliability: Form and Meaning.

Shale, Doug – 1986

This study is an attempt at a cohesive characterization of the concept of essay reliability. As such, it takes as a basic premise that previous and current practices in reporting reliability estimates for essay tests have certain shortcomings. The study provides an analysis of these shortcomings--partly to encourage a fuller understanding of the…

Descriptors: Analysis of Variance, Correlation, Error of Measurement, Essay Tests

The Criterion-Related Validity of a Computer-Based Approach for Scoring Concept Maps

Peer reviewed

Clariana, Roy B.; Koul, Ravinder; Salehi, Roya – International Journal of Instructional Media, 2006

This investigation seeks to confirm a computer-based approach that can be used to score concept maps (Poindexter & Clariana, 2004) and then describes the concurrent criterion-related validity of these scores. Participants enrolled in two graduate courses (n=24) were asked to read about and research online the structure and function of the heart…

Descriptors: Semantics, Human Body, Test Validity, Anatomy

Reducing Errors Due to the Use of Judges. ERIC/TM Digest.

Rudner, Lawrence M. – 1992

Several common sources of error in assessment that depends on the use of judges are identified, and ways to reduce the impact of rating errors are examined. Numerous threats to the validity of scores based on ratings exist. These threats include: (1) the halo effect; (2) stereotyping; (3) perception differences; (4) leniency/stringency error; and…

Descriptors: Alternative Assessment, Error of Measurement, Evaluation Methods, Evaluators

Quality Control in the Development and Use of Performance Assessments.

Peer reviewed

Dunbar, Stephen B.; And Others – Applied Measurement in Education, 1991

Issues pertaining to the quality of performance assessments, including reliability and validity, are discussed. The relatively limited generalizability of performance across tasks is indicative of the care needed to evaluate performance assessments. Quality control is an empirical matter when measurement is intended to inform public policy. (SLD)

Descriptors: Educational Assessment, Generalization, Interrater Reliability, Measurement Techniques

Placing Texts, Placing Writers: Sources of Readers' Judgments in University Placement-Testing.

Sullivan, Francis J. – 1986

A study examined how pragmatic form influences evaluation of student essays in university placement testing. Specifically, the study documented how patterns in students' use of information (assumed to be either old, inferable, or new for readers) affected the holistic scores for quality given to the essays. Subjects, 99 randomly selected entering…

Descriptors: College Freshmen, Essay Tests, Evaluation Criteria, Evaluation Methods