ERIC - Search Results

Showing 1 to 15 of 42 results

Same Grade for Different Reasons, Different Grades for the Same Reason?

Peer reviewed

Ilona Rinne – Assessment & Evaluation in Higher Education, 2024

It is widely acknowledged in research that common criteria and aligned standards do not result in consistent assessment of such a complex performance as the final undergraduate thesis. Assessment is determined by examiners' understanding of rubrics and their views on thesis quality. There is still a gap in the research literature about how…

Descriptors: Foreign Countries, Undergraduate Students, Teacher Education Programs, Evaluation Criteria

Examining AI-Based Accuracy Assessment in L2 Learners' Writing

Peer reviewed

On-Soon Lee – Journal of Pan-Pacific Association of Applied Linguistics, 2024

Despite the increasing interest in using AI tools as assistant agents in instructional settings, the effectiveness of ChatGPT, the generative pretrained AI, for evaluating the accuracy of second language (L2) writing has been largely unexplored in formative assessment. Therefore, the current study aims to examine how ChatGPT, as an evaluator,…

Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning

The Reality of Assessing "Authentic" Electronic Portfolios: Can Electronic Portfolios Serve as a Form of Standardized Assessment to Measure Literacy and Self-Regulated Learning at the Elementary Level?

Peer reviewed

Bures, Eva Mary; Barclay, Alexandra; Abrami, Philip C.; Meyer, Elizabeth J. – Canadian Journal of Learning and Technology, 2013

This study explores electronic portfolios and their potential to assess student literacy and self-regulated learning in elementary-aged children. Assessment tools were developed and include a holistic rubric that assigns a mark from 1 to 5 to self-regulated learning (SRL) and a mark to literacy, and an analytical rubric measuring multiple…

Descriptors: Portfolio Assessment, Electronic Publishing, Elementary School Students, Literacy

Grading between the Lines: What Really Impacts Professors' Holistic Evaluation of ESL Graduate Student Writing?

Peer reviewed

Huang, Jinyan; Foote, Chandra J. – Language Assessment Quarterly, 2010

This study examines score variations and differences in the reliability of ratings between English-as-a-second-language (ESL) and native English (NE) authored papers in a graduate course. Generalizability (G-) theory was used as a framework for analysis because it is powerful in detecting rater variability and the relative contributions of…

Descriptors: Graduate Students, Holistic Evaluation, North Americans, English (Second Language)

Effects of Rating Task Instructions on Consistency and Accuracy of Expert Raters.

Littlefield, John H.; Troendle, G. Roger – 1987

The effect of different types of rating task instructions on rater behavior was examined using experts, as opposed to novices, as raters. The experts were instructed to (1) form a global categorical judgment (early hypothesis generation); (2) assess 19 detailed elements; or (3) both. Subjects were 8 dental faculty members who ranged in age from 28…

Descriptors: Dentistry, Evaluation Methods, Higher Education, Holistic Evaluation

Holistic Scoring for Measuring and Promoting Improvement in Writing Skills.

Peer reviewed

Dyer, Jack L.; And Others – Journal of Education for Business, 1994

Holistic scoring enables the evaluation of writing skills based on general impressions of content and style. An experiment in an accounting class shows how it can be applied successfully with a high degree of reliability. (SK)

Descriptors: Accounting, Higher Education, Holistic Evaluation, Interrater Reliability

Relationship of Analytic and Holistic Methods to Raters' Scores for Speeches.

Peer reviewed

Goulden, Nancy Rost – Journal of Research and Development in Education, 1994

Study examined the impact of method on speech ratings of classroom teachers by concurrently testing reliability and validity for the analytic and holistic methods, holding the variables of raters and speeches constant. Student speeches were videotaped and scored. Both methods produced similar and acceptable levels of reliability and concurrent…

Descriptors: College Students, Evaluation Methods, Higher Education, Holistic Evaluation

Improving Interrater Reliability.

Atkinson, Dianne; Murray, Mary – 1987

Noting that improvement in rater reliability means eliminating differences among raters, this paper discusses ways to assess writing evaluator reliability and methods for achieving higher levels of interrater reliability. After showing that reliability can be improved two ways--by increasing the number of raters or measurements made, and by…

Descriptors: Evaluation Methods, Holistic Evaluation, Interrater Reliability, Measurement Techniques

The Effect of Several Variables on Judgmentally-Obtained Cut Scores.

Harker, Jill K.; Cope, Ronald T. – 1988

Cut scores obtained for licensure tests using different judgmental methods of standard setting (holistic, test blueprint, Angoff, and modified Angoff) were compared. Nineteen educators and practitioners participated in this study as judges. Pre- and post-test feedback (feedback of total- and low-group item p-value) ratings were obtained under the…

Descriptors: Cutting Scores, Feedback, Holistic Evaluation, Interrater Reliability

The Selection and Use of Sample Papers in Holistic Evaluation.

Daiker, Donald A.; Grogan, Nedra – 1985

The role of sample papers (i.e., anchor papers, prototypes, range-finders) in holistic evaluation of writing is discussed. When, where, and how many sample papers are to be selected, and who should perform the selection are covered. The process of sample selection should proceed as follows: (1) a general reading of papers by committee members to…

Descriptors: Advanced Placement, Essay Tests, Evaluators, Higher Education

Measuring the Organizational Aspects of Writing Ability.

Peer reviewed

Benton, Stephen L.; Kiewra, Kenneth A. – Journal of Educational Measurement, 1986

This paper assessed the relationships among holistic writing ability, the Test of Standard Written English, and four tests of organizational ability. Findings showed a significant correlation between writing ability and the tests. It was concluded that tests assessing organizational strategies ought to be included in assessments of writing…

Descriptors: Correlation, Essay Tests, Higher Education, Holistic Evaluation

Reliability, Validity, and Holistic Scoring: What We Know and What We Need to Know.

Peer reviewed

Huot, Brian – College Composition and Communication, 1990

Describes holistic scoring as one of the biggest breakthroughs in writing assessment. Suggests that the technique's high interrater reliability coefficients partly explain holistic scoring's popularity. Argues that validity has been largely neglected. Concludes that more must be learned about the uses and effects of holistic scoring. (SG)

Descriptors: Educational Testing, Higher Education, Holistic Approach, Holistic Evaluation

The Potential Dual Effect of Context Effects and Score Level Effects on the Assignment of Scores to Essays.

Paden, Patricia A. – 1986

Two factors which may affect the ratings assigned to an essay test are investigated: (1) context effects; and (2) score level effects. Context effects exist in essay scoring if an essay is rated higher when preceded by poor quality essays than when preceded by high quality essays. A score level effect is defined as a change in the score (value)…

Descriptors: Context Effect, Essay Tests, Holistic Evaluation, Interrater Reliability

The Measurement of Writing Ability with a Many-Faceted Rasch Model.

Engelhard, George, Jr. – 1991

A many-faceted Rasch model (FACETS) is presented for the measurement of writing ability. The FACETS model is a multivariate extension of Rasch measurement models that can be used to provide a framework for calibrating both raters and writing tasks within the context of writing assessment. A FACETS model is described based on the current procedures…

Descriptors: Grade 8, Holistic Evaluation, Interrater Reliability, Item Response Theory

More than a Decade's Highlight? The Holistic Scoring Consensus and the Need for Change.

Gregory, Kemp – 1991

A balanced appraisal of holistic scoring of writing is presented via: examination of the present popularity of holistic scoring; analysis of several weaknesses associated with the holistic scoring method; and recommendations for remedying these weaknesses. Six reasons for the popularity of holistic scoring are: (1) relative lack of expense; (2)…

Descriptors: Child Development, Cost Effectiveness, Elementary Secondary Education, Holistic Evaluation
