Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 2
Since 2006 (last 20 years): 5
Descriptor
Writing Evaluation: 9
Item Response Theory: 6
Interrater Reliability: 4
Writing Tests: 4
Grade 8: 3
Junior High Schools: 3
Measurement Techniques: 3
Scores: 3
Writing (Composition): 3
Accuracy: 2
Educational Assessment: 2
Source
Applied Measurement in Education: 1
Assessing Writing: 1
College Board: 1
Educational and Psychological Measurement: 1
International Journal of Testing: 1
Journal of Educational Measurement: 1
Author
Engelhard, George, Jr.: 9
Wind, Stefanie A.: 2
Wolfe, Edward W.: 2
Behizadeh, Nadia: 1
Chajewski, Michael: 1
Cohen, Allan S.: 1
Foltz, Peter: 1
Kobrin, Jennifer L.: 1
Lu, Zhenqiu: 1
Raczynski, Kevin R.: 1
Rosenstein, Mark: 1
Publication Type
Reports - Research: 6
Journal Articles: 5
Speeches/Meeting Papers: 4
Reports - Evaluative: 3
Tests/Questionnaires: 1
Education Level
Higher Education: 1
Postsecondary Education: 1
Assessments and Surveys
SAT (College Admission Test): 1
Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…
Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring
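The excerpt names the general recipe (machine learning from human ratings) but not any specific engine, feature set, or learner. As a minimal sketch under those assumptions, the Python fragment below fits a ridge regression from bag-of-words features to human holistic scores; every name and data value in it is hypothetical.

    # Illustrative AESE "training" step: learn a mapping from essay text
    # to human ratings. The features, learner, and data are assumptions,
    # not the engines studied in the article.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import Ridge
    from sklearn.pipeline import make_pipeline

    essays = [  # hypothetical training essays with human scores (1-6 scale)
        "The author argues that testing can improve writing instruction.",
        "i think school is good because it is good.",
    ]
    human_scores = [5.0, 2.0]

    engine = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
    engine.fit(essays, human_scores)

    # Score an unseen essay; an operational engine would also be checked
    # for agreement with held-out human ratings before deployment.
    print(engine.predict(["Standardized tests measure only part of writing."]))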
Wang, Jue; Engelhard, George, Jr.; Wolfe, Edward W. – Educational and Psychological Measurement, 2016
The number of performance assessments continues to increase around the world, and it is important to explore new methods for evaluating the quality of ratings obtained from raters. This study describes an unfolding model for examining rater accuracy. Accuracy is defined as the difference between observed and expert ratings. Dichotomous accuracy…
Descriptors: Evaluators, Accuracy, Performance Based Assessment, Models
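The excerpt defines accuracy as the difference between observed and expert ratings and refers to a model for dichotomous accuracy, but does not reproduce its equation. As a purely illustrative stand-in, one standard unfolding model for dichotomous data is Andrich and Luo's hyperbolic cosine model:

\[
P(X_{ni} = 1 \mid \beta_n, \delta_i, \lambda_i) = \frac{\exp(\lambda_i)}{\exp(\lambda_i) + 2\cosh(\beta_n - \delta_i)},
\]

where \(\beta_n\) and \(\delta_i\) are the locations of rater n and essay i on the latent continuum and \(\lambda_i\) is a unit parameter. The response function is single-peaked: the probability of an accurate rating is highest when the two locations coincide and falls off in either direction, which is the defining unfolding property.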
Raczynski, Kevin R.; Cohen, Allan S.; Engelhard, George, Jr.; Lu, Zhenqiu – Journal of Educational Measurement, 2015
There is a large body of research on the effectiveness of rater training methods in the industrial and organizational psychology literature. Less has been reported in the measurement literature on large-scale writing assessments. This study compared the effectiveness of two widely used rater training methods--self-paced and collaborative…
Descriptors: Interrater Reliability, Writing Evaluation, Training Methods, Pacing
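One concrete way to compare two training conditions is to measure how closely each trained rater agrees with expert scores. The excerpt does not name the study's statistic; the sketch below assumes quadratically weighted kappa, a common choice for ordinal essay ratings, and all ratings in it are hypothetical.

    # Compare rater-training conditions by agreement with expert scores.
    # Quadratic-weighted kappa and the ratings below are assumptions.
    from sklearn.metrics import cohen_kappa_score

    expert        = [4, 3, 5, 2, 4, 3, 1, 5]  # hypothetical expert ratings
    self_paced    = [4, 3, 4, 2, 5, 3, 2, 5]  # rater after self-paced training
    collaborative = [4, 2, 5, 2, 4, 4, 1, 5]  # rater after collaborative training

    for label, ratings in [("self-paced", self_paced),
                           ("collaborative", collaborative)]:
        kappa = cohen_kappa_score(expert, ratings, weights="quadratic")
        print(f"{label}: weighted kappa = {kappa:.3f}")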
Engelhard, George, Jr.; Wind, Stefanie A.; Kobrin, Jennifer L.; Chajewski, Michael – College Board, 2013
The purpose of this study is to illustrate the use of explanatory models based on Rasch measurement theory to detect systematic relationships between student and item characteristics and achievement differences using differential item functioning (DIF), differential group functioning (DGF), and differential person functioning (DPF) techniques. The…
Descriptors: Test Bias, Evaluation Methods, Measurement Techniques, Writing Evaluation
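The report's explanatory Rasch models are not reproduced in the excerpt, but the basic DIF contrast they build on can be sketched: estimate item i's difficulty separately in a focal and a reference group and test the standardized difference,

\[
z_i = \frac{\hat{\delta}_{i,\mathrm{focal}} - \hat{\delta}_{i,\mathrm{reference}}}{\sqrt{SE_{i,\mathrm{focal}}^{2} + SE_{i,\mathrm{reference}}^{2}}},
\]

with large \(|z_i|\) flagging the item for DIF. The DGF and DPF techniques named in the abstract apply analogous contrasts at other levels of analysis.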
Behizadeh, Nadia; Engelhard, George, Jr. – Assessing Writing, 2011
The purpose of this study is to examine the interactions among measurement theories, writing theories, and writing assessments in the United States from an historical perspective. The assessment of writing provides a useful framework for examining how theories influence, and in some cases fail to influence, actual practice. Two research traditions…
Descriptors: Writing (Composition), Intellectual Disciplines, Writing Evaluation, Writing Tests
Engelhard, George, Jr.; And Others – 1994
A set of procedures is described for constructing an assessment network composed of a connected system of rater and writing task banks within the context of large-scale assessments of written composition. The calibration of the assessment tasks and the measurement of individuals are viewed as separate, although complementary, activities. The…
Descriptors: Data Collection, Educational Assessment, Interrater Reliability, Item Banks
Engelhard, George, Jr. – 1991
A many-faceted Rasch model (FACETS) is presented for the measurement of writing ability. The FACETS model is a multivariate extension of Rasch measurement models that can be used to provide a framework for calibrating both raters and writing tasks within the context of writing assessment. A FACETS model is described based on the current procedures…
Descriptors: Grade 8, Holistic Evaluation, Interrater Reliability, Item Response Theory
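The many-faceted Rasch (FACETS) model that recurs in these entries has a compact standard statement. In its rating-scale form, the log-odds that rater j awards writer n a score in category k rather than k-1 on task i are

\[
\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = \theta_n - \delta_i - \lambda_j - \tau_k,
\]

where \(\theta_n\) is the writer's ability, \(\delta_i\) the task's difficulty, \(\lambda_j\) the rater's severity, and \(\tau_k\) the threshold between adjacent score categories. Placing writers, tasks, and raters on a single logit scale is what allows the joint calibration of raters and tasks described in the abstract.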
Engelhard, George, Jr.; And Others – 1991
A study examined the influence of mode of discourse, experiential demand, and gender on the quality of student writing. All of the eighth-grade students (125,756) who participated in a statewide assessment of writing during spring 1989 and spring 1990 were included in the study. Eighteen writing tasks were administered during these two years. The…
Descriptors: Discourse Modes, Educational Diagnosis, Grade 8, Junior High Schools
Engelhard, George, Jr. – Applied Measurement in Education, 1992
A Many-Faceted Rasch Model (FACETS) for measurement of writing ability is described, and its use in solving measurement problems in large-scale assessment is illustrated with a random sample of 1,000 students from Georgia's Eighth Grade Writing Test. It is a promising approach to assessment through written compositions. (SLD)
Descriptors: Educational Assessment, Essays, Evaluation Problems, Grade 8