Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 2
Since 2006 (last 20 years): 5
Descriptor
Writing Evaluation: 9
Item Response Theory: 6
Interrater Reliability: 4
Writing Tests: 4
Grade 8: 3
Junior High Schools: 3
Measurement Techniques: 3
Scores: 3
Writing (Composition): 3
Accuracy: 2
Educational Assessment: 2
Source
Applied Measurement in Education: 1
Assessing Writing: 1
College Board: 1
Educational and Psychological Measurement: 1
International Journal of Testing: 1
Journal of Educational Measurement: 1
Author
Engelhard, George, Jr.: 9
Wind, Stefanie A.: 2
Wolfe, Edward W.: 2
Behizadeh, Nadia: 1
Chajewski, Michael: 1
Cohen, Allan S.: 1
Foltz, Peter: 1
Kobrin, Jennifer L.: 1
Lu, Zhenqiu: 1
Raczynski, Kevin R.: 1
Rosenstein, Mark: 1
Publication Type
Reports - Research: 6
Journal Articles: 5
Speeches/Meeting Papers: 4
Reports - Evaluative: 3
Tests/Questionnaires: 1
Education Level
Higher Education: 1
Postsecondary Education: 1
Assessments and Surveys
SAT (College Admission Test): 1
Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…
Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring
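The excerpt names the general recipe (machine learning from human ratings) but not any specific engine, feature set, or learner. As a minimal sketch under those assumptions, the Python fragment below fits a ridge regression from bag-of-words features to human holistic scores; every name and data value in it is hypothetical.

    # Illustrative AESE "training" step: learn a mapping from essay text
    # to human ratings. The features, learner, and data are assumptions,
    # not the engines studied in the article.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import Ridge
    from sklearn.pipeline import make_pipeline

    essays = [  # hypothetical training essays with human scores (1-6 scale)
        "The author argues that testing can improve writing instruction.",
        "i think school is good because it is good.",
    ]
    human_scores = [5.0, 2.0]

    engine = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
    engine.fit(essays, human_scores)

    # Score an unseen essay; an operational engine would also be checked
    # for agreement with held-out human ratings before deployment.
    print(engine.predict(["Standardized tests measure only part of writing."]))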
Wang, Jue; Engelhard, George, Jr.; Wolfe, Edward W. – Educational and Psychological Measurement, 2016
The number of performance assessments continues to increase around the world, and it is important to explore new methods for evaluating the quality of ratings obtained from raters. This study describes an unfolding model for examining rater accuracy. Accuracy is defined as the difference between observed and expert ratings. Dichotomous accuracy…
Descriptors: Evaluators, Accuracy, Performance Based Assessment, Models
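The excerpt defines accuracy as the difference between observed and expert ratings and refers to a model for dichotomous accuracy, but does not reproduce its equation. As a purely illustrative stand-in, one standard unfolding model for dichotomous data is Andrich and Luo's hyperbolic cosine model:

\[
P(X_{ni} = 1 \mid \beta_n, \delta_i, \lambda_i) = \frac{\exp(\lambda_i)}{\exp(\lambda_i) + 2\cosh(\beta_n - \delta_i)},
\]

where \(\beta_n\) and \(\delta_i\) are the locations of rater n and essay i on the latent continuum and \(\lambda_i\) is a unit parameter. The response function is single-peaked: the probability of an accurate rating is highest when the two locations coincide and falls off in either direction, which is the defining unfolding property.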
Raczynski, Kevin R.; Cohen, Allan S.; Engelhard, George, Jr.; Lu, Zhenqiu – Journal of Educational Measurement, 2015
There is a large body of research on the effectiveness of rater training methods in the industrial and organizational psychology literature. Less has been reported in the measurement literature on large-scale writing assessments. This study compared the effectiveness of two widely used rater training methods--self-paced and collaborative…
Descriptors: Interrater Reliability, Writing Evaluation, Training Methods, Pacing
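One concrete way to compare two training conditions is to measure how closely each trained rater agrees with expert scores. The excerpt does not name the study's statistic; the sketch below assumes quadratically weighted kappa, a common choice for ordinal essay ratings, and all ratings in it are hypothetical.

    # Compare rater-training conditions by agreement with expert scores.
    # Quadratic-weighted kappa and the ratings below are assumptions.
    from sklearn.metrics import cohen_kappa_score

    expert        = [4, 3, 5, 2, 4, 3, 1, 5]  # hypothetical expert ratings
    self_paced    = [4, 3, 4, 2, 5, 3, 2, 5]  # rater after self-paced training
    collaborative = [4, 2, 5, 2, 4, 4, 1, 5]  # rater after collaborative training

    for label, ratings in [("self-paced", self_paced),
                           ("collaborative", collaborative)]:
        kappa = cohen_kappa_score(expert, ratings, weights="quadratic")
        print(f"{label}: weighted kappa = {kappa:.3f}")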
Engelhard, George, Jr.; Wind, Stefanie A.; Kobrin, Jennifer L.; Chajewski, Michael – College Board, 2013
The purpose of this study is to illustrate the use of explanatory models based on Rasch measurement theory to detect systematic relationships between student and item characteristics and achievement differences using differential item functioning (DIF), differential group functioning (DGF), and differential person functioning (DPF) techniques. The…
Descriptors: Test Bias, Evaluation Methods, Measurement Techniques, Writing Evaluation
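The report's explanatory Rasch models are not reproduced in the excerpt, but the basic DIF contrast they build on can be sketched: estimate item i's difficulty separately in a focal and a reference group and test the standardized difference,

\[
z_i = \frac{\hat{\delta}_{i,\mathrm{focal}} - \hat{\delta}_{i,\mathrm{reference}}}{\sqrt{SE_{i,\mathrm{focal}}^{2} + SE_{i,\mathrm{reference}}^{2}}},
\]

with large \(|z_i|\) flagging the item for DIF. The DGF and DPF techniques named in the abstract apply analogous contrasts at other levels of analysis.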
Behizadeh, Nadia; Engelhard, George, Jr. – Assessing Writing, 2011
The purpose of this study is to examine the interactions among measurement theories, writing theories, and writing assessments in the United States from an historical perspective. The assessment of writing provides a useful framework for examining how theories influence, and in some cases fail to influence, actual practice. Two research traditions…
Descriptors: Writing (Composition), Intellectual Disciplines, Writing Evaluation, Writing Tests
Engelhard, George, Jr.; And Others – 1994
A set of procedures is described for constructing an assessment network composed of a connected system of rater and writing task banks within the context of large-scale assessments of written composition. The calibration of the assessment tasks and the measurement of individuals are viewed as separate, although complementary, activities. The…
Descriptors: Data Collection, Educational Assessment, Interrater Reliability, Item Banks
Engelhard, George, Jr. – 1991
A many-faceted Rasch model (FACETS) is presented for the measurement of writing ability. The FACETS model is a multivariate extension of Rasch measurement models that can be used to provide a framework for calibrating both raters and writing tasks within the context of writing assessment. A FACETS model is described based on the current procedures…
Descriptors: Grade 8, Holistic Evaluation, Interrater Reliability, Item Response Theory
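The many-faceted Rasch (FACETS) model that recurs in these entries has a compact standard statement. In its rating-scale form, the log-odds that rater j awards writer n a score in category k rather than k-1 on task i are

\[
\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = \theta_n - \delta_i - \lambda_j - \tau_k,
\]

where \(\theta_n\) is the writer's ability, \(\delta_i\) the task's difficulty, \(\lambda_j\) the rater's severity, and \(\tau_k\) the threshold between adjacent score categories. Placing writers, tasks, and raters on a single logit scale is what allows the joint calibration of raters and tasks described in the abstract.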
Engelhard, George, Jr.; And Others – 1991
A study examined the influence of mode of discourse, experiential demand, and gender on the quality of student writing. All of the eighth-grade students (125,756) who participated in a statewide assessment of writing during spring 1989 and spring 1990 were included in the study. Eighteen writing tasks were administered during these two years. The…
Descriptors: Discourse Modes, Educational Diagnosis, Grade 8, Junior High Schools
Engelhard, George, Jr. – Applied Measurement in Education, 1992
A Many-Faceted Rasch Model (FACETS) for measurement of writing ability is described, and its use in solving measurement problems in large-scale assessment is illustrated with a random sample of 1,000 students from Georgia's Eighth Grade Writing Test. It is a promising approach to assessment through written compositions. (SLD)
Descriptors: Educational Assessment, Essays, Evaluation Problems, Grade 8