ERIC - Search Results

Showing 1 to 15 of 33 results

Validity Arguments for AI-Based Automated Scores: Essay Scoring as an Illustration

Peer reviewed

Ferrara, Steve; Qunbar, Saed – Journal of Educational Measurement, 2022

In this article, we argue that automated scoring engines should be transparent and construct relevant--that is, as much as is currently feasible. Many current automated scoring engines cannot achieve high degrees of scoring accuracy without allowing in some features that may not be easily explained and understood and may not be obviously and…

Descriptors: Artificial Intelligence, Scoring, Essays, Automation

Anchoring Validity Evidence for Automated Essay Scoring

Peer reviewed

Shermis, Mark D. – Journal of Educational Measurement, 2022

One of the challenges of discussing validity arguments for machine scoring of essays centers on the absence of a commonly held definition and theory of good writing. At best, the algorithms attempt to measure select attributes of writing and calibrate them against human ratings with the goal of accurate prediction of scores for new essays.…

Descriptors: Scoring, Essays, Validity, Writing Evaluation

A Two-Stage Method for Classroom Assessments of Essay Writing

Peer reviewed

Humphry, Stephen Mark; Heldsinger, Sandy – Journal of Educational Measurement, 2019

To capitalize on professional expertise in educational assessment, it is desirable to develop and test methods of rater-mediated assessment that enable classroom teachers to make reliable and informative judgments. Accordingly, this article investigates the reliability of a two-stage method used by classroom teachers to assess primary school…

Descriptors: Essays, Elementary School Students, Writing (Composition), Writing Evaluation

Modeling Basic Writing Processes from Keystroke Logs

Peer reviewed

Guo, Hongwen; Deane, Paul D.; van Rijn, Peter W.; Zhang, Mo; Bennett, Randy E. – Journal of Educational Measurement, 2018

The goal of this study is to model pauses extracted from writing keystroke logs as a way of characterizing the processes students use in essay composition. Low-level timing data were modeled, the interkey interval and its subtype, the intraword duration, thought to reflect processes associated with keyboarding skills and composition fluency.…

Descriptors: Writing Processes, Writing (Composition), Essays, Models

Autoscoring Essays Based on Complex Networks

Peer reviewed

Ke, Xiaohua; Zeng, Yongqiang; Luo, Haijiao – Journal of Educational Measurement, 2016

This article presents a novel method, the Complex Dynamics Essay Scorer (CDES), for automated essay scoring using complex network features. Texts produced by college students in China were represented as scale-free networks (e.g., a word adjacency model) from which typical network features, such as the in-/out-degrees, clustering coefficient (CC),…

Descriptors: Scoring, Automation, Essays, Networks

The Impact of Anonymization for Automated Essay Scoring

Peer reviewed

Shermis, Mark D.; Lottridge, Sue; Mayfield, Elijah – Journal of Educational Measurement, 2015

This study investigated the impact of anonymizing text on predicted scores made by two kinds of automated scoring engines: one that incorporates elements of natural language processing (NLP) and one that does not. Eight data sets (N = 22,029) were used to form both training and test sets in which the scoring engines had access to both text and…

Descriptors: Scoring, Essays, Computer Assisted Testing, Natural Language Processing

Comparing the Effectiveness of Self-Paced and Collaborative Frame-of-Reference Training on Rater Accuracy in a Large-Scale Writing Assessment

Peer reviewed

Raczynski, Kevin R.; Cohen, Allan S.; Engelhard, George, Jr.; Lu, Zhenqiu – Journal of Educational Measurement, 2015

There is a large body of research on the effectiveness of rater training methods in the industrial and organizational psychology literature. Less has been reported in the measurement literature on large-scale writing assessments. This study compared the effectiveness of two widely used rater training methods--self-paced and collaborative…

Descriptors: Interrater Reliability, Writing Evaluation, Training Methods, Pacing

A Hierarchical Rater Model for Constructed Responses, with a Signal Detection Rater Model

Peer reviewed

DeCarlo, Lawrence T.; Kim, YoungKoung; Johnson, Matthew S. – Journal of Educational Measurement, 2011

The hierarchical rater model (HRM) recognizes the hierarchical structure of data that arises when raters score constructed response items. In this approach, raters' scores are not viewed as being direct indicators of examinee proficiency but rather as indicators of essay quality; the (latent categorical) quality of an examinee's essay in turn…

Descriptors: Responses, Essay Tests, Models, Scores

Rater Effects on Essay Scoring: A Multilevel Analysis of Severity Drift, Central Tendency, and Rater Experience

Peer reviewed

Leckie, George; Baird, Jo-Anne – Journal of Educational Measurement, 2011

This study examined rater effects on essay scoring in an operational monitoring system from England's 2008 national curriculum English writing test for 14-year-olds. We fitted two multilevel models and analyzed: (1) drift in rater severity effects over time; (2) rater central tendency effects; and (3) differences in rater severity and central…

Descriptors: Scoring, Foreign Countries, National Curriculum, Writing Tests

The Use of Model Essays to Reduce Context Effects in Essay Scoring.

Peer reviewed

Hughes, David C.; Keeling, Brian – Journal of Educational Measurement, 1984

Several studies have shown that essays receive higher marks when preceded by poor quality scripts than when preceded by good quality scripts. This study investigated the effectiveness of providing scorers with model essays to reduce the influence of context. Context effects persisted despite the scoring procedures used. (Author/EGS)

Descriptors: Context Effect, Essay Tests, Essays, High Schools

Evaluating Rater Accuracy in Performance Assessments.

Peer reviewed

Engelhard, George, Jr. – Journal of Educational Measurement, 1996

A new method for evaluating rater accuracy within the context of performance assessments is described. It uses an extended Rasch measurement model, FACETS, which is illustrated with 373 benchmark papers from the Georgia High School Graduation Writing Test rated by 20 operational raters and an expert panel. (SLD)

Descriptors: Essay Tests, Evaluation Methods, Evaluators, Performance Based Assessment

A Comparison of Direct and Indirect Assessments of Writing Skill.

Peer reviewed

Breland, Hunter M.; Gaynor, Judith L. – Journal of Educational Measurement, 1979

Over 2,000 writing samples were collected from four undergraduate institutions and compared, where possible, with scores on a multiple-choice test. High correlations between ratings of the writing samples and multiple-choice test scores were obtained. Samples contributed substantially to the prediction of both college grades and writing…

Descriptors: Achievement Tests, Comparative Testing, Correlation, Essay Tests

Primary Essay Tests

Peer reviewed

Veal, L. Ramon; Biesbrock, Edieann F. – Journal of Educational Measurement, 1971

Reports the development of an experimental essay test for use with young children based on the now-obsolete STEP Essay Tests. The child's essay on an assigned topic is rated in comparison with previously rated and scaled essays which serve as performance models. (Author)

Descriptors: Essay Tests, Grade 2, Grade 3, Test Construction

A Model of Rater Behavior in Essay Grading Based on Signal Detection Theory

Peer reviewed

DeCarlo, Lawrence T. – Journal of Educational Measurement, 2005

An approach to essay grading based on signal detection theory (SDT) is presented. SDT offers a basis for understanding rater behavior with respect to the scoring of constructed responses, in that it provides a theory of psychological processes underlying the raters' behavior. The approach also provides measures of the precision of the raters and the…

Descriptors: Validity, Simulation, Grading, Item Response Theory

Contrast Effects in Evaluating Essays.

Peer reviewed

Daly, John A.; Dickson-Markman, Fran – Journal of Educational Measurement, 1982

The effect of the quality of preceding essays on judgments of the quality of a subsequent essay was investigated. Inservice teachers as judges failed to produce consistently and unambiguously biased judgments. Results suggest the presence of a positive bias and the absence of a negative bias. (Author/PN)

Descriptors: Bias, Cognitive Processes, Context Clues, Context Effect
