Showing 1 to 15 of 16 results
Peer reviewed
PDF full text available on ERIC
Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023
Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In existing research on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…
Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy
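The abstract does not name a specific agreement statistic, but quadratically weighted kappa (QWK) is the metric most commonly used to quantify human-automated score agreement in AES research. A minimal sketch in Python, assuming integer holistic scores and hypothetical data:

```python
# Minimal sketch: quadratic weighted kappa (QWK) between human and
# machine-assigned essay scores. QWK is a common human-automated
# agreement metric in AES research; the scores below are illustrative.
from sklearn.metrics import cohen_kappa_score

human_scores = [3, 4, 2, 5, 3, 4, 1, 3]    # hypothetical human ratings
machine_scores = [3, 4, 3, 5, 2, 4, 1, 3]  # hypothetical AES outputs

# weights="quadratic" penalizes large score disagreements more heavily
qwk = cohen_kappa_score(human_scores, machine_scores, weights="quadratic")
print(f"QWK = {qwk:.3f}")
```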
Peer reviewed
Direct link
Zhang, Xiuyuan – AERA Online Paper Repository, 2019
The main purpose of the study is to evaluate the qualities of human essay ratings for a large-scale assessment using Rasch measurement theory. Specifically, Many-Facet Rasch Measurement (MFRM) was utilized to examine the rating scale category structure and provide important information about interpretations of ratings in the large-scale…
Descriptors: Essays, Evaluators, Writing Evaluation, Reliability
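For reference, the Many-Facet Rasch Measurement model named in the abstract is conventionally written in a three-facet rating-scale form (a standard textbook formulation, not quoted from this paper):

```latex
% Standard three-facet MFRM (rating-scale form), shown for reference;
% facet labels are generic, not specific to this study.
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right)
  = \theta_n - \delta_i - \alpha_j - \tau_k
% \theta_n: ability of examinee n
% \delta_i: difficulty of item/domain i
% \alpha_j: severity of rater j
% \tau_k:  difficulty of category k relative to category k-1
```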
Peer reviewed
PDF full text available on ERIC
Allen, Laura K.; Likens, Aaron D.; McNamara, Danielle S. – Grantee Submission, 2017
The current study examined the degree to which the quality and characteristics of students' essays could be modeled through dynamic natural language processing analyses. Undergraduate students (n = 131) wrote timed, persuasive essays in response to an argumentative writing prompt. Recurrent patterns of the words in the essays were then analyzed…
Descriptors: Writing Evaluation, Essays, Persuasive Discourse, Natural Language Processing
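The "recurrent patterns of the words" analysis suggests recurrence quantification applied to each essay's word sequence. A minimal sketch, assuming simple binary recurrence of identical words (the study's actual dynamic NLP features may differ):

```python
# Minimal sketch: recurrence rate of words in an essay's word sequence,
# in the spirit of recurrence quantification analysis (RQA). This is an
# illustration only; the study's feature set may differ.
def recurrence_rate(text: str) -> float:
    words = text.lower().split()
    n = len(words)
    if n < 2:
        return 0.0
    # Count off-diagonal recurrence points: positions i != j holding
    # the same word.
    recurrent = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and words[i] == words[j]
    )
    return recurrent / (n * n - n)

essay = "the argument is strong because the argument uses strong evidence"
print(f"recurrence rate = {recurrence_rate(essay):.3f}")
```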
Peer reviewed
PDF full text available on ERIC
Roscoe, Rod D.; Crossley, Scott A.; Snow, Erica L.; Varner, Laura K.; McNamara, Danielle S. – Grantee Submission, 2014
Automated essay scoring tools are often criticized on the basis of construct validity. Specifically, it has been argued that computational scoring algorithms may not be aligned with higher-level indicators of quality writing, such as writers' demonstrated knowledge and understanding of the essay topics. In this paper, we consider how and whether the…
Descriptors: Correlation, Essays, Scoring, Writing Evaluation
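One way to probe the construct-validity concern described above is to correlate automated scores with an independent, higher-level indicator such as human-rated topic knowledge. A minimal sketch with hypothetical values (not data from this paper):

```python
# Minimal sketch: Pearson correlation between automated essay scores
# and human ratings of demonstrated topic knowledge. All values are
# hypothetical; this only illustrates the kind of alignment check the
# construct-validity critique calls for.
from scipy.stats import pearsonr

aes_scores = [2.5, 3.0, 3.5, 4.0, 4.5, 2.0, 3.5, 5.0]
topic_knowledge = [2, 3, 3, 4, 4, 3, 2, 5]  # hypothetical human ratings

r, p = pearsonr(aes_scores, topic_knowledge)
print(f"r = {r:.2f}, p = {p:.3f}")
```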
Peer reviewed Peer reviewed
Engelhard, George, Jr. – Journal of Educational Measurement, 1996
A new method for evaluating rater accuracy within the context of performance assessments is described. It uses an extended Rasch measurement model, FACETS, which is illustrated with 373 benchmark papers from the Georgia High School Graduation Writing Test rated by 20 operational raters and an expert panel. (SLD)
Descriptors: Essay Tests, Evaluation Methods, Evaluators, Performance Based Assessment
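Accuracy in this line of work means comparing each operational rater's scores against expert-panel ratings of the same benchmark papers. A minimal sketch of the raw agreement idea (the study itself embeds accuracy in an extended Rasch/FACETS model, which this does not reproduce):

```python
# Minimal sketch: per-rater exact-agreement accuracy against expert
# benchmark ratings. Scores are hypothetical.
expert = [4, 3, 5, 2, 4, 3]  # hypothetical expert-panel benchmark scores
rater_scores = {
    "rater_A": [4, 3, 5, 2, 3, 3],
    "rater_B": [4, 4, 4, 2, 4, 2],
}

for rater, scores in rater_scores.items():
    hits = sum(r == e for r, e in zip(scores, expert))
    print(f"{rater}: exact agreement = {hits / len(expert):.2f}")
```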
Manalo, Jonathan R.; Wolfe, Edward W. – 2000
Recently, the Test of English as a Foreign Language (TOEFL) changed to include a writing section that gives examinees the option of composing their responses in either computer or handwritten format. Unfortunately, this may introduce several potential sources of error that might reduce the reliability and validity of the scores. The seriousness…
Descriptors: Computer Assisted Testing, Essay Tests, Evaluators, Handwriting
Page, Ellis B.; Poggio, John P.; Keith, Timothy Z. – 1997
Most human gradings of essays are holistic, or "overall." Therefore, Project Essay Grade (PEG), an attempt to develop computerized grading of essays, has concentrated most of its research on overall grading. It has successfully simulated human judges. However, since computer grading is less expensive than human grading, PEG has also…
Descriptors: Computer Assisted Testing, Elementary Secondary Education, Essays, Evaluators
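PEG's classic approach predicted holistic human scores from measurable text features ("proxes") via regression. A minimal sketch with a few crude, hypothetical features (not PEG's actual feature set or data):

```python
# Minimal sketch: regression from surface text features to a holistic
# score, in the spirit of PEG's proxes-based approach. Features, essays,
# and scores are hypothetical.
from sklearn.linear_model import LinearRegression

def features(essay: str):
    words = essay.split()
    return [
        len(words),                                        # essay length
        sum(len(w) for w in words) / max(len(words), 1),   # mean word length
        essay.count(".") + essay.count("!") + essay.count("?"),  # sentence count
    ]

essays = [
    "Short essay. Few ideas.",
    "A longer essay with somewhat more developed ideas. It continues. It concludes.",
    "An essay of moderate length. It has several sentences. Reasonable vocabulary here.",
]
human_scores = [2, 4, 3]  # hypothetical holistic ratings

model = LinearRegression().fit([features(e) for e in essays], human_scores)
print(model.predict([features("Another essay to score. It is brief.")]))
```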
Wolfe, Edward W.; Kao, Chi-Wen – 1996
This paper reports the results of an analysis of the relationship between scorer behaviors and score variability. Thirty-six essay scorers were interviewed and asked to perform a think-aloud task as they scored 24 essays. Each comment made by a scorer was coded according to its content focus (i.e. appearance, assignment, mechanics, communication,…
Descriptors: Content Analysis, Educational Assessment, Essays, Evaluation Methods
Gyagenda, Ismail S.; Engelhard, George, Jr. – 1998
The purpose of this study was to examine rater, domain, and gender influences on the assessed quality of student writing using weighted and unweighted scores. Twenty raters were randomly selected from a group of 87 operational raters contracted to rate essays as part of the 1993 field test of the Georgia High School Writing Test. All of the raters…
Descriptors: Essay Tests, Evaluators, High School Students, High Schools
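The weighted/unweighted distinction in the abstract comes down to whether domain scores enter the composite with equal or differential weights. A minimal sketch, with hypothetical domains and weights (not the Georgia test's actual rubric):

```python
# Minimal sketch: unweighted vs weighted composite of domain scores for
# one essay. Domains and weights are hypothetical.
domain_scores = {"content": 4, "style": 3, "conventions": 2, "organization": 4}
weights = {"content": 2.0, "style": 1.0, "conventions": 1.0, "organization": 1.5}

unweighted = sum(domain_scores.values())
weighted = sum(weights[d] * s for d, s in domain_scores.items())
print(f"unweighted = {unweighted}, weighted = {weighted}")
```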
Gyagenda, Ismail S.; Engelhard, George, Jr. – 1998
The purpose of this study was to describe the Rasch model for measurement and apply the model to examine the relationship between raters, domains of written compositions, and student writing ability. Twenty raters were randomly selected from a group of 87 operational raters contracted to rate essays as part of the 1993 field test of the Georgia…
Descriptors: Difficulty Level, Essay Tests, Evaluators, High School Students
Wolfe, Edward W.; Feltovich, Brian – 1994
This paper presents a model of scorer cognition that incorporates two types of mental models: models of performance (i.e., the criteria for judging performance) and models of scoring (i.e., the procedural scripts for scoring an essay). In Study 1, six novice and five experienced scorers wrote definitions of three levels of a 6-point holistic…
Descriptors: Cognitive Processes, Criteria, Essays, Evaluation Methods
Lunz, Mary E.; Stahl, John A. – 1990
Three examinations administered to medical students were analyzed to determine differences in judge severity across grading periods. The examinations included essay, clinical, and oral forms of the tests. Twelve judges graded the three essays for 32 examinees during a 4-day grading session, which was divided into eight…
Descriptors: Clinical Diagnosis, Comparative Testing, Difficulty Level, Essay Tests
Linacre, John M. – 1990
Rank ordering examinees is an easier task for judges than is awarding numerical ratings. A measurement model for rankings based on Rasch's objectivity axioms provides linear, sample-independent and judge-independent measures. Estimates of examinee measures are obtained from the data set of rankings, along with standard errors and fit statistics.…
Descriptors: Comparative Analysis, Error of Measurement, Essay Tests, Evaluators
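Rank-order models of this kind are usually understood through the Rasch paired-comparison form into which a ranking decomposes; in the standard formulation (shown for reference, not quoted from the paper), the probability that examinee n is ranked above examinee m depends only on the difference of their measures:

```latex
% Standard Rasch paired-comparison form underlying rank-order models,
% shown for reference. \theta_n and \theta_m are examinee measures.
P(n \text{ ranked above } m)
  = \frac{\exp(\theta_n - \theta_m)}{1 + \exp(\theta_n - \theta_m)}
```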
Ferrara, Steven F. – 1987
The necessity of controlling the order in which trained essay raters for a statewide writing assessment program receive student essays was studied. The underlying theoretical question concerns possible rater bias caused by raters reading long strings of essays of homogeneous quality; this problem is usually referred to as context effect or…
Descriptors: Context Effect, Essay Tests, Evaluators, Graduation Requirements
Steele, Joe M. – 1979
The College Outcome Measures Project/American College Testing Program (COMP/ACT) Writing Assessment is described, and issues of validity and reliability in the assessment of writing samples using qualitative rating scales are explored. COMP/ACT is composed of three role-playing tasks in the social sciences, natural sciences, and arts, which are…
Descriptors: Adults, Essay Tests, Evaluators, Higher Education