Showing all 13 results
Peer reviewed
Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017
Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…
Descriptors: Automation, Scoring, Comparative Analysis, Test Items
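A minimal sketch (not from the paper) of the item-by-item comparison the abstract describes: score every automated rater against the human ratings on each item with quadratic weighted kappa, rank the raters within each item, and average those ranks. All data structures and names here are hypothetical; scores are assumed to be integers in 0..n_cats-1.

    import numpy as np

    def quadratic_weighted_kappa(a, b, n_cats):
        """Quadratic weighted kappa between two integer score vectors."""
        a, b = np.asarray(a), np.asarray(b)
        observed = np.zeros((n_cats, n_cats))
        for i, j in zip(a, b):
            observed[i, j] += 1
        expected = np.outer(np.bincount(a, minlength=n_cats),
                            np.bincount(b, minlength=n_cats)) / len(a)
        weights = np.array([[(i - j) ** 2 for j in range(n_cats)]
                            for i in range(n_cats)]) / (n_cats - 1) ** 2
        return 1.0 - (weights * observed).sum() / (weights * expected).sum()

    def rank_raters(machine_scores, human_scores, n_cats):
        """machine_scores[item][rater] and human_scores[item] are score vectors."""
        items = sorted(human_scores)
        raters = sorted(next(iter(machine_scores.values())))
        # Agreement of each automated rater with the human ratings, per item.
        kappa = {it: {r: quadratic_weighted_kappa(machine_scores[it][r],
                                                  human_scores[it], n_cats)
                      for r in raters} for it in items}
        # Rank within each item (1 = best agreement), then average across items.
        return {r: float(np.mean([1 + sum(kappa[it][q] > kappa[it][r]
                                          for q in raters) for it in items]))
                for r in raters}

Swapping quadratic weighted kappa for, say, exact agreement in this sketch can reorder the raters, which is the kind of procedure-dependence the abstract raises.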
Peer reviewed
Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M. – ETS Research Report Series, 2015
Automated scoring models were trained and evaluated for the essay task in the "Praxis I"® writing test. Prompt-specific and generic "e-rater"® scoring models were built, and evaluation statistics, such as quadratic weighted kappa, Pearson correlation, and standardized differences in mean scores, were examined to evaluate the…
Descriptors: Writing Tests, Licensing Examinations (Professions), Teacher Competency Testing, Scoring
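A hedged illustration (not from the report) of the evaluation statistics named in the abstract, computed for one invented pair of human and automated score vectors; quadratic weighted kappa can be computed as in the sketch following the Kieftenbeld and Boyer entry above.

    import numpy as np
    from scipy.stats import pearsonr

    human = np.array([3, 4, 2, 5, 3, 4, 4, 2, 3, 5])    # hypothetical human scores
    machine = np.array([3, 4, 3, 4, 3, 4, 5, 2, 3, 4])  # hypothetical automated scores

    r, _ = pearsonr(human, machine)  # Pearson correlation
    # Standardized difference in mean scores (machine minus human, pooled SD).
    pooled_sd = np.sqrt((human.var(ddof=1) + machine.var(ddof=1)) / 2)
    std_diff = (machine.mean() - human.mean()) / pooled_sd
    print(f"Pearson r = {r:.3f}, standardized mean difference = {std_diff:.3f}")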
Holifield-Scott, April – ProQuest LLC, 2011
A study was conducted to determine the extent to which high school and college/university Advanced Placement English Language and Composition readers value and implement the curricular requirements of Advanced Placement English Language and Composition. The participants were 158 readers of the 2010 Advanced Placement English Language and…
Descriptors: Advanced Placement, English Instruction, Writing (Composition), English Curriculum
Peer reviewed
Barkaoui, Khaled – Assessment in Education: Principles, Policy & Practice, 2011
This study examined the effects of marking method and rater experience on ESL (English as a Second Language) essay test scores and rater performance. Each of 31 novice and 29 experienced raters rated a sample of ESL essays both holistically and analytically. Essay scores were analysed using a multi-faceted Rasch model to compare test-takers'…
Descriptors: Writing Evaluation, Writing Tests, Essay Tests, Interrater Reliability
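As a hedged sketch of the model family typically meant by "multi-faceted Rasch model" (assumed here, not taken from the article), a many-facet Rasch model with facets for test-takers, criteria, and raters expresses the log-odds of test-taker n being awarded category k rather than k-1 on criterion i by rater j as

\log\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k

where B_n is the test-taker's proficiency, D_i the difficulty of criterion i, C_j the severity of rater j, and F_k the difficulty of the step up to category k.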
Peer reviewed
Johnson, Martin; Nadas, Rita; Bell, John F. – British Journal of Educational Technology, 2010
There is a growing body of research literature that considers how the mode of assessment, either computer-based or paper-based, might affect candidates' performances. Far less attention has been paid to those making the assessment judgements, and to issues of assessor consistency when…
Descriptors: English Literature, Examiners, Evaluation Research, Evaluators
Peer reviewed
Mogey, Nora; Paterson, Jessie; Burk, John; Purcell, Michael – ALT-J: Research in Learning Technology, 2010
Students at the University of Edinburgh do almost all their work on computers, but at the end of the semester they are examined by handwritten essays. Intuitively it would be appealing to allow students the choice of handwriting or typing, but this raises a concern that perhaps this might not be "fair"--that the choice a student makes,…
Descriptors: Handwriting, Essay Tests, Interrater Reliability, Grading
Peer reviewed
Coniam, David – ReCALL, 2009
This paper describes a study of the computer essay-scoring program BETSY. While the use of computers in rating written scripts has been criticised in some quarters for lacking transparency or for fitting poorly with how human raters rate written scripts, a number of essay rating programs are available commercially, many of which claim to offer comparable…
Descriptors: Writing Tests, Scoring, Foreign Countries, Interrater Reliability
Breland, Hunter M.; Jones, Robert J. – 1988
The reliability, validity, and score discrepancies of 94 expository essays scored in conference versus remote settings were studied. Focus was on comparing holistic ratings obtained in both settings. Essays written by college freshmen on two different topics were scored by readers working in a conference setting and by different readers working in…
Descriptors: College Freshmen, Comparative Analysis, Conferences, Essay Tests
Sireci, Stephen G.; Rizavi, Saba – 2000
Although computer-based testing is becoming popular, many of these tests are limited to the use of selected-response item formats due to the difficulty in mechanically scoring constructed-response items. This limitation is unfortunate because many constructs, such as writing proficiency, can be measured more directly using items that require…
Descriptors: College Students, Comparative Analysis, Computer Uses in Education, Essay Tests
De Ayala, R. J.; And Others – 1989
The graded response (GR) model of Samejima (1969) and the partial credit model (PC) of Masters (1982) were fitted to identical writing samples that were holistically scored. The performance and relative benefits of each model were then evaluated. Writing samples were both expository and narrative. Data were from statewide assessments of secondary…
Descriptors: Comparative Analysis, Essay Tests, Holistic Evaluation, Interrater Reliability
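For reference, the two item response theory models compared here have standard forms; the notation below follows common textbook conventions and is not taken from the report. Samejima's graded response model gives the probability of a response in category k or higher on item i as a two-parameter logistic in the ability \theta, with category probabilities as differences of adjacent boundary curves:

P^{*}_{ik}(\theta) = \frac{1}{1 + \exp[-a_i(\theta - b_{ik})]}, \qquad P_{ik}(\theta) = P^{*}_{ik}(\theta) - P^{*}_{i(k+1)}(\theta)

Masters's partial credit model instead models the probability of a score of k on item i directly:

P_{ik}(\theta) = \frac{\exp\sum_{j=1}^{k}(\theta - \delta_{ij})}{\sum_{r=0}^{m_i}\exp\sum_{j=1}^{r}(\theta - \delta_{ij})}

with the empty sum for r = 0 (or k = 0) taken as zero and \delta_{ij} the step difficulties.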
Bridgeman, Brent; Cooper, Peter – 1998
Essays for the Graduate Management Admissions Test must be written with a word processor (except in some foreign countries). The test sponsor, the Graduate Management Admissions Council, believed this to be fair because some word processing skill is a prerequisite for advanced management education. It might also be unfair to…
Descriptors: College Entrance Examinations, College Students, Comparative Analysis, Essay Tests
Tiffany, Gerald E.; And Others – 1991
In 1991, a student learning outcomes assessment was conducted at Wenatchee Valley College, Washington. All English 101 students in the winter and spring quarters of 1990 wrote a 2-hour final exam. Winter quarter students wrote on the same topic while spring quarter students wrote on one of three randomly assigned topics. Five English 101…
Descriptors: Community Colleges, Comparative Analysis, Curriculum Evaluation, Essay Tests
Peer reviewed
Linn, Robert L.; And Others – Applied Measurement in Education, 1992
Ten states participated in a cross-state scoring workshop in 1991, evaluating writing from elementary school, middle school, and high school students. Correlations of scores assigned by readers from one state with those assigned by readers from another state were generally quite high. Implications for defining common standards are discussed. (SLD)
Descriptors: Comparative Analysis, Correlation, Elementary School Students, Elementary Secondary Education