Publication Type: Reports - Evaluative (15); Journal Articles (12); Speeches/Meeting Papers (2); Tests/Questionnaires (1)
Assessments and Surveys: National Assessment of… (2)
Showing all 15 results
Peer reviewed
Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023
Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In existing research on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…
Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy
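The abstract above does not name a specific agreement statistic, but quadratic weighted kappa (QWK) is one measure commonly reported for human-automated score agreement in AES research. The sketch below is purely illustrative: the 1-6 score scale and the example score vectors are hypothetical, not taken from the cited study.

```python
# Illustrative only: quadratic weighted kappa (QWK), a statistic often used to
# quantify agreement between human and machine essay scores.
# The score scale and example data below are hypothetical.
from collections import Counter

def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Agreement between two integer score vectors: 1.0 = perfect, 0.0 = chance-level."""
    n_ratings = max_score - min_score + 1
    n = len(human)
    # Observed confusion matrix of (human, machine) score pairs.
    observed = [[0.0] * n_ratings for _ in range(n_ratings)]
    for h, m in zip(human, machine):
        observed[h - min_score][m - min_score] += 1.0
    # Marginal score histograms, used to build the chance-expected matrix.
    hist_h = Counter(h - min_score for h in human)
    hist_m = Counter(m - min_score for m in machine)
    numerator = 0.0
    denominator = 0.0
    for i in range(n_ratings):
        for j in range(n_ratings):
            weight = ((i - j) ** 2) / ((n_ratings - 1) ** 2)  # quadratic penalty
            expected = hist_h[i] * hist_m[j] / n
            numerator += weight * observed[i][j]
            denominator += weight * expected
    return 1.0 - numerator / denominator

# Hypothetical scores on a 1-6 scale for five essays.
human_scores = [4, 3, 5, 2, 4]
machine_scores = [4, 3, 4, 2, 5]
print(round(quadratic_weighted_kappa(human_scores, machine_scores, 1, 6), 3))  # ~0.808
```

Higher values indicate closer human-machine agreement; the quadratic weights penalise large score discrepancies more heavily than adjacent ones.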
Guskey, Thomas R.; Jung, Lee Ann – Educational Leadership, 2016
Many educators consider grades calculated from statistical algorithms more accurate, objective, and reliable than grades they calculate themselves. But in this research, the authors first asked teachers to use their professional judgment to choose a summary grade for hypothetical students. When the researchers compared the teachers' grade with the…
Descriptors: Grading, Computer Assisted Testing, Interrater Reliability, Grades (Scholastic)
Peer reviewed
McCurry, Doug – Assessing Writing, 2010
This article considers the claim that machine scoring of writing test responses agrees with human readers as much as humans agree with other humans. These claims about the reliability of machine scoring of writing are usually based on specific and constrained writing tasks, and there is reason for asking whether machine scoring of writing requires…
Descriptors: Writing Tests, Scoring, Interrater Reliability, Computer Assisted Testing
Peer reviewed
Mogey, Nora; Paterson, Jessie; Burk, John; Purcell, Michael – ALT-J: Research in Learning Technology, 2010
Students at the University of Edinburgh do almost all their work on computers, but at the end of the semester they are examined through handwritten essays. Intuitively it would be appealing to allow students the choice of handwriting or typing, but this raises a concern that this might not be "fair"--that the choice a student makes,…
Descriptors: Handwriting, Essay Tests, Interrater Reliability, Grading
Peer reviewed
Coniam, David – Educational Research and Evaluation, 2009
This paper describes a study comparing paper-based marking (PBM) and onscreen marking (OSM) in Hong Kong utilising English language essay scripts drawn from the live 2007 Hong Kong Certificate of Education Examination (HKCEE) Year 11 English Language Writing Paper. In the study, 30 raters from the 2007 HKCEE Writing Paper marked on paper 100…
Descriptors: Student Attitudes, Foreign Countries, Essays, Comparative Analysis
Peer reviewed
Coniam, David – ReCALL, 2009
This paper describes a study of the computer essay-scoring program BETSY. While the use of computers in rating written scripts has been criticised in some quarters for lacking transparency or for fitting poorly with how human raters rate written scripts, a number of essay rating programs are available commercially, many of which claim to offer comparable…
Descriptors: Writing Tests, Scoring, Foreign Countries, Interrater Reliability
Peer reviewed
Shaw, Stuart – E-Learning, 2008
Computer-assisted assessment offers many benefits over traditional paper methods. However, in transferring from one medium to another, it is crucial to ascertain the extent to which the new medium may alter the nature of traditional assessment practice or affect marking reliability. Whilst there is a substantial body of research comparing marking…
Descriptors: Construct Validity, Writing Instruction, Computer Assisted Testing, Student Evaluation
Peer reviewed
Pare, D. E.; Joordens, S. – Journal of Computer Assisted Learning, 2008
As class sizes increase, methods of assessment shift from costly traditional approaches (e.g. expert-graded writing assignments) to more economical and logistically feasible methods (e.g. multiple-choice testing, computer-automated scoring, or peer assessment). While each method of assessment has its merits, it is peer assessment in particular,…
Descriptors: Writing Assignments, Undergraduate Students, Teaching Assistants, Peer Evaluation
Peer reviewed
Wen, Meichun Lydia; Tsai, Chin-Chung – Teaching in Higher Education, 2008
Online or web-based peer assessment is a valuable and effective way to help the learner to examine his or her learning progress, and teachers need to be familiar with the practice before they use it in their classrooms. Therefore, the purpose of our study was to design an online peer assessment activity for 37 inservice science and mathematics…
Descriptors: Teacher Education Curriculum, Education Courses, Peer Evaluation, Research Methodology
Peer reviewed
Page, Ellis Batten – Journal of Experimental Education, 1994
National Assessment of Educational Progress writing sample essays from 1988 and 1990 (495 and 599 essays) were subjected to computerized grading and human ratings. Cross-validation suggests that computer scoring is superior to a two-judge panel, a finding encouraging for large programs of essay evaluation. (SLD)
Descriptors: Computer Assisted Testing, Computer Software, Essays, Evaluation Methods
Bennett, Randy Elliot; Rock, Donald A. – 1993
Formulating-Hypotheses (F-H) items present a situation and ask the examinee to generate as many explanations for it as possible. This study examined the generalizability, validity, and examinee perceptions of a computer-delivered version of the task. Eight F-H questions were administered to 192 graduate students. Half of the items restricted…
Descriptors: Computer Assisted Testing, Difficulty Level, Generalizability Theory, Graduate Students
Peer reviewed
McGhee, Debbie E.; Lowell, Nana – New Directions for Teaching and Learning, 2003
This study compares mean ratings, inter-rater reliabilities, and the factor structure of items for online and paper student-rating forms from the University of Washington's Instructional Assessment System. (Contains 3 figures and 2 tables.)
Descriptors: Psychometrics, Factor Structure, Student Evaluation of Teacher Performance, Test Items
Cason, Gerald J.; And Others – 1987
The Objective Test Scoring and Performance Rating (OTS-PR) system is a fully integrated set of 70 modular FORTRAN programs run on a VAX-8530 computer. Even with no knowledge of computers, the user can implement OTS-PR to score multiple-choice tests, include scores from external sources such as hand-scored essays or scores from nationally…
Descriptors: Clinical Experience, Computer Assisted Testing, Educational Assessment, Essay Tests
Peer reviewed
Lee, H. K. – Assessing Writing, 2004
This study aimed to comprehensively investigate the impact of a word-processor on an ESL writing assessment, covering comparison of inter-rater reliability, the quality of written products, the writing process across different testing occasions using different writing media, and students' perception of a computer-delivered test. Writing samples of…
Descriptors: Writing Evaluation, Student Attitudes, Writing Tests, Testing
Peer reviewed
Yang, Yongwei; Buckendahl, Chad W.; Juszkiewicz, Piotr J.; Bhola, Dennison S. – Journal of Applied Testing Technology, 2005
With the continual progress of computer technologies, computer automated scoring (CAS) has become a popular tool for evaluating writing assessments. Research on applications of these methodologies to new types of performance assessments is still emerging. While research has generally shown high agreement between CAS system-generated scores and those…
Descriptors: Scoring, Validity, Interrater Reliability, Comparative Analysis