Showing all 11 results
Peer reviewed
Shermis, Mark D. – Journal of Educational Measurement, 2022
One of the challenges of discussing validity arguments for machine scoring of essays centers on the absence of a commonly held definition and theory of good writing. At best, the algorithms attempt to measure select attributes of writing and calibrate them against human ratings with the goal of accurate prediction of scores for new essays.…
Descriptors: Scoring, Essays, Validity, Writing Evaluation
Peer reviewed
Shermis, Mark D. – Applied Measurement in Education, 2018
This article employs the Common European Framework of Reference for Languages (CEFR) as a basis for evaluating writing in the context of machine scoring. The CEFR was designed as a framework for evaluating proficiency levels of speaking for the 49 languages comprising the European Union. The intent was to impact language instruction so…
Descriptors: Scoring, Automation, Essays, Language Proficiency
Peer reviewed
Shermis, Mark D.; Lottridge, Sue; Mayfield, Elijah – Journal of Educational Measurement, 2015
This study investigated the impact of anonymizing text on predicted scores made by two kinds of automated scoring engines: one that incorporates elements of natural language processing (NLP) and one that does not. Eight data sets (N = 22,029) were used to form both training and test sets in which the scoring engines had access to both text and…
Descriptors: Scoring, Essays, Computer Assisted Testing, Natural Language Processing
Peer reviewed
Shermis, Mark D.; Mao, Liyang; Mulholland, Matthew; Kieftenbeld, Vincent – International Journal of Testing, 2017
This study uses the feature sets employed by two automated scoring engines to determine if a "linguistic profile" could be formulated that would help identify items that are likely to exhibit differential item functioning (DIF) based on linguistic features. Sixteen items were administered to 1200 students where demographic information…
Descriptors: Computer Assisted Testing, Scoring, Hypothesis Testing, Essays
Shermis, Mark D.; Garvan, Cynthia Wilson; Diao, Yanbo – Online Submission, 2008
This study was an expanded replication of an earlier endeavor (Shermis, Burstein, & Bliss, 2004) to document the writing outcomes associated with automated essay scoring. The focus of the current study was on determining whether exposure to multiple writing prompts facilitated writing production variables (Essay Score, Essay Length, and Number…
Descriptors: Scoring, Essays, Grade 8, Grade 6
Peer reviewed
Shermis, Mark D.; Shneyderman, Aleksandr; Attali, Yigal – Assessment in Education: Principles, Policy & Practice, 2008
This study was designed to examine the extent to which "content" accounts for variance in scores assigned in automated essay scoring protocols. Specifically, it was hypothesised that certain writing genres would emphasise content more than others. Data were drawn from 1668 essays calibrated at two grade levels (6 and 8) using "e-rater[TM]", an…
Descriptors: Predictor Variables, Test Scoring Machines, Essays, Grade 8
Shermis, Mark D.; Barrera, Felicia D. – 2002
This paper describes ongoing work in automated essay scoring that will extend the applicability of models that are currently used for short-essay documents (i.e., less than 500 words). Sponsored by the Fund for the Improvement of Postsecondary Education (FIPSE), the project would create norms for documents that might normally be found in an…
Descriptors: Computer Software, Essays, Portfolios (Background Materials), Scoring
Shermis, Mark D.; DiVesta, Francis J. – Rowman & Littlefield Publishers, Inc., 2011
"Classroom Assessment in Action" clarifies the multi-faceted roles of measurement and assessment and their applications in a classroom setting. Comprehensive in scope, Shermis and Di Vesta explain basic measurement concepts and show students how to interpret the results of standardized tests. From these basic concepts, the authors then…
Descriptors: Student Evaluation, Standardized Tests, Scores, Measurement
Peer reviewed
Shermis, Mark D.; Koch, Chantal Mees; Page, Ellis B.; Keith, Timothy Z.; Harrington, Susanmarie – Educational and Psychological Measurement, 2002
Studied the use of an automated grader to score essays holistically and by rating traits through two experiments that evaluated 807 Web-based essays and then compared 386 essays to evaluations by 6 human raters. Results show the essay grading software to be efficient and able to grade about six documents a second. (SLD)
Descriptors: Automation, College Students, Computer Software, Essays
Shermis, Mark D.; Raymat, Marylou Vallina; Barrera, Felicia – 2003
This paper provides an overview of some recent work in automated essay scoring that focuses on writing improvement at the postsecondary level. The paper illustrates the Vantage IntelliMetric[TM] automated essay scorer that is being used as part of a Fund for the Improvement of Postsecondary Education (FIPSE) project that uses technology to grade…
Descriptors: College Students, Essays, Higher Education, Portfolio Assessment
Shermis, Mark D.; Koch, Chantal Mees; Page, Ellis B.; Keith, Timothy Z.; Harrington, Susanmarie – 1999
This study used Project Essay Grade (PEG) to evaluate essays both holistically and with the rating of traits (content, organization, style, mechanics, and creativity) for Web-based student essays that serve as placement tests at a large Midwestern university. In addition, the use of a TopicScore, or measure of topic content for each assignment,…
Descriptors: Automation, College Students, Construct Validity, Essays