Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 5 |
Since 2006 (last 20 years) | 14 |
Descriptor
Essay Tests | 17 |
College Entrance Examinations | 14 |
Scoring | 12 |
Automation | 6 |
Graduate Study | 6 |
Computer Assisted Testing | 5 |
Writing Tests | 5 |
Correlation | 4 |
Interrater Reliability | 4 |
Standardized Tests | 4 |
Writing Skills | 4 |
Author
Attali, Yigal | 3 |
Arslan, Burcu | 2 |
Bridgeman, Brent | 2 |
Finn, Bridgid | 2 |
Fowles, Mary E. | 2 |
Powers, Donald E. | 2 |
Ramineni, Chaitanya | 2 |
Almond, Russell G. | 1 |
Beigman Klebanov, Beata | 1 |
Breyer, F. Jay | 1 |
Brown, Kevin | 1 |
Publication Type
Journal Articles | 14 |
Reports - Research | 11 |
Reports - Evaluative | 5 |
Reports - General | 1 |
Speeches/Meeting Papers | 1 |
Education Level
Higher Education | 12 |
Postsecondary Education | 10 |
Elementary Secondary Education | 1 |
Audience
Researchers | 1 |
Assessments and Surveys
Graduate Record Examinations | 17 |
Test of English as a Foreign Language | 6 |
College Level Examination Program | 1 |
Praxis Series | 1 |
SAT (College Admission Test) | 1 |
Finn, Bridgid; Arslan, Burcu; Walsh, Matthew – Applied Measurement in Education, 2020
To score an essay response, raters draw on previously trained skills and knowledge about the underlying rubric and scoring criteria. Cognitive processes such as remembering, forgetting, and skill decay likely influence rater performance. To investigate how forgetting influences scoring, we evaluated raters' scoring accuracy on TOEFL and GRE essays.…
Descriptors: Epistemology, Essay Tests, Evaluators, Cognitive Processes
Finn, Bridgid; Wendler, Cathy; Ricker-Pedley, Kathryn L.; Arslan, Burcu – ETS Research Report Series, 2018
This report investigates whether the time between scoring sessions influences operational and nonoperational scoring accuracy. The study evaluates raters' scoring accuracy on constructed-response essays from the "GRE"® General Test. Binomial linear mixed-effect models are presented that evaluate how the effect of various…
Descriptors: Intervals, Scoring, Accuracy, Essay Tests
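The report above treats each scored essay as a binomial outcome (accurate or not). As a rough illustration only, not the authors' model, the sketch below fits a plain logistic regression of accuracy on the interval since a rater's last scoring session; the rater-level random effects of the report's mixed-effect models are omitted, and all column names and data are invented.

```python
# Simplified stand-in for the report's binomial mixed-effect models:
# plain logistic regression, no rater random effects. All data invented.
import pandas as pd
import statsmodels.formula.api as smf

ratings = pd.DataFrame({
    # 1 = rater's score matched the criterion score, 0 = it did not
    "accurate": [1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1],
    # days elapsed since the rater's previous scoring session
    "days_since_last_session": [1, 2, 14, 1, 30, 3, 2, 21, 5, 45, 1, 7],
})

model = smf.logit("accurate ~ days_since_last_session", data=ratings).fit()
print(model.summary())  # a negative slope would suggest accuracy decays
```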
Breyer, F. Jay; Rupp, André A.; Bridgeman, Brent – ETS Research Report Series, 2017
In this research report, we present an empirical argument for the use of a contributory scoring approach for the 2-essay writing assessment of the analytical writing section of the "GRE"® test in which human and machine scores are combined for score creation at the task and section levels. The approach was designed to replace a currently…
Descriptors: College Entrance Examinations, Scoring, Essay Tests, Writing Evaluation
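As a purely illustrative sketch of a contributory approach (the report's actual weights, rounding rules, and scales are not reproduced here), human and machine scores might be blended per essay task and then aggregated to a section score:

```python
# Hypothetical contributory scoring rule: human and machine scores both
# contribute to each task score, and the two task scores are averaged
# into a section score. The 50/50 weights and half-point rounding are
# illustrative only, not the values used operationally for the GRE.

def task_score(human: float, machine: float, machine_weight: float = 0.5) -> float:
    """Blend one human rating with one machine rating for a single essay."""
    return (1 - machine_weight) * human + machine_weight * machine

def section_score(task_scores: list[float]) -> float:
    """Average the essay-task scores and round to the nearest half point,
    mimicking the GRE Analytical Writing 0-6 half-point scale."""
    mean = sum(task_scores) / len(task_scores)
    return round(mean * 2) / 2

essay1 = task_score(human=4.0, machine=4.4)   # -> 4.2
essay2 = task_score(human=5.0, machine=4.6)   # -> 4.8
print(section_score([essay1, essay2]))        # -> 4.5
```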
Almond, Russell G. – International Journal of Testing, 2014
Assessments consisting of only a few extended constructed-response items (essays) are not typically equated using anchor test designs, because each form contains too few essay prompts to allow for meaningful equating. This article explores the idea that output from an automated scoring program designed to measure writing fluency (a common…
Descriptors: Automation, Equated Scores, Writing Tests, Essay Tests
Ramineni, Chaitanya; Williamson, David – ETS Research Report Series, 2018
Notable mean score differences for the "e-rater"® automated scoring engine and for humans for essays from certain demographic groups were observed for the "GRE"® General Test in use before the major revision of 2012, called rGRE. The use of e-rater as a check-score model with discrepancy thresholds prevented an adverse impact…
Descriptors: Scores, Computer Assisted Testing, Test Scoring Machines, Automation
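For context, a check-score model in this sense uses the machine score only to audit the human score rather than to contribute to it. The sketch below shows one plausible routing rule; the threshold value is hypothetical, not the one used operationally by ETS.

```python
# Hypothetical check-score rule: the machine score is not reported, but
# is compared against the human score; when the two disagree by more
# than a discrepancy threshold, the essay is routed to a second human.

DISCREPANCY_THRESHOLD = 1.0  # illustrative value only

def needs_second_human(human_score: float, machine_check_score: float) -> bool:
    """Flag an essay for adjudication when human and machine disagree."""
    return abs(human_score - machine_check_score) > DISCREPANCY_THRESHOLD

print(needs_second_human(4.0, 4.5))  # False: scores agree closely
print(needs_second_human(2.0, 4.0))  # True: routed to a second rater
```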
Beigman Klebanov, Beata; Ramineni, Chaitanya; Kaufer, David; Yeoh, Paul; Ishizaki, Suguru – Language Testing, 2019
Essay writing is a common type of constructed-response task used frequently in standardized writing assessments. However, the impromptu timed nature of the essay writing tests has drawn increasing criticism for the lack of authenticity for real-world writing in classroom and workplace settings. The goal of this paper is to contribute evidence to a…
Descriptors: Test Validity, Writing Tests, Writing Skills, Persuasive Discourse
Zhang, Mo – ETS Research Report Series, 2013
Many testing programs use automated scoring to grade essays. One issue in automated essay scoring that has not been examined adequately is population invariance and its causes. The primary purpose of this study was to investigate the impact of sampling in model calibration on population invariance of automated scores. This study analyzed scores…
Descriptors: Automation, Scoring, Essay Tests, Sampling
Attali, Yigal; Lewis, Will; Steier, Michael – Language Testing, 2013
Automated essay scoring can produce reliable scores that are highly correlated with human scores, but is limited in its evaluation of content and other higher-order aspects of writing. The increased use of automated essay scoring in high-stakes testing underscores the need for human scoring that is focused on higher-order aspects of writing. This…
Descriptors: Scoring, Essay Tests, Reliability, High Stakes Tests
Bridgeman, Brent; Trapani, Catherine; Attali, Yigal – Applied Measurement in Education, 2012
Essay scores generated by machine and by human raters are generally comparable; that is, they can produce scores with similar means and standard deviations, and machine scores generally correlate as highly with human scores as scores from one human correlate with scores from another human. Although human and machine essay scores are highly related…
Descriptors: Scoring, Essay Tests, College Entrance Examinations, High Stakes Tests
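The comparability claims above reduce to a few descriptive statistics. A minimal sketch, with invented scores, of how one might compare means, standard deviations, and the human-human versus human-machine correlations:

```python
# Comparability check for machine vs. human essay scores, on toy data:
# similar means/SDs, and a machine-human correlation at least as high
# as the correlation between two human raters.
import numpy as np

human_1 = np.array([3, 4, 4, 5, 2, 3, 5, 4, 3, 4])
human_2 = np.array([3, 4, 5, 5, 2, 4, 5, 4, 3, 3])
machine = np.array([3.2, 4.1, 4.4, 4.9, 2.3, 3.5, 4.8, 4.2, 3.1, 3.6])

print("means:", human_1.mean(), machine.mean())
print("SDs:  ", human_1.std(ddof=1), machine.std(ddof=1))
print("human-human r:  ", np.corrcoef(human_1, human_2)[0, 1])
print("human-machine r:", np.corrcoef(human_1, machine)[0, 1])
```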
Brown, Kevin – CEA Forum, 2015
In this article, the author describes his project to take every standardized exam that English majors take. During the summer and fall semesters of 2012, the author signed up for and took the GRE General Test, the Praxis Content Area Exam (English Language, Literature, and Composition: Content Knowledge), the Senior Major Field Tests in…
Descriptors: College Faculty, College English, Test Preparation, Standardized Tests
Attali, Yigal – Educational Testing Service, 2011
This paper proposes an alternative content measure for essay scoring, based on the "difference" in the relative frequency of a word in high-scored versus low-scored essays. The "differential word use" (DWU) measure is the average of these differences across all words in the essay. A positive value indicates the essay is using…
Descriptors: Scoring, Essay Tests, Word Frequency, Content Analysis
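The DWU computation described above is concrete enough to sketch. Assuming whitespace tokenization and no smoothing (both choices are illustrative, not necessarily Attali's exact procedure), it might look like this:

```python
# Differential word use (DWU) sketch: each word is weighted by its
# relative frequency in high-scored essays minus its relative frequency
# in low-scored essays; an essay's DWU is the mean weight of its words.
from collections import Counter

def relative_frequencies(essays: list[str]) -> Counter:
    """Word counts normalized by the total number of tokens in the corpus."""
    tokens = [w for essay in essays for w in essay.lower().split()]
    counts = Counter(tokens)
    total = sum(counts.values())
    return Counter({w: c / total for w, c in counts.items()})

def dwu(essay: str, high_freq: Counter, low_freq: Counter) -> float:
    """Average difference in relative frequency across the essay's words."""
    words = essay.lower().split()
    return sum(high_freq[w] - low_freq[w] for w in words) / len(words)

high = ["the evidence clearly supports the claim", "a rigorous argument follows"]
low = ["i think it is good", "it is it is good"]
high_freq, low_freq = relative_frequencies(high), relative_frequencies(low)
print(dwu("the argument is rigorous", high_freq, low_freq))  # 0.025
```

A positive DWU indicates the essay's vocabulary resembles the high-scored corpus more than the low-scored one, consistent with the interpretation in the abstract.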
Quinlan, Thomas; Higgins, Derrick; Wolff, Susanne – Educational Testing Service, 2009
This report evaluates the construct coverage of the "e-rater"® scoring engine. The matter of construct coverage depends on whether one defines writing skill in terms of process or product. Originally, the e-rater engine consisted of a large set of components with a proven ability to predict human holistic scores. By organizing these capabilities…
Descriptors: Guides, Writing Skills, Factor Analysis, Writing Tests
Glass, Laura A.; Clause, Christopher B.; Kreiner, David S. – College Student Journal, 2007
We examined test-expectancy as it applies to fill-in-the-blank tests. We randomly assigned 60 college students to take a fill-in-the-blank vocabulary test in one of three conditions. Two groups took the test with a word bank available; we told one group but not the other that they would have a word bank. The third group took the test with no word…
Descriptors: Student Empowerment, College Students, Tests, Expectation
Hardison, Chaitra M.; Sackett, Paul R. – Applied Measurement in Education, 2008
Despite the growing use of writing assessments in standardized tests, little is known about coaching effects on writing assessments. Therefore, this study tested the effects of short-term coaching on standardized writing tests, and the transfer of those effects to other writing genres. College freshmen were randomly assigned to either training…
Descriptors: Control Groups, Group Membership, College Freshmen, Writing Tests

Powers, Donald E.; Fowles, Mary E. – Educational Assessment, 1999
Gathered judgments about essay prompts being considered for use in a graduate writing test from 253 minority group students and 268 other college students who took the Graduate Record Examination. Identified several features that underlie examinee perceptions of essay prompts, especially the extent to which prompts allow examinees to draw on their…
Descriptors: College Entrance Examinations, College Students, Essay Tests, Experience