Showing all 12 results
Peer reviewed
Direct link
Almond, Russell G. – International Journal of Testing, 2014
Assessments consisting of only a few extended constructed-response items (essays) are not typically equated using anchor test designs, as each form contains too few essay prompts to allow for meaningful equating. This article explores the idea that output from an automated scoring program designed to measure writing fluency (a common…
Descriptors: Automation, Equated Scores, Writing Tests, Essay Tests
Peer reviewed
PDF on ERIC: Download full text
Ramineni, Chaitanya; Williamson, David – ETS Research Report Series, 2018
Notable mean score differences between the "e-rater"® automated scoring engine and human raters were observed for essays from certain demographic groups on the "GRE"® General Test in use before the major revision of 2012 that produced the rGRE. The use of e-rater as a check-score model with discrepancy thresholds prevented an adverse impact…
Descriptors: Scores, Computer Assisted Testing, Test Scoring Machines, Automation
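The check-score arrangement sketched in the abstract above can be illustrated in a few lines. This is a minimal sketch of the general idea, assuming a simple absolute-difference rule; the threshold value, the resolution rule, and all names below are illustrative and not taken from the report.

```python
from typing import Optional

# Illustrative sketch of a check-score workflow: the automated score is used only
# to flag human ratings that may need review, not to produce the reported score.
# The 1.0-point threshold and the averaging rule are assumptions for this example.

def needs_adjudication(human: float, engine: float, threshold: float = 1.0) -> bool:
    """Flag an essay for a second human rating when the human and automated
    scores disagree by more than the discrepancy threshold."""
    return abs(human - engine) > threshold

def reported_score(human: float, engine: float,
                   second_human: Optional[float] = None,
                   threshold: float = 1.0) -> float:
    """Return the first human score unless adjudication was triggered, in which
    case average the two human ratings (one possible resolution rule)."""
    if needs_adjudication(human, engine, threshold) and second_human is not None:
        return (human + second_human) / 2
    return human

# Example: human gives 3.0, engine gives 4.5 -> discrepancy exceeds the threshold.
print(needs_adjudication(3.0, 4.5))                 # True
print(reported_score(3.0, 4.5, second_human=4.0))   # 3.5
```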
Peer reviewed
PDF on ERIC: Download full text
Zhang, Mo – ETS Research Report Series, 2013
Many testing programs use automated scoring to grade essays. One issue in automated essay scoring that has not been examined adequately is population invariance and its causes. The primary purpose of this study was to investigate the impact of sampling in model calibration on population invariance of automated scores. This study analyzed scores…
Descriptors: Automation, Scoring, Essay Tests, Sampling
Peer reviewed
Direct link
Attali, Yigal; Lewis, Will; Steier, Michael – Language Testing, 2013
Automated essay scoring can produce reliable scores that are highly correlated with human scores, but is limited in its evaluation of content and other higher-order aspects of writing. The increased use of automated essay scoring in high-stakes testing underscores the need for human scoring that is focused on higher-order aspects of writing. This…
Descriptors: Scoring, Essay Tests, Reliability, High Stakes Tests
Attali, Yigal – Educational Testing Service, 2011
This paper proposes an alternative content measure for essay scoring, based on the "difference" in the relative frequency of a word in high-scored versus low-scored essays. The "differential word use" (DWU) measure is the average of these differences across all words in the essay. A positive value indicates the essay is using…
Descriptors: Scoring, Essay Tests, Word Frequency, Content Analysis
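As a rough illustration of the measure described in this abstract, the sketch below computes a DWU-style value as the average, over the words of an essay, of the difference between each word's relative frequency in high-scored essays and in low-scored essays. The tokenization and the toy data are assumptions for the example, not details from the paper.

```python
from collections import Counter

def relative_freqs(essays: list[list[str]]) -> Counter:
    """Relative frequency of each word across a set of tokenized essays."""
    counts = Counter(word for essay in essays for word in essay)
    total = sum(counts.values())
    # Counter returns 0 for unseen words, which is convenient below.
    return Counter({word: n / total for word, n in counts.items()})

def differential_word_use(essay: list[str],
                          high_scored: list[list[str]],
                          low_scored: list[list[str]]) -> float:
    """Average, over the words of `essay`, of the difference between each word's
    relative frequency in high-scored versus low-scored essays."""
    high = relative_freqs(high_scored)
    low = relative_freqs(low_scored)
    diffs = [high[word] - low[word] for word in essay]
    return sum(diffs) / len(diffs) if diffs else 0.0

# Toy example with hypothetical tokenized essays.
high = [["the", "evidence", "suggests"], ["the", "evidence", "supports", "this"]]
low = [["stuff", "is", "good"], ["good", "stuff"]]
print(differential_word_use(["evidence", "suggests", "stuff"], high, low))
```

By construction, the value in this sketch is positive when the essay's words are, on average, relatively more frequent in the high-scored set than in the low-scored set.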
Peer reviewed
PDF on ERIC: Download full text
Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012
Automated scoring models for the "e-rater"® scoring engine were built and evaluated for the "GRE"® argument and issue-writing tasks. Prompt-specific, generic, and generic with prompt-specific intercept scoring models were built, and evaluation statistics such as weighted kappas, Pearson correlations, standardized difference in…
Descriptors: Scoring, Test Scoring Machines, Automation, Models
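The evaluation statistics named in this abstract (weighted kappa, Pearson correlation, standardized difference) are standard human-machine agreement measures. The sketch below shows common formulations; the quadratic weighting for kappa and the pooled-standard-deviation denominator for the standardized difference are conventions assumed here, not necessarily the exact choices made in the report.

```python
import numpy as np

def pearson_r(human: np.ndarray, machine: np.ndarray) -> float:
    """Pearson correlation between human and automated scores."""
    return float(np.corrcoef(human, machine)[0, 1])

def standardized_difference(human: np.ndarray, machine: np.ndarray) -> float:
    """Standardized mean difference (machine minus human), using a pooled
    standard deviation in the denominator as one common convention."""
    pooled_sd = np.sqrt((human.var(ddof=1) + machine.var(ddof=1)) / 2)
    return float((machine.mean() - human.mean()) / pooled_sd)

def quadratic_weighted_kappa(human: np.ndarray, machine: np.ndarray, n_levels: int) -> float:
    """Quadratic-weighted kappa for integer scores in the range 0..n_levels-1."""
    observed = np.zeros((n_levels, n_levels))
    for h, m in zip(human, machine):
        observed[h, m] += 1
    observed /= observed.sum()                                        # joint proportions
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))   # chance agreement
    i, j = np.indices((n_levels, n_levels))
    weights = (i - j) ** 2 / (n_levels - 1) ** 2                      # quadratic disagreement weights
    return float(1 - (weights * observed).sum() / (weights * expected).sum())

# Hypothetical 0-5 essay scores from a human rater and a scoring engine.
h = np.array([3, 4, 2, 5, 3, 4])
m = np.array([3, 4, 3, 5, 2, 4])
print(pearson_r(h, m), standardized_difference(h, m), quadratic_weighted_kappa(h, m, 6))
```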
Peer reviewed
Direct link
Sinharay, Sandip; Johnson, Matthew S. – International Journal of Testing, 2008
"Item models" (LaDuca, Staples, Templeton, & Holzman, 1986) are classes from which it is possible to generate items that are equivalent/isomorphic to other items from the same model (e.g., Bejar, 1996, 2002). They have the potential to produce large numbers of high-quality items at reduced cost. This article introduces data from an…
Descriptors: College Entrance Examinations, Case Studies, Test Items, Models
Quinlan, Thomas; Higgins, Derrick; Wolff, Susanne – Educational Testing Service, 2009
This report evaluates the construct coverage of the e-rater[R] scoring engine. The matter of construct coverage depends on whether one defines writing skill in terms of process or product. Originally, the e-rater engine consisted of a large set of components with a proven ability to predict human holistic scores. By organizing these capabilities…
Descriptors: Guides, Writing Skills, Factor Analysis, Writing Tests
Peer reviewed
PDF on ERIC: Download full text
Attali, Yigal; Powers, Don; Freedman, Marshall; Harrison, Marissa; Obetz, Susan – ETS Research Report Series, 2008
This report describes the development, administration, and scoring of open-ended variants of GRE® Subject Test items in biology and psychology. These questions were administered in a Web-based experiment to registered examinees of the respective Subject Tests. The questions required a short answer of 1-3 sentences, and responses were automatically…
Descriptors: College Entrance Examinations, Graduate Study, Scoring, Test Construction
Peer reviewed
PDF on ERIC: Download full text
Sheehan, Kathleen M.; Kostin, Irene; Futagi, Yoko; Hemat, Ramin; Zuckerman, Daniel – ETS Research Report Series, 2006
This paper describes the development, implementation, and evaluation of an automated system for predicting the acceptability status of candidate reading-comprehension stimuli extracted from a database of journal and magazine articles. The system uses a combination of classification and regression techniques to predict the probability that a given…
Descriptors: Automation, Prediction, Reading Comprehension, Classification
Sebrechts, Marc M.; And Others – 1991
This study evaluated agreement between expert system and human scores on 12 algebra word problems taken by Graduate Record Examinations (GRE) General Test examinees, with a general sample of 285 and a study sample of 30. Problems were drawn from three content classes (rate x time, work, and interest) and presented in four constructed-response…
Descriptors: Algebra, Automation, College Students, Computer Assisted Testing
Bennett, Randy Elliot; Sebrechts, Marc M. – 1994
This study evaluated expert system diagnoses of examinees' solutions to complex constructed-response algebra word problems. Problems were presented to three samples (30 college students each), each of which had taken the Graduate Record Examinations General Test. One sample took the problems in paper-and-pencil form and the other two on computer.…
Descriptors: Algebra, Automation, Classification, College Entrance Examinations