Publication Date
In 2025 | 0
Since 2024 | 0
Since 2021 (last 5 years) | 0
Since 2016 (last 10 years) | 0
Since 2006 (last 20 years) | 2
Descriptor
Correlation | 2
Essays | 2
Interrater Reliability | 2
Program Validation | 2
Scoring | 2
Test Scoring Machines | 2
Weighted Scores | 2
Academic Standards | 1
Achievement Rating | 1
Automation | 1
College Entrance Examinations | 1
Author
Bridgeman, Brent | 1
Davey, Tim | 1
Duchnowski, Matthew P. | 1
Escoffery, David S. | 1
Powers, Donald E. | 1
Ramineni, Chaitanya | 1
Trapani, Catherine S. | 1
Williamson, David M. | 1
Publication Type
Journal Articles | 2
Reports - Research | 2
Education Level
Higher Education | 2
Postsecondary Education | 2
Assessments and Surveys
Graduate Record Examinations | 1
Powers, Donald E.; Escoffery, David S.; Duchnowski, Matthew P. – Applied Measurement in Education, 2015
By far, the most frequently used method of validating (the interpretation and use of) automated essay scores has been to compare them with scores awarded by human raters. Although this practice is questionable, human-machine agreement is still often regarded as the "gold standard." Our objective was to refine this model and apply it to…
Descriptors: Essays, Test Scoring Machines, Program Validation, Criterion Referenced Tests
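The Powers, Escoffery, and Duchnowski abstract frames human-machine score agreement as the usual, if imperfect, validation evidence for automated essay scoring. As a minimal sketch of what that comparison typically involves, the following computes quadratic weighted kappa between human and machine scores; the 1-6 score scale, the function name, and the toy data are illustrative assumptions, not details taken from the article.

```python
# Minimal sketch (assumptions, not the article's method): quadratic weighted
# kappa as a human-machine agreement statistic for essay scores.
import numpy as np

def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Agreement between two ordinal score vectors, penalizing larger gaps more."""
    human = np.asarray(human, dtype=int)
    machine = np.asarray(machine, dtype=int)
    k = max_score - min_score + 1

    # Observed human-by-machine contingency table.
    observed = np.zeros((k, k))
    for h, m in zip(human, machine):
        observed[h - min_score, m - min_score] += 1

    # Expected table if human and machine scores were assigned independently.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / len(human)

    # Quadratic disagreement weights: 0 on the diagonal, 1 at opposite extremes.
    idx = np.arange(k)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (k - 1) ** 2

    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

if __name__ == "__main__":
    human_scores = [4, 3, 5, 2, 4, 6, 3, 4]    # hypothetical human ratings, 1-6 scale
    machine_scores = [4, 3, 4, 2, 5, 6, 3, 4]  # hypothetical automated scores
    print(f"Quadratic weighted kappa: "
          f"{quadratic_weighted_kappa(human_scores, machine_scores, 1, 6):.3f}")
```

In practice such a kappa is read against the corresponding human-human agreement, which is the sense in which agreement with human raters serves as the "gold standard" the abstract questions.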
Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012
Automated scoring models for the "e-rater"® scoring engine were built and evaluated for the "GRE"® argument and issue-writing tasks. Prompt-specific, generic, and generic with prompt-specific intercept scoring models were built and evaluation statistics such as weighted kappas, Pearson correlations, standardized difference in…
Descriptors: Scoring, Test Scoring Machines, Automation, Models
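The Ramineni et al. abstract lists weighted kappas, Pearson correlations, and standardized differences as the statistics used to evaluate the e-rater scoring models. A minimal sketch of the latter two follows; the pooled-standard-deviation convention for the standardized difference is an assumption, not necessarily the definition used in the ETS report, and the toy data are hypothetical.

```python
# Minimal sketch (assumptions, not the ETS report's code): Pearson correlation
# and standardized mean difference between machine and human essay scores.
import numpy as np

def pearson_r(human, machine):
    """Linear association between human and machine score vectors."""
    return np.corrcoef(np.asarray(human, float), np.asarray(machine, float))[0, 1]

def standardized_difference(human, machine):
    """Machine-minus-human mean difference in pooled standard-deviation units."""
    human = np.asarray(human, float)
    machine = np.asarray(machine, float)
    pooled_sd = np.sqrt((human.var(ddof=1) + machine.var(ddof=1)) / 2)
    return (machine.mean() - human.mean()) / pooled_sd

if __name__ == "__main__":
    human = [4, 3, 5, 2, 4, 6, 3, 4]     # hypothetical human ratings
    machine = [4, 3, 4, 2, 5, 6, 3, 4]   # hypothetical e-rater-style scores
    print(f"r = {pearson_r(human, machine):.3f}, "
          f"standardized difference = {standardized_difference(human, machine):.3f}")
```

Together with a weighted kappa, small standardized differences and high correlations are the kind of evidence such evaluations weigh when comparing prompt-specific, generic, and generic-with-intercept scoring models.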