Showing all 7 results
Peer reviewed
Raczynski, Kevin; Cohen, Allan – Applied Measurement in Education, 2018
The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are…
Descriptors: Essay Tests, Test Scoring Machines, Test Validity, Evaluators
Peer reviewed
Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, coefficient alpha), test-retest, alternate forms, interscorer, and…
Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests
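The internal-consistency coefficients named in the abstract above can be illustrated with a minimal sketch. The function and data below are hypothetical, not from the article: it computes Cronbach's alpha (of which KR-20 is the special case for dichotomously scored items) from a small examinee-by-item score matrix.

```python
# Hypothetical sketch: Cronbach's alpha from an examinee-by-item matrix.
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
import statistics

def cronbach_alpha(scores):
    """scores: list of examinee rows, each a list of item scores."""
    k = len(scores[0])                         # number of items
    item_vars = [statistics.pvariance([row[i] for row in scores])
                 for i in range(k)]            # per-item score variance
    total_var = statistics.pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Made-up dichotomous (0/1) item scores for five examinees on four items;
# with 0/1 items this coefficient equals KR-20.
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
print(round(cronbach_alpha(data), 3))
```

Population variances are used throughout; using sample variances consistently gives a slightly different but equally conventional estimate.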
Peer reviewed
Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012
Automated scoring models for the "e-rater"® scoring engine were built and evaluated for the "GRE"® argument and issue-writing tasks. Prompt-specific, generic, and generic with prompt-specific intercept scoring models were built and evaluation statistics such as weighted kappas, Pearson correlations, standardized difference in…
Descriptors: Scoring, Test Scoring Machines, Automation, Models
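The abstract above mentions weighted kappa as an evaluation statistic for automated scoring models. The following is a hypothetical illustration of quadratic weighted kappa, a common human–machine agreement measure for essay scores; it is not the e-rater evaluation code, and the score lists are invented.

```python
# Hypothetical sketch: quadratic weighted kappa between two raters'
# integer scores on the scale [min_s, max_s].
def quadratic_weighted_kappa(a, b, min_s, max_s):
    n = max_s - min_s + 1
    # Observed agreement matrix: obs[i][j] counts (rater-a score i, rater-b score j).
    obs = [[0.0] * n for _ in range(n)]
    for x, y in zip(a, b):
        obs[x - min_s][y - min_s] += 1
    total = len(a)
    ha = [sum(row) for row in obs]                             # rater-a marginals
    hb = [sum(obs[i][j] for i in range(n)) for j in range(n)]  # rater-b marginals
    num = den = 0.0
    for i in range(n):
        for j in range(n):
            w = (i - j) ** 2 / (n - 1) ** 2    # quadratic disagreement weight
            num += w * obs[i][j] / total       # observed weighted disagreement
            den += w * ha[i] * hb[j] / total ** 2  # chance-expected disagreement
    return 1 - num / den

# Invented human and machine scores on a 1-4 scale.
human = [3, 2, 4, 1, 3]
machine = [3, 3, 4, 2, 3]
print(round(quadratic_weighted_kappa(human, machine, 1, 4), 3))
```

Perfect agreement yields 1, chance-level agreement 0; quadratic weights penalize large score discrepancies more than adjacent ones.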
Peer reviewed
Barkaoui, Khaled – Assessment in Education: Principles, Policy & Practice, 2011
This study examined the effects of marking method and rater experience on ESL (English as a Second Language) essay test scores and rater performance. Each of 31 novice and 29 experienced raters rated a sample of ESL essays both holistically and analytically. Essay scores were analysed using a multi-faceted Rasch model to compare test-takers'…
Descriptors: Writing Evaluation, Writing Tests, Essay Tests, Interrater Reliability
Peer reviewed
Hughes, Garry L.; Prien, Erich P. – Personnel Psychology, 1986
Investigated the psychometric properties of three methods of scoring a Mixed Standard Scale performance evaluation: a patterned procedure, a simple nonpatterned scoring procedure, and a procedure assigning differential weights to statements on the basis of scale values provided by subject matter experts. Found no differences in the score distribution…
Descriptors: Evaluation Methods, Interrater Reliability, Scoring, Scoring Formulas
Ben-Simon, Anat; Bennett, Randy Elliott – Journal of Technology, Learning, and Assessment, 2007
This study evaluated a "substantively driven" method for scoring NAEP writing assessments automatically. The study used variations of an existing commercial program, e-rater®, to compare the performance of three approaches to automated essay scoring: a "brute-empirical" approach in which variables are selected and weighted solely according to…
Descriptors: Writing Evaluation, Writing Tests, Scoring, Essays
Aghbar, Ali-Asghar – 1986
The effectiveness of the "read-comp" technique in assessing writing ability and the usefulness of a rubric and procedure devised for scoring read-comp samples and essays were evaluated. Subjects were 100 freshman students enrolled in general and remedial English classes in a 6-week summer session at Indiana University of Pennsylvania.…
Descriptors: College Freshmen, Essay Tests, Evaluation Methods, Grading