Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M. – ETS Research Report Series, 2015
Automated scoring models were trained and evaluated for the essay task in the "Praxis I"® writing test. Prompt-specific and generic "e-rater"® scoring models were built, and evaluation statistics, such as quadratic weighted kappa, Pearson correlation, and standardized differences in mean scores, were examined to evaluate the…
Descriptors: Writing Tests, Licensing Examinations (Professions), Teacher Competency Testing, Scoring
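The evaluation statistics named in this abstract (quadratic weighted kappa, Pearson correlation, and standardized differences in mean scores) are the standard agreement measures for comparing automated and human essay scores. A minimal sketch in Python, using hypothetical ratings and an assumed score scale of 1 to 6 rather than values from the Praxis study:

```python
# Sketch of the three evaluation statistics named in the abstract above,
# applied to hypothetical human vs. automated essay scores. The score range
# and sample data are illustrative assumptions, not Praxis I results.
import numpy as np

def quadratic_weighted_kappa(human, machine, min_score=1, max_score=6):
    """Agreement beyond chance, penalizing large disagreements quadratically."""
    human = np.asarray(human) - min_score
    machine = np.asarray(machine) - min_score
    n = max_score - min_score + 1
    observed = np.zeros((n, n))
    for h, m in zip(human, machine):
        observed[h, m] += 1
    observed /= observed.sum()
    expected = np.outer(np.bincount(human, minlength=n),
                        np.bincount(machine, minlength=n)).astype(float)
    expected /= expected.sum()
    i, j = np.indices((n, n))
    weights = (i - j) ** 2 / (n - 1) ** 2
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

def standardized_mean_difference(human, machine):
    """Difference in mean scores, in pooled standard-deviation units."""
    human, machine = np.asarray(human, float), np.asarray(machine, float)
    pooled_sd = np.sqrt((human.var(ddof=1) + machine.var(ddof=1)) / 2)
    return (machine.mean() - human.mean()) / pooled_sd

human = [4, 3, 5, 2, 4, 4, 3, 5, 1, 4]        # hypothetical human ratings
machine = [4, 3, 4, 2, 5, 4, 3, 5, 2, 4]      # hypothetical engine scores
print("QWK:", quadratic_weighted_kappa(human, machine))
print("Pearson r:", np.corrcoef(human, machine)[0, 1])
print("Std. mean diff.:", standardized_mean_difference(human, machine))
```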
Zhang, Mo; Williamson, David M.; Breyer, F. Jay; Trapani, Catherine – International Journal of Testing, 2012
This article describes two separate, related studies that provide insight into the effectiveness of "e-rater" score calibration methods based on different distributional targets. In the first study, we developed and evaluated a new type of "e-rater" scoring model that was cost-effective and applicable under conditions of absent human rating and…
Descriptors: Automation, Scoring, Models, Essay Tests
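Calibration of engine scores toward a distributional target, as compared in these studies, can be illustrated with the simplest case: linear (mean-and-sigma) rescaling. The abstract does not specify which calibration methods were evaluated, so the sketch below is an assumption for illustration, not the e-rater procedure itself:

```python
# Sketch of one common calibration approach: linearly rescale raw engine
# scores so their distribution matches a target mean and SD (for example,
# the operational human-score distribution). All values are hypothetical.
import numpy as np

def calibrate_to_target(raw, target_mean, target_sd):
    """Linearly rescale raw scores to a target mean and standard deviation."""
    raw = np.asarray(raw, dtype=float)
    z = (raw - raw.mean()) / raw.std(ddof=1)   # standardize raw scores
    return z * target_sd + target_mean          # map onto the target scale

raw_engine_scores = [2.1, 3.4, 4.0, 2.8, 3.9, 4.6]   # hypothetical raw scores
calibrated = calibrate_to_target(raw_engine_scores, target_mean=3.5, target_sd=0.9)
print(np.round(calibrated, 2))
print(calibrated.mean(), calibrated.std(ddof=1))     # matches 3.5 and 0.9
```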
Williamson, David M.; Xi, Xiaoming; Breyer, F. Jay – Educational Measurement: Issues and Practice, 2012
A framework for evaluation and use of automated scoring of constructed-response tasks is provided that entails both evaluation of automated scoring and guidelines for implementation and maintenance in the context of constantly evolving technologies. Consideration of validity issues and challenges associated with automated scoring is…
Descriptors: Automation, Scoring, Evaluation, Guidelines
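A framework of this kind is typically operationalized as threshold checks on human-machine agreement statistics. The sketch below uses threshold values commonly cited in connection with this article (agreement of at least .70, degradation from human-human agreement of at most .10, standardized mean difference of at most .15); verify them against the article itself before relying on them:

```python
# Sketch of a threshold-based flagging check for one prompt's scoring model.
# The threshold values are assumptions based on figures commonly cited in
# connection with this framework, not a verbatim restatement of the article.
from dataclasses import dataclass

@dataclass
class AgreementStats:
    qwk_hm: float         # human-machine quadratic weighted kappa
    qwk_hh: float         # human-human quadratic weighted kappa (baseline)
    pearson_hm: float     # human-machine Pearson correlation
    std_mean_diff: float  # standardized difference in mean scores

def flag_prompt(stats: AgreementStats) -> list[str]:
    """Return a list of threshold violations for one prompt's scoring model."""
    flags = []
    if stats.qwk_hm < 0.70:
        flags.append("human-machine QWK below 0.70")
    if stats.pearson_hm < 0.70:
        flags.append("human-machine correlation below 0.70")
    if stats.qwk_hh - stats.qwk_hm > 0.10:
        flags.append("degradation from human-human agreement exceeds 0.10")
    if abs(stats.std_mean_diff) > 0.15:
        flags.append("standardized mean difference exceeds 0.15")
    return flags

print(flag_prompt(AgreementStats(qwk_hm=0.66, qwk_hh=0.80,
                                 pearson_hm=0.72, std_mean_diff=0.05)))
```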
Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012
Automated scoring models for the "e-rater"® scoring engine were built and evaluated for the "GRE"® argument and issue-writing tasks. Prompt-specific, generic, and generic with prompt-specific intercept scoring models were built, and evaluation statistics such as weighted kappas, Pearson correlations, standardized differences in…
Descriptors: Scoring, Test Scoring Machines, Automation, Models
Williamson, David M.; Bejar, Isaac I.; Sax, Anne – ETS Research Report Series, 2004
As automated scoring of complex constructed-response examinations reaches operational status, the process of evaluating the quality of resultant scores, particularly in contrast to scores of expert human graders, becomes as complex as the data itself. Using a vignette from the Architectural Registration Examination (ARE), this paper explores the…
Descriptors: Automation, Scoring, Tests, Classification
Williamson, David M.; Hone, Anne S.; Miller, Susan; Bejar, Isaac I. – 1998
As the automated scoring of constructed responses reaches operational status, the issue of monitoring the scoring process becomes a primary concern, particularly when the goal is to have automated scoring operate completely unassisted by humans. Using a vignette from the Architectural Registration Examination and data for 326 cases with both human…
Descriptors: Architects, Automation, Classification, Constructed Response
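Monitoring unassisted automated scoring typically begins with simple agreement rates between the engine and a periodic human audit sample. A minimal sketch with hypothetical data; the paper's own monitoring procedures are more elaborate:

```python
# Sketch of a basic monitoring statistic for unassisted automated scoring:
# exact and within-one-point (adjacent) agreement between engine scores and
# a human audit sample. All data below are hypothetical.
import numpy as np

def agreement_rates(human, machine):
    """Exact and adjacent (within one point) agreement proportions."""
    diff = np.abs(np.asarray(human) - np.asarray(machine))
    return (diff == 0).mean(), (diff <= 1).mean()

human_audit = [3, 4, 4, 2, 5, 3, 4, 1, 3, 4]     # hypothetical audit ratings
machine_scores = [3, 4, 5, 2, 4, 3, 4, 2, 3, 5]  # hypothetical engine scores
exact, adjacent = agreement_rates(human_audit, machine_scores)
print(f"exact: {exact:.2f}, adjacent: {adjacent:.2f}")
```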
Williamson, David M.; Bejar, Isaac I.; Sax, Anne – Applied Measurement in Education, 2004
As automated scoring of complex constructed-response examinations reaches operational status, the process of evaluating the quality of resultant scores, particularly in contrast to scores of expert human graders, becomes as complex as the data itself. Using a vignette from the Architectural Registration Examination (ARE), this article explores the…
Descriptors: Validity, Scoring, Scores, Evaluation Methods