ERIC - Search Results

Showing 1 to 15 of 16 results

Effect of Immediate Elaborated Feedback on Rater Accuracy. Research Report. ETS RR-20-09

Peer reviewed

Attali, Yigal – ETS Research Report Series, 2020

Principles of skill acquisition dictate that raters should be provided with frequent feedback about their ratings. However, in current operational practice, raters rarely receive immediate feedback about their scores owing to the prohibitive effort required to generate such feedback. An approach for generating and administering feedback responses…

Descriptors: Feedback (Response), Evaluators, Accuracy, Scores

Does the Time between Scoring Sessions Impact Scoring Accuracy? An Evaluation of Constructed-Response Essay Responses on the "GRE"® General Test. Research Report. ETS RR-18-31

Peer reviewed

Finn, Bridgid; Wendler, Cathy; Ricker-Pedley, Kathryn L.; Arslan, Burcu – ETS Research Report Series, 2018

This report investigates whether the time between scoring sessions has an influence on operational and nonoperational scoring accuracy. The study evaluates raters' scoring accuracy on constructed-response essay responses for the "GRE"® General Test. Binomial linear mixed-effect models are presented that evaluate how the effect of various…

Descriptors: Intervals, Scoring, Accuracy, Essay Tests

Administrators' Uses of Teacher Observation Protocol in Different Rating Contexts. Research Report. ETS RR-18-18

Peer reviewed

Qi, Yi; Bell, Courtney A.; Jones, Nathan D.; Lewis, Jennifer M.; Witherspoon, Margaret W.; Redash, Amanda – ETS Research Report Series, 2018

Teacher observations are being used for high-stakes purposes in states across the country, and administrators often serve as raters in teacher evaluation systems. This paper examines how the cognitive aspects of administrators' use of an observation instrument, a modified version of Charlotte Danielson's Framework for Teaching, interact with the…

Descriptors: Teacher Evaluation, Classroom Observation Techniques, Observation, Evaluation Methods

Automated Essay Scoring at Scale: A Case Study in Switzerland and Germany. TOEFL® Research Report. RR-86. ETS RR-19-12

Peer reviewed

Rupp, André A.; Casabianca, Jodi M.; Krüger, Maleika; Keller, Stefan; Köller, Olaf – ETS Research Report Series, 2019

In this research report, we describe the design and empirical findings for a large-scale study of essay writing ability with approximately 2,500 high school students in Germany and Switzerland on the basis of 2 tasks with 2 associated prompts, each from a standardized writing assessment whose scoring involved both human and automated components.…

Descriptors: Automation, Foreign Countries, English (Second Language), Language Tests

A Prototype Public Speaking Skills Assessment: An Evaluation of Human-Scoring Quality. Research Report. ETS RR-15-36

Peer reviewed

Joe, Jilliam; Kitchen, Christopher; Chen, Lei; Feng, Gary – ETS Research Report Series, 2015

The purpose of this paper is to summarize the evaluation of human-scoring quality for an assessment of public speaking skills. Videotaped performances given by 17 speakers on 4 tasks were scored by expert and nonexpert raters who had extensive experience scoring performance-based and constructed-response assessments. The Public Speaking Competence…

Descriptors: Public Speaking, Communication Skills, Scoring, Scoring Rubrics

Automated Trait Scores for "TOEFL"® Writing Tasks. Research Report. ETS RR-15-14

Peer reviewed

Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015

The "e-rater"® automated essay scoring system is used operationally in the scoring of "TOEFL iBT"® independent and integrated tasks. In this study we explored the psychometric added value of reporting four trait scores for each of these two tasks, beyond the total e-rater score. The four trait scores are word choice, grammatical…

Descriptors: Writing Tests, Scores, Language Tests, English (Second Language)

TOEFL11: A Corpus of Non-Native English. Research Report. ETS RR-13-24

Peer reviewed

Blanchard, Daniel; Tetreault, Joel; Higgins, Derrick; Cahill, Aoife; Chodorow, Martin – ETS Research Report Series, 2013

This report presents work on the development of a new corpus of non-native English writing. It will be useful for the task of native language identification, as well as grammatical error detection and correction, and automatic essay scoring. In this report, the corpus is described in detail.

Descriptors: Language Tests, Second Language Learning, English (Second Language), Writing Tests

Evaluation of the "e-rater"® Scoring Engine for the "TOEFL"® Independent and Integrated Prompts. Research Report. ETS RR-12-06

Peer reviewed

Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012

Scoring models for the "e-rater"® system were built and evaluated for the "TOEFL"® exam's independent and integrated writing prompts. Prompt-specific and generic scoring models were built, and evaluation statistics, such as weighted kappas, Pearson correlations, standardized differences in mean scores, and correlations with…

Descriptors: Scoring, Prompting, Evaluators, Computer Software

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

The Relationship between Raters' Prior Language Study and the Evaluation of Foreign Language Speech Samples. TOEFL iBT® Research Report. TOEFL iBT-16. ETS Research Report RR-11-30

Peer reviewed

Winke, Paula; Gass, Susan; Myford, Carol – ETS Research Report Series, 2011

This study investigated whether raters' second language (L2) background and the first language (L1) of test takers taking the TOEFL iBT® Speaking test were related through scoring. After an initial 4-hour training period, a group of 107 raters (mostly learners of Chinese, Korean, and Spanish) listened to a selection of 432 speech samples that…

Descriptors: Second Language Learning, Evaluators, Speech Tests, English (Second Language)

Developing Analytic Rating Guides for "TOEFL iBT"® Integrated Speaking Tasks. "TOEFL iBT"® Research Report, TOEFL iBT-20. ETS Research Report. RR-13-13

Peer reviewed

Jamieson, Joan; Poonpon, Kornwipa – ETS Research Report Series, 2013

Research and development of a new type of scoring rubric for the integrated speaking tasks of "TOEFL iBT"® are described. These "analytic rating guides" could be helpful if tasks modeled after those in TOEFL iBT were used for formative assessment, a purpose which is different from TOEFL iBT's primary use for admission…

Descriptors: Oral Language, Language Proficiency, Scaling, Scores

Analytic Scoring of TOEFL® CBT Essays: Scores from Humans and "E-rater"®. TOEFL® Research Reports. RR-81. ETS RR-08-01

Peer reviewed

Lee, Yong-Won; Gentile, Claudia; Kantor, Robert – ETS Research Report Series, 2008

The main purpose of the study was to investigate the distinctness and reliability of analytic (or multitrait) rating dimensions and their relationships to holistic scores and "e-rater"® essay feature variables in the context of the TOEFL® computer-based test (CBT) writing assessment. Data analyzed in the study were analytic and holistic…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Scoring

National Board for Professional Teaching Standards Bias-Reduction Training: Impact on Assessors' Awareness. Research Report. ETS RR-04-12

Peer reviewed

Wylie, E. Caroline; Szpara, Michelle Y. – ETS Research Report Series, 2004

This study is an in-depth investigation of the National Board for Professional Teaching Standards (NBPTS) bias-reduction training, from the perspective of assessors. The research examined how successful the bias training was in guiding assessors to recognize their biases and to identify actions to be used to reduce the impact of bias on their…

Descriptors: Standards, Bias, Training, Evaluators

Toward an Understanding of the Role of Speech Recognition in Nonnative Speech Assessment. TOEFL iBT Research Report. TOEFL iBT-02. ETS RR-07-02

Peer reviewed

Zechner, Klaus; Bejar, Isaac I.; Hemat, Ramin – ETS Research Report Series, 2007

The increasing availability and performance of computer-based testing has prompted more research on the automatic assessment of language and speaking proficiency. In this investigation, we evaluated the feasibility of using an off-the-shelf speech-recognition system for scoring speaking prompts from the LanguEdge field test of 2002. We first…

Descriptors: Role, Computer Assisted Testing, Language Proficiency, Oral Language

Investigating the Utility of Analytic Scoring for the TOEFL Academic Speaking Test (TAST). TOEFL iBT Research Report. TOEFL iBT-01. ETS RR-06-07

Peer reviewed

Xi, Xiaoming; Mollaun, Pam – ETS Research Report Series, 2006

This study explores the utility of analytic scoring for the TOEFL® Academic Speaking Test (TAST) in providing useful and reliable diagnostic information in three aspects of candidates' performance: delivery, language use, and topic development. G studies were used to investigate the dependability of the analytic scores, the distinctness of the…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Oral Language
