ERIC - Search Results

Showing 1 to 15 of 25 results

Best Practices for Constructed-Response Scoring. Research Report. ETS RR-22-17

Peer reviewed

McCaffrey, Daniel F.; Casabianca, Jodi M.; Ricker-Pedley, Kathryn L.; Lawless, René R.; Wendler, Cathy – ETS Research Report Series, 2022

This document describes a set of best practices for developing, implementing, and maintaining the critical process of scoring constructed-response tasks. These practices address both the use of human raters and automated scoring systems as part of the scoring process and cover the scoring of written, spoken, performance, or multimodal responses.…

Descriptors: Best Practices, Scoring, Test Format, Computer Assisted Testing

Investigating Constructed-Response Scoring over Time: The Effects of Study Design on Trend Rescore Statistics. Research Report. ETS RR-22-15

Peer reviewed

Donoghue, John R.; McClellan, Catherine A.; Hess, Melinda R. – ETS Research Report Series, 2022

When constructed-response items are administered for a second time, it is necessary to evaluate whether the current Time B administration's raters have drifted from the scoring of the original administration at Time A. To study this, Time A papers are sampled and rescored by Time B scorers. Commonly the scores are compared using the proportion of…

Descriptors: Item Response Theory, Test Construction, Scoring, Testing

Certified to Evaluate: Exploring Administrator Accuracy and Beliefs in Teacher Observation. Research Report. ETS RR-21-05

Peer reviewed

Jones, Nathan; Bell, Courtney; Qi, Yi; Lewis, Jennifer; Kirui, David; Stickler, Leslie; Redash, Amanda – ETS Research Report Series, 2021

The observation systems being used in all 50 states require administrators to learn to accurately and reliably score their teachers' instruction using standardized observation systems. Although the literature on observation systems is growing, relatively few studies have examined the outcomes of trainings focused on developing administrators'…

Descriptors: Observation, Standardized Tests, Teacher Evaluation, Test Reliability

Examining the Calibration Process for Raters of the "GRE"® General Test. ETS GRE® Board Research Report. GRE®-19-01. Research Report Series. ETS RR-19-09

Peer reviewed

Wendler, Cathy; Glazer, Nancy; Cline, Frederick – ETS Research Report Series, 2019

One of the challenges in scoring constructed-response (CR) items and tasks is ensuring that rater drift does not occur during or across scoring windows. Rater drift reflects changes in how raters interpret and use established scoring criteria to assign essay scores. Calibration is a process used to help control rater drift and, as such, serves as…

Descriptors: College Entrance Examinations, Graduate Study, Accuracy, Test Reliability

Does the Time between Scoring Sessions Impact Scoring Accuracy? An Evaluation of Constructed-Response Essay Responses on the "GRE"® General Test. Research Report. ETS RR-18-31

Peer reviewed

Finn, Bridgid; Wendler, Cathy; Ricker-Pedley, Kathryn L.; Arslan, Burcu – ETS Research Report Series, 2018

This report investigates whether the time between scoring sessions has an influence on operational and nonoperational scoring accuracy. The study evaluates raters' scoring accuracy on constructed-response essay responses for the "GRE"® General Test. Binomial linear mixed-effect models are presented that evaluate how the effect of various…

Descriptors: Intervals, Scoring, Accuracy, Essay Tests

A Generalizability Theory Study to Examine Sources of Score Variance in Third-Party Evaluations Used in Decision-Making for Graduate School Admissions. ETS GRE® Board Research Report. ETS GRE®-18-03. ETS RR-18-37

Peer reviewed

McCaffrey, Daniel F.; Oliveri, Maria Elena; Holtzman, Steven – ETS Research Report Series, 2018

Scores from noncognitive measures are increasingly valued for their utility in helping to inform postsecondary admissions decisions. However, their use has presented challenges because of faking, response biases, or subjectivity, which standardized third-party evaluations (TPEs) can help minimize. Analysts and researchers using TPEs, however, need…

Descriptors: Generalizability Theory, Scores, College Admission, Admission Criteria

A Preliminary Investigation of the Factors Related to the Design and Scoring of Video-Based Oral Communication Performance Tasks in Higher Education. Research Report. ETS RR-18-09

Peer reviewed

Roohr, Katrina Crotts; Burkander, Kri; Mao, Liyang – ETS Research Report Series, 2018

Oral communication has been identified as an important skill by higher education institutions and by the workforce community. Despite its importance, minimal research has been conducted around the development of tasks to measure oral communication skills and behaviors. The purpose of this preliminary study is to evaluate the different factors…

Descriptors: Speech Communication, Video Technology, Test Construction, Scoring

Developing an Innovative Elicited Imitation Task for Efficient English Proficiency Assessment. TOEFL® Research Report. RR-96. ETS RR-21-24

Peer reviewed

Davis, Larry; Norris, John – ETS Research Report Series, 2021

The elicited imitation task (EIT), in which language learners listen to a series of spoken sentences and repeat each one verbatim, is a commonly used measure of language proficiency in second language acquisition research. The "TOEFL® Essentials"™ test includes an EIT as a holistic measure of speaking proficiency, referred to as the…

Descriptors: Task Analysis, Language Proficiency, Speech Communication, Language Tests

Automated Essay Scoring at Scale: A Case Study in Switzerland and Germany. TOEFL® Research Report. RR-86. ETS RR-19-12

Peer reviewed

Rupp, André A.; Casabianca, Jodi M.; Krüger, Maleika; Keller, Stefan; Köller, Olaf – ETS Research Report Series, 2019

In this research report, we describe the design and empirical findings for a large-scale study of essay writing ability with approximately 2,500 high school students in Germany and Switzerland on the basis of 2 tasks with 2 associated prompts, each from a standardized writing assessment whose scoring involved both human and automated components.…

Descriptors: Automation, Foreign Countries, English (Second Language), Language Tests

Agreement of Teachers on Evaluating Assessments of Learning Progressions in English Language Arts and Mathematics. Research Report. ETS RR-18-11

Peer reviewed

van Rijn, Peter; Graf, Edith Aurora; Arieli-Attali, Meirav; Song, Yi – ETS Research Report Series, 2018

In this study, we explored the extent to which teachers agree on the ordering and separation of levels of two different learning progressions (LPs) in English language arts (ELA) and mathematics. In a panel meeting akin to a standard-setting procedure, we asked teachers to link the items and responses of summative educational assessments to LP…

Descriptors: Teacher Attitudes, Student Evaluation, Summative Evaluation, Language Arts

Use of Automated Scoring in Spoken Language Assessments for Test Takers with Speech Impairments. Research Report. ETS RR-17-42

Peer reviewed

Loukina, Anastassia; Buzick, Heather – ETS Research Report Series, 2017

This study is an evaluation of the performance of automated speech scoring for speakers with documented or suspected speech impairments. Given that the use of automated scoring of open-ended spoken responses is relatively nascent and there is little research to date that includes test takers with disabilities, this small exploratory study focuses…

Descriptors: Automation, Scoring, Language Tests, Speech Tests

Developing a Machine-Supported Coding System for Constructed-Response Items in PISA. Research Report. ETS RR-17-47

Peer reviewed

Yamamoto, Kentaro; He, Qiwei; Shin, Hyo Jeong; von Davier, Mattias – ETS Research Report Series, 2017

Approximately a third of the Programme for International Student Assessment (PISA) items in the core domains (math, reading, and science) are constructed-response items and require human coding (scoring). This process is time-consuming, expensive, and prone to error as often (a) humans code inconsistently, and (b) coding reliability in…

Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students

A Prototype Public Speaking Skills Assessment: An Evaluation of Human-Scoring Quality. Research Report. ETS RR-15-36

Peer reviewed

Joe, Jilliam; Kitchen, Christopher; Chen, Lei; Feng, Gary – ETS Research Report Series, 2015

The purpose of this paper is to summarize the evaluation of human-scoring quality for an assessment of public speaking skills. Videotaped performances given by 17 speakers on 4 tasks were scored by expert and nonexpert raters who had extensive experience scoring performance-based and constructed-response assessments. The Public Speaking Competence…

Descriptors: Public Speaking, Communication Skills, Scoring, Scoring Rubrics

Evaluation of "e-rater"® for the "Praxis I"® Writing Test. Research Report. ETS RR-15-03

Peer reviewed

Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M. – ETS Research Report Series, 2015

Automated scoring models were trained and evaluated for the essay task in the "Praxis I"® writing test. Prompt-specific and generic "e-rater"® scoring models were built, and evaluation statistics, such as quadratic weighted kappa, Pearson correlation, and standardized differences in mean scores, were examined to evaluate the…

Descriptors: Writing Tests, Licensing Examinations (Professions), Teacher Competency Testing, Scoring

Development and Validation of the Written Communication Assessment of the "HEIghten"® Outcomes Assessment Suite. Research Report. ETS RR-17-53

Peer reviewed

Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017

Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…

Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment
