ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	6
Since 2006 (last 20 years)	21

Descriptor

English (Second Language)	21
Evaluators	21
Scoring	21
Language Tests	20
Second Language Learning	20
Computer Assisted Testing	10
Correlation	7
Interrater Reliability	7
Oral Language	7
Scores	7
Essays	6
Foreign Countries	6
Writing Evaluation	6
Accuracy	5
Comparative Analysis	5
Cues	5
Prompting	5
Writing Tests	5
Computer Software	4
Evaluation Criteria	4
Language Proficiency	4
Native Language	4
Questionnaires	4
Rating Scales	4
Speech Communication	4
More ▼

Source

ETS Research Report Series	8
Language Testing	4
Applied Measurement in…	1
Grantee Submission	1
JALT CALL Journal	1
Journal of Pan-Pacific…	1
Language Assessment Quarterly	1
Language Learning	1
ProQuest LLC	1
SAGE Open	1
Taiwan Journal of TESOL	1
More ▼

Publication Type

Journal Articles	20
Reports - Research	19
Tests/Questionnaires	6
Dissertations/Theses -…	1
Reports - Evaluative	1

Education Level

Higher Education	4
Postsecondary Education	3
High Schools	1
Secondary Education	1

Audience

Location

Iran	3
Germany	1
India	1
Japan (Tokyo)	1
Switzerland	1

Laws, Policies, & Programs

Assessments and Surveys

Test of English as a Foreign…	21
International English…	2
Graduate Record Examinations	1
Test of English for…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 21 results Save | Export

Towards More Valid Scoring Criteria for Integrated Reading-Writing and Listening-Writing Summary Tasks

Peer reviewed

Direct link

Chan, Sathena; May, Lyn – Language Testing, 2023

Despite the increased use of integrated tasks in high-stakes academic writing assessment, research on rating criteria which reflect the unique construct of integrated summary writing skills is comparatively rare. Using a mixed-method approach of expert judgement, text analysis, and statistical analysis, this study examines writing features that…

Descriptors: Scoring, Writing Evaluation, Reading Tests, Listening Skills

Applying Cognitive Theory to the Human Essay Rating Process

Peer reviewed

Direct link

Finn, Bridgid; Arslan, Burcu; Walsh, Matthew – Applied Measurement in Education, 2020

To score an essay response, raters draw on previously trained skills and knowledge about the underlying rubric and score criterion. Cognitive processes such as remembering, forgetting, and skill decay likely influence rater performance. To investigate how forgetting influences scoring, we evaluated raters' scoring accuracy on TOEFL and GRE essays.…

Descriptors: Epistemology, Essay Tests, Evaluators, Cognitive Processes

Rater Dominance in Discussion as a Resolution Method

Peer reviewed
PDF on ERIC

Download full text

Ahmadi, Alireza – Taiwan Journal of TESOL, 2020

Rater subjectivity has long been an intriguing topic. The use of discussion as a resolution method is a practical way to reduce this subjectivity. However, the efficacy of discussion depends on whether different raters get equally engaged in it or one rater tends to dominate others. This study investigated whether and how rater dominance occurs in…

Descriptors: Evaluators, Interrater Reliability, Discussion, Discourse Analysis

Automated Essay Scoring at Scale: A Case Study in Switzerland and Germany. TOEFL® Research Report. RR-86. ETS RR-19-12

Peer reviewed
PDF on ERIC

Download full text

Rupp, André A.; Casabianca, Jodi M.; Krüger, Maleika; Keller, Stefan; Köller, Olaf – ETS Research Report Series, 2019

In this research report, we describe the design and empirical findings for a large-scale study of essay writing ability with approximately 2,500 high school students in Germany and Switzerland on the basis of 2 tasks with 2 associated prompts, each from a standardized writing assessment whose scoring involved both human and automated components.…

Descriptors: Automation, Foreign Countries, English (Second Language), Language Tests

For a Greater Good: Bias Analysis in Writing Assessment

Peer reviewed

Direct link

Ahmadi Shirazi, Masoumeh – SAGE Open, 2019

Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…

Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests

Assessment Behavior and Perceptions of Raters in Paired and Group Oral Interaction

Peer reviewed
PDF on ERIC

Download full text

Negishi, Junko – Journal of Pan-Pacific Association of Applied Linguistics, 2015

The study considers the assessment of L2 English learners by trained raters in paired and group oral assessments in comparison to an individual, monologue assessment, to determine 1) the degree to which raters assign pairs/groups shared (the same) scores and the degree to which raters give individual members of pairs/groups higher or lower as…

Descriptors: Evaluators, English (Second Language), Second Language Learning, Scores

A Comparative Analysis of Face to Face Instruction vs. Telegram Mobile Instruction in Terms of Narrative Writing

Peer reviewed
PDF on ERIC

Download full text

Heidari, Jamshid; Khodabandeh, Farzaneh; Soleimani, Hassan – JALT CALL Journal, 2018

The emergence of computer technology in English language teaching has paved the way for teachers' application of Mobile Assisted Language Learning (mall) and its advantages in teaching. This study aimed to compare the effectiveness of the face to face instruction with Telegram mobile instruction. Based on a toefl test, 60 English foreign language…

Descriptors: Comparative Analysis, Conventional Instruction, Teaching Methods, Computer Assisted Instruction

A Study on the Impact of Fatigue on Human Raters When Scoring Speaking Responses

Peer reviewed

Direct link

Ling, Guangming; Mollaun, Pamela; Xi, Xiaoming – Language Testing, 2014

The scoring of constructed responses may introduce construct-irrelevant factors to a test score and affect its validity and fairness. Fatigue is one of the factors that could negatively affect human performance in general, yet little is known about its effects on a human rater's scoring quality on constructed responses. In this study, we compared…

Descriptors: Evaluators, Fatigue (Biology), Scoring, Performance

Automated Trait Scores for "TOEFL"® Writing Tasks. Research Report. ETS RR-15-14

Peer reviewed
PDF on ERIC

Download full text

Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015

The "e-rater"® automated essay scoring system is used operationally in the scoring of "TOEFL iBT"® independent and integrated tasks. In this study we explored the psychometric added value of reporting four trait scores for each of these two tasks, beyond the total e-rater score.The four trait scores are word choice, grammatical…

Descriptors: Writing Tests, Scores, Language Tests, English (Second Language)

Investigating Differences between American and Indian Raters in Assessing TOEFL iBT Speaking Tasks

Peer reviewed

Direct link

Wei, Jing; Llosa, Lorena – Language Assessment Quarterly, 2015

This article reports on an investigation of the role raters' language background plays in raters' assessment of test takers' speaking ability. Specifically, this article examines differences between American and Indian raters in their scores and scoring processes when rating Indian test takers' responses to the Test of English as a Foreign…

Descriptors: North Americans, Indians, Evaluators, English (Second Language)

Linguistic Microfeatures to Predict L2 Writing Proficiency: A Case Study in Automated Writing Evaluation

Peer reviewed
PDF on ERIC

Download full text

Direct link

Crossley, Scott A.; Kyle, Kristopher; Allen, Laura K.; Guo, Liang; McNamara, Danielle S. – Grantee Submission, 2014

This study investigates the potential for linguistic microfeatures related to length, complexity, cohesion, relevance, topic, and rhetorical style to predict L2 writing proficiency. Computational indices were calculated by two automated text analysis tools (Coh- Metrix and the Writing Assessment Tool) and used to predict human essay ratings in a…

Descriptors: Computational Linguistics, Essays, Scoring, Writing Evaluation

TOEFL11: A Corpus of Non-Native English. Research Report. ETS RR-13-24

Peer reviewed
PDF on ERIC

Download full text

Blanchard, Daniel; Tetreault, Joel; Higgins, Derrick; Cahill, Aoife; Chodorow, Martin – ETS Research Report Series, 2013

This report presents work on the development of a new corpus of non-native English writing. It will be useful for the task of native language identification, as well as grammatical error detection and correction, and automatic essay scoring. In this report, the corpus is described in detail.

Descriptors: Language Tests, Second Language Learning, English (Second Language), Writing Tests

Using Raters from India to Score a Large-Scale Speaking Test

Peer reviewed

Direct link

Xi, Xiaoming; Mollaun, Pam – Language Learning, 2011

We investigated the scoring of the Speaking section of the Test of English as a Foreign Language[TM] Internet-based (TOEFL iBT[R]) test by speakers of English and one or more Indian languages. We explored the extent to which raters from India, after being trained and certified, were able to score the TOEFL examinees with mixed first languages…

Descriptors: Speech Communication, Scoring, Foreign Countries, English (Second Language)

Evaluation of the "e-rater"® Scoring Engine for the "TOEFL"® Independent and Integrated Prompts. Research Report. ETS RR-12-06

Peer reviewed
PDF on ERIC

Download full text

Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012

Scoring models for the "e-rater"® system were built and evaluated for the "TOEFL"® exam's independent and integrated writing prompts. Prompt-specific and generic scoring models were built, and evaluation statistics, such as weighted kappas, Pearson correlations, standardized differences in mean scores, and correlations with…

Descriptors: Scoring, Prompting, Evaluators, Computer Software

TOEFL iBT Speaking Test Scores as Indicators of Oral Communicative Language Proficiency

Peer reviewed

Direct link

Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela – Language Testing, 2012

Scores assigned by trained raters and by an automated scoring system (SpeechRater[TM]) on the speaking section of the TOEFL iBT[TM] were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…

Descriptors: Undergraduate Students, Speech Communication, Rating Scales, Scoring

Previous Page | Next Page »

Pages: 1 | 2

Xi, Xiaoming	4
Bridgeman, Brent	2
Mollaun, Pam	2
Mollaun, Pamela	2
Ahmadi Shirazi, Masoumeh	1
Ahmadi, Alireza	1
Allen, Laura K.	1
Arslan, Burcu	1
Attali, Yigal	1
Bejar, Isaac I.	1
Blanchard, Daniel	1
Cahill, Aoife	1
Casabianca, Jodi M.	1
Chan, Sathena	1
Chodorow, Martin	1
Crossley, Scott A.	1
Davey, Tim	1
Davis, Lawrence Edward	1
Finn, Bridgid	1
Gass, Susan	1
Gentile, Claudia	1
Guo, Liang	1
Heidari, Jamshid	1
Hemat, Ramin	1
Higgins, Derrick	1
More ▼