ERIC - Search Results

Showing 1 to 15 of 18 results

Scoring Difficulty in Summary Writing Assessment: Toward the Reconstruction of Analytic Rubric

Peer reviewed

Makiko Kato – Journal of Education and Learning, 2025

This study aims to examine whether differences exist in the factors influencing the difficulty of scoring English summaries and determining scores based on the raters' attributes, and to collect candid opinions, considerations, and tentative suggestions for future improvements to the analytic rubric of summary writing for English learners. In this…

Descriptors: Writing Evaluation, Scoring, Writing Skills, English (Second Language)

Automated Speech Scoring of Dialogue Response by Japanese Learners of English as a Foreign Language

Peer reviewed

Yuko Hayashi; Yusuke Kondo; Yutaka Ishii – Innovation in Language Learning and Teaching, 2024

Purpose: This study builds a new system for automatically assessing learners' speech elicited from an oral discourse completion task (DCT), and evaluates the prediction capability of the system with a view to better understanding factors deemed influential in predicting speaking proficiency scores and the pedagogical implications of the system.…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Japanese

Low Inter-Rater Reliability of a High Stakes Performance Assessment of Teacher Candidates

Peer reviewed

Lyness, Scott A.; Peterson, Kent; Yates, Kenneth – Education Sciences, 2021

The Performance Assessment for California Teachers (PACT) is a high stakes summative assessment that was designed to measure pre-service teacher readiness. We examined the inter-rater reliability (IRR) of trained PACT evaluators who rated 19 candidates. As measured by Cohen's weighted kappa, the overall IRR estimate was 0.17 (poor strength of…

Descriptors: High Stakes Tests, Performance Based Assessment, Teacher Effectiveness, Academic Language

The Use of Semantic Similarity Tools in Automated Content Scoring of Fact-Based Essays Written by EFL Learners

Peer reviewed

Wang, Qiao – Education and Information Technologies, 2022

This study searched for open-source semantic similarity tools and evaluated their effectiveness in automated content scoring of fact-based essays written by English-as-a-Foreign-Language (EFL) learners. Fifty writing samples under a fact-based writing task from an academic English course in a Japanese university were collected and a gold standard…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring

Administrators' Uses of Teacher Observation Protocol in Different Rating Contexts. Research Report. ETS RR-18-18

Peer reviewed

Qi, Yi; Bell, Courtney A.; Jones, Nathan D.; Lewis, Jennifer M.; Witherspoon, Margaret W.; Redash, Amanda – ETS Research Report Series, 2018

Teacher observations are being used for high-stakes purposes in states across the country, and administrators often serve as raters in teacher evaluation systems. This paper examines how the cognitive aspects of administrators' use of an observation instrument, a modified version of Charlotte Danielson's Framework for Teaching, interact with the…

Descriptors: Teacher Evaluation, Classroom Observation Techniques, Observation, Evaluation Methods

For a Greater Good: Bias Analysis in Writing Assessment

Peer reviewed

Ahmadi Shirazi, Masoumeh – SAGE Open, 2019

Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…

Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests

Comparison of Automatic and Expert Teachers' Rating of Computerized English Listening-Speaking Test

Peer reviewed

Linlin, Cao – English Language Teaching, 2020

Through Many-Facet Rasch analysis, this study explores the rating differences between 1 computer automatic rater and 5 expert teacher raters on scoring 119 students in a computerized English listening-speaking test. Results indicate that both automatic and the teacher raters demonstrate good inter-rater reliability, though the automatic rater…

Descriptors: Language Tests, Computer Assisted Testing, English (Second Language), Second Language Learning

Assessing Impact Submissions for REF 2014: An Evaluation

Manville, Catriona; Guthrie, Susan; Henham, Marie-Louise; Garrod, Bryn; Sousa, Sonia; Kirtley, Anne; Castle-Clarke, Sophie; Ling, Tom – RAND Europe, 2015

The Research Excellence Framework (REF) is a new system for assessing the quality of research in UK higher education institutions (HEIs). For the first time, part of the assessment included the wider impact of research. RAND Europe was commissioned to evaluate the assessment process of the impact element of REF submissions, and to explore the…

Descriptors: Foreign Countries, Research, Higher Education, Outcome Measures

Investigating Differences between American and Indian Raters in Assessing TOEFL iBT Speaking Tasks

Peer reviewed

Wei, Jing; Llosa, Lorena – Language Assessment Quarterly, 2015

This article reports on an investigation of the role raters' language background plays in raters' assessment of test takers' speaking ability. Specifically, this article examines differences between American and Indian raters in their scores and scoring processes when rating Indian test takers' responses to the Test of English as a Foreign…

Descriptors: North Americans, Indians, Evaluators, English (Second Language)

Linguistic Microfeatures to Predict L2 Writing Proficiency: A Case Study in Automated Writing Evaluation

Peer reviewed

Crossley, Scott A.; Kyle, Kristopher; Allen, Laura K.; Guo, Liang; McNamara, Danielle S. – Grantee Submission, 2014

This study investigates the potential for linguistic microfeatures related to length, complexity, cohesion, relevance, topic, and rhetorical style to predict L2 writing proficiency. Computational indices were calculated by two automated text analysis tools (Coh-Metrix and the Writing Assessment Tool) and used to predict human essay ratings in a…

Descriptors: Computational Linguistics, Essays, Scoring, Writing Evaluation

TOEFL iBT Speaking Test Scores as Indicators of Oral Communicative Language Proficiency

Peer reviewed

Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela – Language Testing, 2012

Scores assigned by trained raters and by an automated scoring system (SpeechRater™) on the speaking section of the TOEFL iBT™ were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…

Descriptors: Undergraduate Students, Speech Communication, Rating Scales, Scoring

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

The Relationship between Raters' Prior Language Study and the Evaluation of Foreign Language Speech Samples. TOEFL iBT® Research Report. TOEFL iBT-16. ETS Research Report RR-11-30

Peer reviewed

Winke, Paula; Gass, Susan; Myford, Carol – ETS Research Report Series, 2011

This study investigated whether raters' second language (L2) background and the first language (L1) of test takers taking the TOEFL iBT® Speaking test were related through scoring. After an initial 4-hour training period, a group of 107 raters (mostly learners of Chinese, Korean, and Spanish) listened to a selection of 432 speech samples that…

Descriptors: Second Language Learning, Evaluators, Speech Tests, English (Second Language)

Analysis of Interrater Reliability on the Evaluation of Answers to Open-Ended Questions.

Crews, William E., Jr. – 1991

As part of a study of teacher evaluation of student replies to open-ended questions, a second question--the best method of determining interrater reliability--was examined. The standard method, the Pearson Product-Moment correlation, overestimated the degree of match between researchers' and teachers' scoring of tests. The simpler percent…

Descriptors: Comparative Analysis, Elementary School Teachers, Evaluation Methods, Evaluators

Scoring Writing Samples in Educational Research: Selecting and Developing an Appropriate Procedure for Evaluating Elementary Student Writing.

Hawk, Anne W.; Cross, James Logan – 1987

This study involved the selection and adaptation of a writing assessment procedure for teachers and researchers in the Duval County Public Schools (Florida) to use in assessing changes in writing ability among elementary grade students. Through a review of the literature, four writing assessment procedures (analytic, holistic, focused holistic,…

Descriptors: Elementary Education, Elementary School Teachers, Evaluators, Holistic Evaluation
