ERIC - Search Results

Showing 1 to 15 of 24 results

The Whole Is More than the Sum of Its Parts -- Assessing Writing Using the Consensual Assessment Technique

Peer reviewed

Zahn, Daniela; Canton, Ursula; Boyd, Victoria; Hamilton, Laura; Mamo, Josianne; McKay, Jane; Proudfoot, Linda; Telfer, Dickson; Williams, Kim; Wilson, Colin – Studies in Higher Education, 2021

Evaluating the impact of Academic Literacies teaching (Lea and Street [1998. "Student Writing in Higher Education: An Academic Literacies Approach." "Studies in Higher Education" 23 (2): 157-72. doi:10.1080/03075079812331380364]) is difficult, as it involves gauging whether writers: (1) gain better understanding of what…

Descriptors: Writing Evaluation, Evaluation Methods, Undergraduate Students, Foreign Countries

Do You Mean What I Mean? Comparing Teacher Performance Self-Scores and Evaluator-Generated Scores

Peer reviewed

Hunter, Seth B. – Journal of Education Human Resources, 2023

Teacher performance scores inform education leaders' management of teacher human resources. However, prior research has implied that different interpretations of performance criteria between teachers and their evaluators suppress teacher development. Although research has examined teacher perceptions of performance scores and compared teacher…

Descriptors: Teacher Evaluation, Teacher Effectiveness, Self Evaluation (Individuals), Interrater Reliability

Scoring Rubric Reliability and Internal Validity in Rater-Mediated EFL Writing Assessment: Insights from Many-Facet Rasch Measurement

Peer reviewed

Li, Wentao – Reading and Writing: An Interdisciplinary Journal, 2022

Scoring rubrics are known to be effective for assessing writing for both testing and classroom teaching purposes. How raters interpret the descriptors in a rubric can significantly impact the subsequent final score, and further, the descriptors may also color a rater's judgment of a student's writing quality. Little is known, however, about how…

Descriptors: Scoring Rubrics, Interrater Reliability, Writing Evaluation, Teaching Methods

Fairness in Oral Language Assessment: Training Raters and Considering Examinees' Expectations

Peer reviewed

Doosti, Mehdi; Ahmadi Safa, Mohammad – International Journal of Language Testing, 2021

This study examined the effect of rater training on promoting inter-rater reliability in oral language assessment. It also investigated whether rater training and the consideration of the examinees' expectations by the examiners have any effect on test-takers' perceptions of being fairly evaluated. To this end, four raters scored 31 Iranian…

Descriptors: Oral Language, Language Tests, Interrater Reliability, Training

The Effect of Workshop Training on Rater Variability in Children's Oral Narrative Assessment

Peer reviewed

Karusoo-Musumeci, Ava; Pearce, Wendy M.; Donaghy, Michelle – Child Language Teaching and Therapy, 2022

Oral narrative assessments are important for diagnosis of language disorders in school-age children so scoring needs to be reliable and consistent. This study explored the impact of training on the variability of story grammar scores in children's oral narrative assessments scored by multiple raters. Fifty-one speech pathologists and 19 final-year…

Descriptors: Oral Language, Speech Evaluation, Language Impairments, Elementary School Students

Low Inter-Rater Reliability of a High Stakes Performance Assessment of Teacher Candidates

Peer reviewed

Lyness, Scott A.; Peterson, Kent; Yates, Kenneth – Education Sciences, 2021

The Performance Assessment for California Teachers (PACT) is a high stakes summative assessment that was designed to measure pre-service teacher readiness. We examined the inter-rater reliability (IRR) of trained PACT evaluators who rated 19 candidates. As measured by Cohen's weighted kappa, the overall IRR estimate was 0.17 (poor strength of…

Descriptors: High Stakes Tests, Performance Based Assessment, Teacher Effectiveness, Academic Language

The Use of Semantic Similarity Tools in Automated Content Scoring of Fact-Based Essays Written by EFL Learners

Peer reviewed

Wang, Qiao – Education and Information Technologies, 2022

This study searched for open-source semantic similarity tools and evaluated their effectiveness in automated content scoring of fact-based essays written by English-as-a-Foreign-Language (EFL) learners. Fifty writing samples under a fact-based writing task from an academic English course in a Japanese university were collected and a gold standard…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring

Writing Scale Effects on Raters: An Exploratory Study

Peer reviewed

Jeong, Heejeong – Language Testing in Asia, 2019

In writing assessment, finding a valid, reliable, and efficient scale is critical. Appropriate scales increase rater reliability and can also save time and money. This exploratory study compared the effects of a binary scale and an analytic scale across teacher raters and expert raters. The purpose of the study is to find out how different scale…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

For a Greater Good: Bias Analysis in Writing Assessment

Peer reviewed

Ahmadi Shirazi, Masoumeh – SAGE Open, 2019

Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…

Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests

Comparison of Automatic and Expert Teachers' Rating of Computerized English Listening-Speaking Test

Peer reviewed

Linlin, Cao – English Language Teaching, 2020

Through Many-Facet Rasch analysis, this study explores the rating differences between 1 computer automatic rater and 5 expert teacher raters on scoring 119 students in a computerized English listening-speaking test. Results indicate that both automatic and the teacher raters demonstrate good inter-rater reliability, though the automatic rater…

Descriptors: Language Tests, Computer Assisted Testing, English (Second Language), Second Language Learning

Better Feedback for Better Teaching: A Practical Guide to Improving Classroom Observations

Archer, Jeff; Cantrell, Steve; Holtzman, Steven L.; Joe, Jilliam N.; Tocci, Cynthia M.; Wood, Jess – Bill & Melinda Gates Foundation, 2016

In this book the authors explain how to build, and over time improve, the elements of an observation system that equips all observers to identify and develop effective teaching. It is based on the collective knowledge of key partners in the Measures of Effective Teaching (MET) project--which carried out one of the largest-ever studies of classroom…

Descriptors: Feedback (Response), Teacher Effectiveness, Observation, Teacher Evaluation

Functional Adequacy in L2 Writing: Towards a New Rating Scale

Peer reviewed

Kuiken, Folkert; Vedder, Ineke – Language Testing, 2017

The importance of functional adequacy as an essential component of L2 proficiency has been observed by several authors (Pallotti, 2009; De Jong, Steinel, Florijn, Schoonen, & Hulstijn, 2012a, b). The rationale underlying the present study is that the assessment of writing proficiency in L2 is not fully possible without taking into account the…

Descriptors: Second Language Learning, Rating Scales, Computational Linguistics, Persuasive Discourse

Linguistic Features of Humor in Academic Writing

Peer reviewed

Skalicky, Stephen; Berger, Cynthia M.; Crossley, Scott A.; McNamara, Danielle S. – Advances in Language and Literary Studies, 2016

A corpus of 313 freshman college essays was analyzed in order to better understand the forms and functions of humor in academic writing. Human ratings of humor and wordplay were statistically aggregated using Factor Analysis to provide an overall "Humor" component score for each essay in the corpus. In addition, the essays were also…

Descriptors: Discourse Analysis, Academic Discourse, Humor, Writing (Composition)

Assessing English Language Learners' Oral Performance: A Comparison of Monologue, Interview, and Group Oral Test

Peer reviewed

Ahmadi, Alireza; Sadeghi, Elham – Language Assessment Quarterly, 2016

In the present study we investigated the effect of test format on oral performance in terms of test scores and discourse features (accuracy, fluency, and complexity). Moreover, we explored how the scores obtained on different test formats relate to such features. To this end, 23 Iranian EFL learners participated in three test formats of monologue,…

Descriptors: Oral Language, Comparative Analysis, Language Fluency, Accuracy

The Effects of Rater Training on Inter-Rater Agreement

Peer reviewed

Pufpaff, Lisa A.; Clarke, Laura; Jones, Ruth E. – Mid-Western Educational Researcher, 2015

This paper addresses the effects of rater training on the rubric-based scoring of three preservice teacher candidate performance assessments. This project sought to evaluate the consistency of ratings assigned to student learning outcome measures being used for program accreditation and to explore the need for rater training in order to increase…

Descriptors: Evaluators, Interrater Reliability, Preservice Teachers, Scoring Rubrics

