Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 7 |
Since 2016 (last 10 years) | 14 |
Since 2006 (last 20 years) | 19 |
Author
Ahmadi Safa, Mohammad | 1 |
Ahmadi Shirazi, Masoumeh | 1 |
Ahmadi, Alireza | 1 |
Archer, Jeff | 1 |
Beh-Afarin, Seyed Reza | 1 |
Berger, Cynthia M. | 1 |
Boyd, Victoria | 1 |
Breyer, F. Jay | 1 |
Canton, Ursula | 1 |
Cantrell, Steve | 1 |
Clarke, Laura | 1 |
Publication Type
Tests/Questionnaires | 24 |
Reports - Research | 21 |
Journal Articles | 18 |
Speeches/Meeting Papers | 3 |
Books | 1 |
Guides - General | 1 |
Information Analyses | 1 |
Reports - Evaluative | 1 |
Education Level
Higher Education | 9 |
Postsecondary Education | 9 |
Elementary Education | 3 |
Secondary Education | 2 |
Early Childhood Education | 1 |
Elementary Secondary Education | 1 |
High Schools | 1 |
Kindergarten | 1 |
Primary Education | 1 |
Audience
Administrators | 1 |
Practitioners | 1 |
Researchers | 1 |
Assessments and Surveys
Test of English as a Foreign… | 2 |
Flesch Kincaid Grade Level… | 1 |
International English… | 1 |
National Assessment of… | 1 |
Praxis Series | 1 |
edTPA (Teacher Performance… | 1 |
Zahn, Daniela; Canton, Ursula; Boyd, Victoria; Hamilton, Laura; Mamo, Josianne; McKay, Jane; Proudfoot, Linda; Telfer, Dickson; Williams, Kim; Wilson, Colin – Studies in Higher Education, 2021
Evaluating the impact of Academic Literacies teaching (Lea and Street [1998. "Student Writing in Higher Education: An Academic Literacies Approach." "Studies in Higher Education" 23 (2): 157-72. doi:10.1080/03075079812331380364]) is difficult, as it involves gauging whether writers: (1) gain better understanding of what…
Descriptors: Writing Evaluation, Evaluation Methods, Undergraduate Students, Foreign Countries
Hunter, Seth B. – Journal of Education Human Resources, 2023
Teacher performance scores inform education leaders' management of teacher human resources. However, prior research has implied that different interpretations of performance criteria between teachers and their evaluators suppress teacher development. Although research has examined teacher perceptions of performance scores and compared teacher…
Descriptors: Teacher Evaluation, Teacher Effectiveness, Self Evaluation (Individuals), Interrater Reliability
Li, Wentao – Reading and Writing: An Interdisciplinary Journal, 2022
Scoring rubrics are known to be effective for assessing writing for both testing and classroom teaching purposes. How raters interpret the descriptors in a rubric can significantly impact the subsequent final score, and further, the descriptors may also color a rater's judgment of a student's writing quality. Little is known, however, about how…
Descriptors: Scoring Rubrics, Interrater Reliability, Writing Evaluation, Teaching Methods
Doosti, Mehdi; Ahmadi Safa, Mohammad – International Journal of Language Testing, 2021
This study examined the effect of rater training on promoting inter-rater reliability in oral language assessment. It also investigated whether rater training and the consideration of the examinees' expectations by the examiners have any effect on test-takers' perceptions of being fairly evaluated. To this end, four raters scored 31 Iranian…
Descriptors: Oral Language, Language Tests, Interrater Reliability, Training
Karusoo-Musumeci, Ava; Pearce, Wendy M.; Donaghy, Michelle – Child Language Teaching and Therapy, 2022
Oral narrative assessments are important for diagnosis of language disorders in school-age children so scoring needs to be reliable and consistent. This study explored the impact of training on the variability of story grammar scores in children's oral narrative assessments scored by multiple raters. Fifty-one speech pathologists and 19 final-year…
Descriptors: Oral Language, Speech Evaluation, Language Impairments, Elementary School Students
Lyness, Scott A.; Peterson, Kent; Yates, Kenneth – Education Sciences, 2021
The Performance Assessment for California Teachers (PACT) is a high stakes summative assessment that was designed to measure pre-service teacher readiness. We examined the inter-rater reliability (IRR) of trained PACT evaluators who rated 19 candidates. As measured by Cohen's weighted kappa, the overall IRR estimate was 0.17 (poor strength of…
Descriptors: High Stakes Tests, Performance Based Assessment, Teacher Effectiveness, Academic Language
Wang, Qiao – Education and Information Technologies, 2022
This study searched for open-source semantic similarity tools and evaluated their effectiveness in automated content scoring of fact-based essays written by English-as-a-Foreign-Language (EFL) learners. Fifty writing samples under a fact-based writing task from an academic English course in a Japanese university were collected and a gold standard…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring
Jeong, Heejeong – Language Testing in Asia, 2019
In writing assessment, finding a valid, reliable, and efficient scale is critical. Appropriate scales increase rater reliability and can also save time and money. This exploratory study compared the effects of a binary scale and an analytic scale across teacher raters and expert raters. The purpose of the study is to find out how different scale…
Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction
Ahmadi Shirazi, Masoumeh – SAGE Open, 2019
Threats to construct validity should be reduced to a minimum. If so, sources of bias, namely raters, items, and tests, as well as gender, age, race, language background, culture, and socio-economic status, need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…
Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests
Linlin, Cao – English Language Teaching, 2020
Through Many-Facet Rasch analysis, this study explores the rating differences between 1 automatic computer rater and 5 expert teacher raters in scoring 119 students on a computerized English listening-speaking test. Results indicate that both the automatic and the teacher raters demonstrate good inter-rater reliability, though the automatic rater…
Descriptors: Language Tests, Computer Assisted Testing, English (Second Language), Second Language Learning
Archer, Jeff; Cantrell, Steve; Holtzman, Steven L.; Joe, Jilliam N.; Tocci, Cynthia M.; Wood, Jess – Bill & Melinda Gates Foundation, 2016
In this book the authors explain how to build, and over time improve, the elements of an observation system that equips all observers to identify and develop effective teaching. It is based on the collective knowledge of key partners in the Measures of Effective Teaching (MET) project--which carried out one of the largest-ever studies of classroom…
Descriptors: Feedback (Response), Teacher Effectiveness, Observation, Teacher Evaluation
Kuiken, Folkert; Vedder, Ineke – Language Testing, 2017
The importance of functional adequacy as an essential component of L2 proficiency has been observed by several authors (Pallotti, 2009; De Jong, Steinel, Florijn, Schoonen, & Hulstijn, 2012a, b). The rationale underlying the present study is that the assessment of writing proficiency in L2 is not fully possible without taking into account the…
Descriptors: Second Language Learning, Rating Scales, Computational Linguistics, Persuasive Discourse
Skalicky, Stephen; Berger, Cynthia M.; Crossley, Scott A.; McNamara, Danielle S. – Advances in Language and Literary Studies, 2016
A corpus of 313 freshman college essays was analyzed in order to better understand the forms and functions of humor in academic writing. Human ratings of humor and wordplay were statistically aggregated using Factor Analysis to provide an overall "Humor" component score for each essay in the corpus. In addition, the essays were also…
Descriptors: Discourse Analysis, Academic Discourse, Humor, Writing (Composition)
Ahmadi, Alireza; Sadeghi, Elham – Language Assessment Quarterly, 2016
In the present study we investigated the effect of test format on oral performance in terms of test scores and discourse features (accuracy, fluency, and complexity). Moreover, we explored how the scores obtained on different test formats relate to such features. To this end, 23 Iranian EFL learners participated in three test formats of monologue,…
Descriptors: Oral Language, Comparative Analysis, Language Fluency, Accuracy
Pufpaff, Lisa A.; Clarke, Laura; Jones, Ruth E. – Mid-Western Educational Researcher, 2015
This paper addresses the effects of rater training on the rubric-based scoring of three preservice teacher candidate performance assessments. This project sought to evaluate the consistency of ratings assigned to student learning outcome measures being used for program accreditation and to explore the need for rater training in order to increase…
Descriptors: Evaluators, Interrater Reliability, Preservice Teachers, Scoring Rubrics