Publication Date
In 2025 | 1 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 4 |
Since 2016 (last 10 years) | 7 |
Since 2006 (last 20 years) | 14 |
Descriptor
Source
Author
Publication Type
Tests/Questionnaires | 18 |
Reports - Research | 17 |
Journal Articles | 13 |
Speeches/Meeting Papers | 2 |
Reports - Evaluative | 1 |
Education Level
Higher Education | 7 |
Postsecondary Education | 6 |
Early Childhood Education | 1 |
Elementary Education | 1 |
Elementary Secondary Education | 1 |
High Schools | 1 |
Kindergarten | 1 |
Primary Education | 1 |
Secondary Education | 1 |
Audience
Researchers | 1 |
Laws, Policies, & Programs
Assessments and Surveys
Test of English as a Foreign… | 6 |
International English… | 1 |
National Assessment of… | 1 |
Test of English for… | 1 |
edTPA (Teacher Performance… | 1 |
What Works Clearinghouse Rating
Makiko Kato – Journal of Education and Learning, 2025
This study aims to examine whether differences exist in the factors influencing the difficulty of scoring English summaries and determining scores based on the raters' attributes, and to collect candid opinions, considerations, and tentative suggestions for future improvements to the analytic rubric of summary writing for English learners. In this…
Descriptors: Writing Evaluation, Scoring, Writing Skills, English (Second Language)
Yuko Hayashi; Yusuke Kondo; Yutaka Ishii – Innovation in Language Learning and Teaching, 2024
Purpose: This study builds a new system for automatically assessing learners' speech elicited from an oral discourse completion task (DCT), and evaluates the prediction capability of the system with a view to better understanding factors deemed influential in predicting speaking proficiency scores and the pedagogical implications of the system.…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Japanese
Lyness, Scott A.; Peterson, Kent; Yates, Kenneth – Education Sciences, 2021
The Performance Assessment for California Teachers (PACT) is a high stakes summative assessment that was designed to measure pre-service teacher readiness. We examined the inter-rater reliability (IRR) of trained PACT evaluators who rated 19 candidates. As measured by Cohen's weighted kappa, the overall IRR estimate was 0.17 (poor strength of…
Descriptors: High Stakes Tests, Performance Based Assessment, Teacher Effectiveness, Academic Language
Wang, Qiao – Education and Information Technologies, 2022
This study searched for open-source semantic similarity tools and evaluated their effectiveness in automated content scoring of fact-based essays written by English-as-a-Foreign-Language (EFL) learners. Fifty writing samples under a fact-based writing task from an academic English course in a Japanese university were collected and a gold standard…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring
Qi, Yi; Bell, Courtney A.; Jones, Nathan D.; Lewis, Jennifer M.; Witherspoon, Margaret W.; Redash, Amanda – ETS Research Report Series, 2018
Teacher observations are being used for high-stakes purposes in states across the country, and administrators often serve as raters in teacher evaluation systems. This paper examines how the cognitive aspects of administrators' use of an observation instrument, a modified version of Charlotte Danielson's Framework for Teaching, interact with the…
Descriptors: Teacher Evaluation, Classroom Observation Techniques, Observation, Evaluation Methods
Ahmadi Shirazi, Masoumeh – SAGE Open, 2019
Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…
Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests
Linlin, Cao – English Language Teaching, 2020
Through Many-Facet Rasch analysis, this study explores the rating differences between 1 computer automatic rater and 5 expert teacher raters on scoring 119 students in a computerized English listening-speaking test. Results indicate that both automatic and the teacher raters demonstrate good inter-rater reliability, though the automatic rater…
Descriptors: Language Tests, Computer Assisted Testing, English (Second Language), Second Language Learning
Manville, Catriona; Guthrie, Susan; Henham, Marie-Louise; Garrod, Bryn; Sousa, Sonia; Kirtley, Anne; Castle-Clarke, Sophie; Ling, Tom – RAND Europe, 2015
The Research Excellence Framework (REF) is a new system for assessing the quality of research in UK higher education institutions (HEIs). For the first time, part of the assessment included the wider impact of research. RAND Europe was commissioned to evaluate the assessment process of the impact element of REF submissions, and to explore the…
Descriptors: Foreign Countries, Research, Higher Education, Outcome Measures
Wei, Jing; Llosa, Lorena – Language Assessment Quarterly, 2015
This article reports on an investigation of the role raters' language background plays in raters' assessment of test takers' speaking ability. Specifically, this article examines differences between American and Indian raters in their scores and scoring processes when rating Indian test takers' responses to the Test of English as a Foreign…
Descriptors: North Americans, Indians, Evaluators, English (Second Language)
Crossley, Scott A.; Kyle, Kristopher; Allen, Laura K.; Guo, Liang; McNamara, Danielle S. – Grantee Submission, 2014
This study investigates the potential for linguistic microfeatures related to length, complexity, cohesion, relevance, topic, and rhetorical style to predict L2 writing proficiency. Computational indices were calculated by two automated text analysis tools (Coh- Metrix and the Writing Assessment Tool) and used to predict human essay ratings in a…
Descriptors: Computational Linguistics, Essays, Scoring, Writing Evaluation
Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela – Language Testing, 2012
Scores assigned by trained raters and by an automated scoring system (SpeechRater[TM]) on the speaking section of the TOEFL iBT[TM] were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…
Descriptors: Undergraduate Students, Speech Communication, Rating Scales, Scoring
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests
Winke, Paula; Gass, Susan; Myford, Carol – ETS Research Report Series, 2011
This study investigated whether raters' second language (L2) background and the first language (L1) of test takers taking the TOEFL iBT® Speaking test were related through scoring. After an initial 4-hour training period, a group of 107 raters (mostly of learners of Chinese, Korean, and Spanish), listened to a selection of 432 speech samples that…
Descriptors: Second Language Learning, Evaluators, Speech Tests, English (Second Language)
Crews, William E., Jr. – 1991
As part of a study of teacher evaluation of student replies to open-ended questions, a second question--the best method of determining interrater reliability--was examined. The standard method, the Pearson Product-Moment correlation, overestimated the degree of match between researchers' and teachers' scoring of tests. The simpler percent…
Descriptors: Comparative Analysis, Elementary School Teachers, Evaluation Methods, Evaluators
Hawk, Anne W.; Cross, James Logan – 1987
This study involved the selection and adaptation of a writing assessment procedure for teachers and researchers in the Duval County Public Schools (Florida) to use in assessing changes in writing ability among elementary grade students. Through a review of the literature, four writing assessment procedures (analytic, holistic, focused holistic,…
Descriptors: Elementary Education, Elementary School Teachers, Evaluators, Holistic Evaluation
Previous Page | Next Page »
Pages: 1 | 2