Hayashi, Yuko; Kondo, Yusuke; Ishii, Yutaka – Innovation in Language Learning and Teaching, 2024
Purpose: This study builds a new system for automatically assessing learners' speech elicited from an oral discourse completion task (DCT), and evaluates the prediction capability of the system with a view to better understanding factors deemed influential in predicting speaking proficiency scores and the pedagogical implications of the system.…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Japanese
Gu, Lin; Davis, Larry; Tao, Jacob; Zechner, Klaus – Assessment in Education: Principles, Policy & Practice, 2021
Recent technology advancements have increased the prospects for automated spoken language technology to provide feedback on speaking performance. In this study we examined user perceptions of using an automated feedback system for preparing for the TOEFL iBT® test. Test takers and language teachers evaluated three types of machine-generated…
Descriptors: Audio Equipment, Test Preparation, Feedback (Response), Scores
Linlin, Cao – English Language Teaching, 2020
Through Many-Facet Rasch analysis, this study explores the rating differences between 1 computer automatic rater and 5 expert teacher raters on scoring 119 students in a computerized English listening-speaking test. Results indicate that both the automatic rater and the teacher raters demonstrate good inter-rater reliability, though the automatic rater…
Descriptors: Language Tests, Computer Assisted Testing, English (Second Language), Second Language Learning
Huang, Becky; Alegre, Analucia; Eisenberg, Ann – Language Assessment Quarterly, 2016
The project aimed to examine the effect of raters' familiarity with accents on their judgments of non-native speech. Participants included three groups of raters who were either from Spanish Heritage, Spanish Non-Heritage, or Chinese Heritage backgrounds (n = 16 in each group) using Winke & Gass's (2013) definition of a heritage learner as…
Descriptors: Contrastive Linguistics, Evaluators, Chinese, Spanish
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests
Lee, Heather A. – Journal of Research on Christian Education, 2015
If Christian schools desire students to achieve higher-level thinking, then the textbooks that teachers use should reflect such thinking. Using Risner's (1987) methodology, raters classified questions from two Christian publishers' fifth grade reading textbooks based on the revised Bloom's taxonomy (Anderson et al., 2001). The questions in the A…
Descriptors: Religious Education, Christianity, Textbooks, Thinking Skills
Hoang, Giang Thi Linh; Kunnan, Antony John – Language Assessment Quarterly, 2016
Computer technology made its way into writing instruction and assessment with spelling and grammar checkers decades ago, but more recently it has done so with automated essay evaluation (AEE) and diagnostic feedback. And although many programs and tools have been developed in the last decade, not enough research has been conducted to support or…
Descriptors: Case Studies, Essays, Writing Evaluation, English (Second Language)
Jeong, Heejeong – Language Testing, 2013
Language assessment courses (LACs) are taught by professionals who have majored in the area of language testing (language testers or LTs), but also by others who come from different language-related majors (non-language testers, non-LTs). Different language assessment courses may be developed, depending on who teaches the course and the…
Descriptors: Language Tests, Courses, Teacher Education, Teacher Educators
Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela – Language Testing, 2012
Scores assigned by trained raters and by an automated scoring system (SpeechRater™) on the speaking section of the TOEFL iBT™ were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…
Descriptors: Undergraduate Students, Speech Communication, Rating Scales, Scoring
Goldberg, Gail Lynn; Kapinus, Barbara – 1992
The Maryland School Performance Assessment Program (MSPAP) is a relatively new, statewide performance assessment of students in grades 3, 5, and 8. When first administered in May of 1991, the MSPAP included a battery of performance assessment tasks designed to generate written or drawn responses to reading texts. This study evaluated selected…
Descriptors: Comparative Testing, Elementary Education, Elementary School Teachers, Evaluators
Crews, William E., Jr. – 1991
As part of a study of teacher evaluation of student replies to open-ended questions, a second question--the best method of determining interrater reliability--was examined. The standard method, the Pearson Product-Moment correlation, overestimated the degree of match between researchers' and teachers' scoring of tests. The simpler percent…
Descriptors: Comparative Analysis, Elementary School Teachers, Evaluation Methods, Evaluators
Arnold, Voiza; And Others – 1990
In 1990, a study was conducted at Rio Hondo College (Whittier, California) to determine if readers exhibited any bias in scoring test papers that were composed on a word processor as opposed to being written by hand. The study began with the formulation of tentative pilot study questions and the development of procedures to address them. Three…
Descriptors: Bias, Community Colleges, Evaluators, Handwriting
Teddlie, Charles; And Others – 1990
The results are provided of an initial analysis of the reliability (generalizability) of the System for Teaching and Learning Assessment and Review (STAR) as a comprehensive measure of classroom teaching and learning for making teacher certification decisions. The STAR contains 140 indicators of teacher effectiveness and student learning, which…
Descriptors: Beginning Teachers, Classroom Observation Techniques, Elementary School Teachers, Elementary Secondary Education
Xi, Xiaoming; Mollaun, Pam – ETS Research Report Series, 2006
This study explores the utility of analytic scoring for the TOEFL® Academic Speaking Test (TAST) in providing useful and reliable diagnostic information in three aspects of candidates' performance: delivery, language use, and topic development. G studies were used to investigate the dependability of the analytic scores, the distinctness of the…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Oral Language
Shiflett, Samuel; And Others – 1985
A study was undertaken to improve the measurement of small team performance within the Army. A provisional taxonomy of team-level performance functions was field-validated; criteria and measures of the functions were developed; and their reliability was examined. The provisional taxonomy, used for observing Army field training exercises, was used…
Descriptors: Behavior Rating Scales, Classification, Evaluation Criteria, Evaluators