Showing all 13 results
Peer reviewed
Davis, Larry – Language Testing, 2016
Two factors were investigated that are thought to contribute to consistency in rater scoring judgments: rater training and experience in scoring. Also considered were the relative effects of scoring rubrics and exemplars on rater performance. Experienced teachers of English (N = 20) scored recorded responses from the TOEFL iBT speaking test prior…
Descriptors: Evaluators, Oral Language, Scores, Language Tests
Peer reviewed
Weigle, Sara Cushing – ETS Research Report Series, 2011
Automated scoring has the potential to dramatically reduce the time and costs associated with the assessment of complex skills such as writing, but its use must be validated against a variety of criteria for it to be accepted by test users and stakeholders. This study addresses two validity-related issues regarding the use of e-rater® with the…
Descriptors: Scoring, English (Second Language), Second Language Instruction, Automation
Peer reviewed
Ashwell, Tim; Elam, Jesse R. – JALT CALL Journal, 2017
The ultimate aim of our research project was to use the Google Web Speech API to automate scoring of elicited imitation (EI) tests. However, in order to achieve this goal, we had to take a number of preparatory steps. We needed to assess how accurate this speech recognition tool is in recognizing native speakers' production of the test items; we…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Language Tests
Peer reviewed
Yang, Hui-Chun – Language Assessment Quarterly, 2014
This study explores the construct of a summarization test task by means of single-group and multigroup structural equation modeling (SEM). It examines the interrelationships between strategy use and performance, drawing on data from 298 Taiwanese undergraduates' summary essays and their self-reported strategy use. Single-group SEM analyses…
Descriptors: Foreign Countries, Structural Equation Models, Writing Skills, Language Tests
Peer reviewed
Bratkovich, Meghan Odsliv – Working Papers in TESOL & Applied Linguistics, 2014
This study investigated the nature of self-assessment and blind peer- and teacher-assessment in L2 writing. The type of feedback students gave to themselves and peers, the type of feedback used in the revision process, and the source of the feedback used were all analyzed. Additionally, student perceptions of self- and peer-assessment, feedback,…
Descriptors: Student Evaluation, Evaluation Methods, Self Evaluation (Individuals), Peer Evaluation
Peer reviewed
Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012
Scoring models for the "e-rater"® system were built and evaluated for the "TOEFL"® exam's independent and integrated writing prompts. Prompt-specific and generic scoring models were built, and evaluation statistics, such as weighted kappas, Pearson correlations, standardized differences in mean scores, and correlations with…
Descriptors: Scoring, Prompting, Evaluators, Computer Software
Peer reviewed
Winke, Paula; Gass, Susan; Myford, Carol – ETS Research Report Series, 2011
This study investigated whether raters' second language (L2) background and the first language (L1) of test takers taking the TOEFL iBT® Speaking test were related through scoring. After an initial 4-hour training period, a group of 107 raters (mostly learners of Chinese, Korean, and Spanish) listened to a selection of 432 speech samples that…
Descriptors: Second Language Learning, Evaluators, Speech Tests, English (Second Language)
Peer reviewed
Jamieson, Joan; Poonpon, Kornwipa – ETS Research Report Series, 2013
Research and development of a new type of scoring rubric for the integrated speaking tasks of "TOEFL iBT"® are described. These "analytic rating guides" could be helpful if tasks modeled after those in TOEFL iBT were used for formative assessment, a purpose which is different from TOEFL iBT's primary use for admission…
Descriptors: Oral Language, Language Proficiency, Scaling, Scores
Peer reviewed
Jernigan, Justin – English Language Teaching, 2012
Swain's Output Hypothesis proposes a facilitative effect for output on the acquisition of second language morphosyntax. In the context of classroom instruction, a number of studies and reviews suggest that explicit instruction in pragmatic elements promotes development. Other studies have offered less conclusive evidence of the effectiveness of…
Descriptors: English (Second Language), Second Language Instruction, Second Language Learning, Instructional Effectiveness
Peer reviewed
Lee, Yong-Won; Gentile, Claudia; Kantor, Robert – ETS Research Report Series, 2008
The main purpose of the study was to investigate the distinctness and reliability of analytic (or multitrait) rating dimensions and their relationships to holistic scores and "e-rater"® essay feature variables in the context of the TOEFL® computer-based test (CBT) writing assessment. Data analyzed in the study were analytic and holistic…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Scoring
Peer reviewed
Wainer, Howard; Wang, Xiaohui – Journal of Educational Measurement, 2000
Modified the three-parameter model to include an additional random effect for items nested within the same testlet. Fitted the new model to 86 testlets from the Test of English as a Foreign Language (TOEFL) and compared standard parameters (discrimination, difficulty, and guessing) with those obtained through traditional modeling. Discusses the…
Descriptors: English (Second Language), Language Tests, Scoring, Statistical Analysis
Peer reviewed
Cumming, Alister; Kantor, Robert; Baba, Kyoko; Eouanzoui, Keanre; Erdosy, Usman; James, Mark – ETS Research Report Series, 2006
We assessed whether and how the discourse written for prototype integrated tasks (involving writing in response to print or audio source texts) field tested for the new TOEFL® differs from the discourse written for independent essays (i.e., the TOEFL essay). We selected 216 compositions written for 6 tasks by 36 examinees in a field…
Descriptors: Discourse Analysis, Essays, Scores, Language Proficiency
Peer reviewed
Des Brisay, Margaret – TESL Canada Journal, 1994
Data from the Canadian Test of English for Scholars and Trainees (CanTEST) are compared to data from the Test of English as a Foreign Language (TOEFL) to establish CanTEST as a valid admissions tool for English-as-a-Second-Language college applicants. Data are taken from four groups of examinees who took both tests. (eight references) (LR)
Descriptors: Admission Criteria, Comparative Analysis, Comparative Testing, Correlation