Sadeghi, Karim; Bakhshi, Neda – International Journal of Language Testing, 2025
Assessing language skills in an integrative form has drawn the attention of assessment experts in recent years. While some research exists on integrative listening/reading-to-write assessment, there is comparatively little research on listening-to-speak integrated assessment. Also, little attention has been devoted to the role of…
Descriptors: Language Tests, Second Language Learning, English (Second Language), Computer Assisted Testing
Choi, Jin Soo – Applied Language Learning, 2021
This study examined the impact of manipulated task complexity (Robinson, 2001a, 2001b, 2007, 2011; Robinson & Gilabert, 2007) on second language (L2) speech comprehensibility. I examined whether manipulated task complexity (a) impacts L2 speech comprehensibility, (b) aligns with L2 speakers' perception of task difficulty (cognitive…
Descriptors: Task Analysis, Second Language Learning, Second Language Instruction, Pronunciation
Ahmadi Shirazi, Masoumeh – SAGE Open, 2019
Threats to construct validity should be reduced to a minimum. To achieve this, sources of bias, namely raters, items, and tests, as well as gender, age, race, language background, culture, and socio-economic status, need to be identified and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…
Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests
Gu, Lin; Davis, Larry; Tao, Jacob; Zechner, Klaus – Assessment in Education: Principles, Policy & Practice, 2021
Recent advances in technology have increased the prospects for automated spoken language technology to provide feedback on speaking performance. In this study, we examined user perceptions of an automated feedback system for preparing for the TOEFL iBT® test. Test takers and language teachers evaluated three types of machine-generated…
Descriptors: Audio Equipment, Test Preparation, Feedback (Response), Scores
Davis, Larry – Language Testing, 2016
Two factors thought to contribute to consistency in rater scoring judgments were investigated: rater training and scoring experience. Also considered were the relative effects of scoring rubrics and exemplars on rater performance. Experienced teachers of English (N = 20) scored recorded responses from the TOEFL iBT speaking test prior…
Descriptors: Evaluators, Oral Language, Scores, Language Tests
Negishi, Junko – Journal of Pan-Pacific Association of Applied Linguistics, 2015
This study considers trained raters' assessment of L2 English learners in paired and group oral assessments, compared with an individual monologue assessment, to determine (1) the degree to which raters assign pairs/groups shared (the same) scores and the degree to which raters give individual members of pairs/groups higher or lower as…
Descriptors: Evaluators, English (Second Language), Second Language Learning, Scores
Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015
The "e-rater"® automated essay scoring system is used operationally in the scoring of "TOEFL iBT"® independent and integrated tasks. In this study we explored the psychometric added value of reporting four trait scores for each of these two tasks, beyond the total e-rater score.The four trait scores are word choice, grammatical…
Descriptors: Writing Tests, Scores, Language Tests, English (Second Language)
Kang, Okim; Vo, Son Ca Thanh; Moran, Meghan Kerry – TESL-EJ, 2016
Research in second language speech has often focused on listeners' accent judgment and factors that affect their perception. However, the topic of listeners' application of specific sound categories in their own perceptual judgments has not been widely investigated. The current study explored how listeners from diverse language backgrounds weighed…
Descriptors: Pronunciation, Phonology, English (Second Language), Second Language Learning
Crossley, Scott; Clevinger, Amanda; Kim, YouJin – Language Assessment Quarterly, 2014
There has been a growing interest in the use of integrated tasks in the field of second language testing to enhance the authenticity of language tests. However, the role of text integration in test takers' performance has not been widely investigated. The purpose of the current study is to examine the effects of text-based relational (i.e.,…
Descriptors: Language Proficiency, Connected Discourse, Language Tests, English (Second Language)
Blanchard, Daniel; Tetreault, Joel; Higgins, Derrick; Cahill, Aoife; Chodorow, Martin – ETS Research Report Series, 2013
This report presents the development of a new corpus of non-native English writing that will be useful for native language identification, as well as for grammatical error detection and correction and automated essay scoring. The corpus is described in detail.
Descriptors: Language Tests, Second Language Learning, English (Second Language), Writing Tests
Davis, Lawrence Edward – ProQuest LLC, 2012
Speaking performance tests typically employ raters to produce scores; accordingly, variability in raters' scoring decisions has important consequences for test reliability and validity. One such source of variability is the rater's level of expertise in scoring. Therefore, it is important to understand how raters' performance is influenced by…
Descriptors: Evaluators, Expertise, Scores, Second Language Learning
Jamieson, Joan; Poonpon, Kornwipa – ETS Research Report Series, 2013
Research and development of a new type of scoring rubric for the integrated speaking tasks of the TOEFL iBT® are described. These "analytic rating guides" could be helpful if tasks modeled after those in the TOEFL iBT were used for formative assessment, a purpose different from the TOEFL iBT's primary use for admission…
Descriptors: Oral Language, Language Proficiency, Scaling, Scores
Xi, Xiaoming – Language Testing, 2007
This study explores the utility of analytic scoring for the TOEFL Academic Speaking Test (TAST) in providing useful and reliable diagnostic information for operational use in three aspects of candidates' performance: delivery, language use, and topic development. One hundred and forty examinees' responses to six TAST tasks were scored analytically on these three aspects of speech. G…
Descriptors: Scoring, Profiles, Performance Based Assessment, Academic Discourse
Xi, Xiaoming; Mollaun, Pam – ETS Research Report Series, 2006
This study explores the utility of analytic scoring for the TOEFL® Academic Speaking Test (TAST) in providing useful and reliable diagnostic information in three aspects of candidates' performance: delivery, language use, and topic development. G studies were used to investigate the dependability of the analytic scores, the distinctness of the…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Oral Language