NotesFAQContact Us
Collection
Advanced
Search Tips
Audience
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 1 to 15 of 49 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Jin, Kuan-Yu; Eckes, Thomas – Measurement: Interdisciplinary Research and Perspectives, 2022
Recent research on rater effects in performance assessments has increasingly focused on rater centrality, the tendency to assign scores clustering around the rating scale's middle categories. In the present paper, we adopted Jin and Wang's (2018) extended facets modeling approach and constructed a centrality continuum, ranging from raters…
Descriptors: Performance Based Assessment, Evaluators, Scoring, Sample Size
Peer reviewed Peer reviewed
Direct linkDirect link
Lestari, Santi B.; Brunfaut, Tineke – Language Testing, 2023
Assessing integrated reading-into-writing task performances is known to be challenging, and analytic rating scales have been found to better facilitate the scoring of these performances than other common types of rating scales. However, little is known about how specific operationalizations of the reading-into-writing construct in analytic rating…
Descriptors: Reading Writing Relationship, Writing Tests, Rating Scales, Writing Processes
Peer reviewed Peer reviewed
Direct linkDirect link
Huang, Jing; Chen, Gaowei – AERA Online Paper Repository, 2019
This research investigates the effects of rater experience on performance ratings in language testing using a systematic review of studies published from 1985 to 2017. Based on a comprehensive literature search of 14 databases, we identified sixteen relevant papers. With these we conducted a narrative review to conceptualize a theoretical…
Descriptors: Language Tests, Experience, Evaluators, Performance Based Assessment
Peer reviewed Peer reviewed
Direct linkDirect link
Choi, Ikkyu; Wolfe, Edward W. – Applied Measurement in Education, 2020
Rater training is essential in ensuring the quality of constructed response scoring. Most of the current knowledge about rater training comes from experimental contexts with an emphasis on short-term effects. Few sources are available for empirical evidence on whether and how raters become more accurate as they gain scoring experiences or what…
Descriptors: Scoring, Accuracy, Training, Evaluators
Peer reviewed Peer reviewed
Direct linkDirect link
Yuko Hayashi; Yusuke Kondo; Yutaka Ishii – Innovation in Language Learning and Teaching, 2024
Purpose: This study builds a new system for automatically assessing learners' speech elicited from an oral discourse completion task (DCT), and evaluates the prediction capability of the system with a view to better understanding factors deemed influential in predicting speaking proficiency scores and the pedagogical implications of the system.…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Japanese
Peer reviewed Peer reviewed
Direct linkDirect link
Heidari, Nasim; Ghanbari, Nasim; Abbasi, Abbas – Language Testing in Asia, 2022
It is widely believed that human rating performance is influenced by an array of different factors. Among these, rater-related variables such as experience, language background, perceptions, and attitudes have been mentioned. One of the important rater-related factors is the way the raters interact with the rating scales. In particular, how raters…
Descriptors: Evaluators, Rating Scales, Language Tests, English (Second Language)
Peer reviewed Peer reviewed
Direct linkDirect link
Cox, Troy L.; Brown, Alan V.; Thompson, Gregory L. – Language Testing, 2023
The rating of proficiency tests that use the Inter-agency Roundtable (ILR) and American Council on the Teaching of Foreign Languages (ACTFL) guidelines claims that each major level is based on hierarchal linguistic functions that require mastery of multidimensional traits in such a way that each level subsumes the levels beneath it. These…
Descriptors: Oral Language, Language Fluency, Scoring, Cues
Peer reviewed Peer reviewed
Direct linkDirect link
Finn, Bridgid; Arslan, Burcu; Walsh, Matthew – Applied Measurement in Education, 2020
To score an essay response, raters draw on previously trained skills and knowledge about the underlying rubric and score criterion. Cognitive processes such as remembering, forgetting, and skill decay likely influence rater performance. To investigate how forgetting influences scoring, we evaluated raters' scoring accuracy on TOEFL and GRE essays.…
Descriptors: Epistemology, Essay Tests, Evaluators, Cognitive Processes
Peer reviewed Peer reviewed
Direct linkDirect link
Li, Shuai; Wen, Ting; Li, Xian; Feng, Yali; Lin, Chuan – Language Testing, 2023
This study compared holistic and analytic marking methods for their effects on parameter estimation (of examinees, raters, and items) and rater cognition in assessing speech act production in L2 Chinese. Seventy American learners of Chinese completed an oral Discourse Completion Test assessing requests and refusals. Four first-language (L1)…
Descriptors: Speech Acts, Second Language Learning, Second Language Instruction, Chinese
Peer reviewed Peer reviewed
Direct linkDirect link
Ma, Wenyue; Winke, Paula – Language Assessment Quarterly, 2022
The factors that influence rater scoring have been a subject of great interest to researchers in second language assessment. However, the research on the impact of test-takers' speech profiles (e.g., a jagged or a flat profile reflecting analytic subscores) on raters' scoring behaviors remains to be seen. To investigate the role of speech profiles…
Descriptors: Language Tests, Second Language Learning, Speech Communication, Profiles
Peer reviewed Peer reviewed
Direct linkDirect link
Han, Chao; Lu, Xiaolei – Computer Assisted Language Learning, 2023
The use of translation and interpreting (T&I) in the language learning classroom is commonplace, serving various pedagogical and assessment purposes. Previous utilization of T&I exercises is driven largely by their potential to enhance language learning, whereas the latest trend has begun to underscore T&I as a crucial skill to be…
Descriptors: Translation, Computational Linguistics, Correlation, Language Processing
Peer reviewed Peer reviewed
Direct linkDirect link
Xu, Jing; Jones, Edmund; Laxton, Victoria; Galaczi, Evelina – Assessment in Education: Principles, Policy & Practice, 2021
Recent advances in machine learning have made automated scoring of learner speech widespread, and yet validation research that provides support for applying automated scoring technology to assessment is still in its infancy. Both the educational measurement and language assessment communities have called for greater transparency in describing…
Descriptors: Second Language Learning, Second Language Instruction, English (Second Language), Computer Software
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Thai, Thuy; Sheehan, Susan – Language Education & Assessment, 2022
In language performance tests, raters are important as their scoring decisions determine which aspects of performance the scores represent; however, raters are considered as one of the potential sources contributing to unwanted variability in scores (Davis, 2012). Although a great number of studies have been conducted to unpack how rater…
Descriptors: Rating Scales, Speech Communication, Second Language Learning, Second Language Instruction
Peer reviewed Peer reviewed
Direct linkDirect link
Hsu, Tammy Huei-Lien – Language Testing in Asia, 2019
Background: A strong interest in researching World Englishes (WE) in relation to language assessment has become an emerging theme in language assessment studies over the past two decades. While research on WE has highlighted the status, function, and legitimacy of varieties of English language, it remains unclear how raters respond to the results…
Descriptors: Language Attitudes, Language Variation, Language Tests, Second Language Learning
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Ahmadi, Alireza – Taiwan Journal of TESOL, 2020
Rater subjectivity has long been an intriguing topic. The use of discussion as a resolution method is a practical way to reduce this subjectivity. However, the efficacy of discussion depends on whether different raters get equally engaged in it or one rater tends to dominate others. This study investigated whether and how rater dominance occurs in…
Descriptors: Evaluators, Interrater Reliability, Discussion, Discourse Analysis
Previous Page | Next Page ยป
Pages: 1  |  2  |  3  |  4