ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	10
Since 2016 (last 10 years)	22
Since 2006 (last 20 years)	45

Descriptor

Evaluators	49
Language Tests	49
Scoring	49
Second Language Learning	41
English (Second Language)	37
Computer Assisted Testing	17
Foreign Countries	16
Correlation	15
Interrater Reliability	15
Scores	14
Language Proficiency	12
Oral Language	12
Comparative Analysis	11
Evaluation Criteria	11
Rating Scales	11
Computer Software	10
Essays	10
Second Language Instruction	10
Speech Communication	10
Accuracy	9
Native Language	8
Writing Evaluation	8
Writing Tests	8
Evaluation Methods	7
Cues	6

Publication Type

Journal Articles	41
Reports - Research	41
Tests/Questionnaires	9
Speeches/Meeting Papers	4
Dissertations/Theses -…	3
Information Analyses	2
Reports - Descriptive	2
Reports - Evaluative	2

Education Level

Higher Education	10
Postsecondary Education	8
High Schools	2
Secondary Education	2

Location

China	5
India	3
Iran	3
Europe	2
California	1
Germany	1
Japan	1
Japan (Tokyo)	1
Michigan	1
South Korea	1
Switzerland	1
Vietnam	1

Assessments and Surveys

Test of English as a Foreign…	20
International English…	5
Test of English for…	2
ACTFL Oral Proficiency…	1
Alabama High School…	1
Graduate Record Examinations	1

Showing 1 to 15 of 49 results

Detecting Rater Centrality Effects in Performance Assessments: A Model-Based Comparison of Centrality Indices

Peer reviewed

Jin, Kuan-Yu; Eckes, Thomas – Measurement: Interdisciplinary Research and Perspectives, 2022

Recent research on rater effects in performance assessments has increasingly focused on rater centrality, the tendency to assign scores clustering around the rating scale's middle categories. In the present paper, we adopted Jin and Wang's (2018) extended facets modeling approach and constructed a centrality continuum, ranging from raters…

Descriptors: Performance Based Assessment, Evaluators, Scoring, Sample Size

Operationalizing the Reading-into-Writing Construct in Analytic Rating Scales: Effects of Different Approaches on Rating

Peer reviewed

Lestari, Santi B.; Brunfaut, Tineke – Language Testing, 2023

Assessing integrated reading-into-writing task performances is known to be challenging, and analytic rating scales have been found to better facilitate the scoring of these performances than other common types of rating scales. However, little is known about how specific operationalizations of the reading-into-writing construct in analytic rating…

Descriptors: Reading Writing Relationship, Writing Tests, Rating Scales, Writing Processes

The Relationship between Rater Experience and Performance Ratings: A Systematic Review

Peer reviewed

Huang, Jing; Chen, Gaowei – AERA Online Paper Repository, 2019

This research investigates the effects of rater experience on performance ratings in language testing using a systematic review of studies published from 1985 to 2017. Based on a comprehensive literature search of 14 databases, we identified sixteen relevant papers. With these we conducted a narrative review to conceptualize a theoretical…

Descriptors: Language Tests, Experience, Evaluators, Performance Based Assessment

The Impact of Operational Scoring Experience and Additional Mentored Training on Raters' Essay Scoring Accuracy

Peer reviewed

Choi, Ikkyu; Wolfe, Edward W. – Applied Measurement in Education, 2020

Rater training is essential in ensuring the quality of constructed response scoring. Most of the current knowledge about rater training comes from experimental contexts with an emphasis on short-term effects. Few sources are available for empirical evidence on whether and how raters become more accurate as they gain scoring experiences or what…

Descriptors: Scoring, Accuracy, Training, Evaluators

Automated Speech Scoring of Dialogue Response by Japanese Learners of English as a Foreign Language

Peer reviewed

Yuko Hayashi; Yusuke Kondo; Yutaka Ishii – Innovation in Language Learning and Teaching, 2024

Purpose: This study builds a new system for automatically assessing learners' speech elicited from an oral discourse completion task (DCT), and evaluates the prediction capability of the system with a view to better understanding factors deemed influential in predicting speaking proficiency scores and the pedagogical implications of the system.…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Japanese

Raters' Perceptions of Rating Scales Criteria and Its Effect on the Process and Outcome of Their Rating

Peer reviewed

Heidari, Nasim; Ghanbari, Nasim; Abbasi, Abbas – Language Testing in Asia, 2022

It is widely believed that human rating performance is influenced by an array of different factors. Among these, rater-related variables such as experience, language background, perceptions, and attitudes have been mentioned. One of the important rater-related factors is the way the raters interact with the rating scales. In particular, how raters…

Descriptors: Evaluators, Rating Scales, Language Tests, English (Second Language)

Temporal Fluency and Floor/Ceiling Scoring of Intermediate and Advanced Speech on the ACTFL Spanish Oral Proficiency Interview--Computer

Peer reviewed

Cox, Troy L.; Brown, Alan V.; Thompson, Gregory L. – Language Testing, 2023

The rating of proficiency tests that use the Inter-agency Roundtable (ILR) and American Council on the Teaching of Foreign Languages (ACTFL) guidelines claims that each major level is based on hierarchal linguistic functions that require mastery of multidimensional traits in such a way that each level subsumes the levels beneath it. These…

Descriptors: Oral Language, Language Fluency, Scoring, Cues

Applying Cognitive Theory to the Human Essay Rating Process

Peer reviewed

Finn, Bridgid; Arslan, Burcu; Walsh, Matthew – Applied Measurement in Education, 2020

To score an essay response, raters draw on previously trained skills and knowledge about the underlying rubric and score criterion. Cognitive processes such as remembering, forgetting, and skill decay likely influence rater performance. To investigate how forgetting influences scoring, we evaluated raters' scoring accuracy on TOEFL and GRE essays.…

Descriptors: Epistemology, Essay Tests, Evaluators, Cognitive Processes

Comparing Holistic and Analytic Marking Methods in Assessing Speech Act Production in L2 Chinese

Peer reviewed

Li, Shuai; Wen, Ting; Li, Xian; Feng, Yali; Lin, Chuan – Language Testing, 2023

This study compared holistic and analytic marking methods for their effects on parameter estimation (of examinees, raters, and items) and rater cognition in assessing speech act production in L2 Chinese. Seventy American learners of Chinese completed an oral Discourse Completion Test assessing requests and refusals. Four first-language (L1)…

Descriptors: Speech Acts, Second Language Learning, Second Language Instruction, Chinese

An Investigation of the Impact of Jagged Profile on L2 Speaking Test Ratings: Evidence from Rating and Eye-Tracking Data

Peer reviewed

Ma, Wenyue; Winke, Paula – Language Assessment Quarterly, 2022

The factors that influence rater scoring have been a subject of great interest to researchers in second language assessment. However, the research on the impact of test-takers' speech profiles (e.g., a jagged or a flat profile reflecting analytic subscores) on raters' scoring behaviors remains to be seen. To investigate the role of speech profiles…

Descriptors: Language Tests, Second Language Learning, Speech Communication, Profiles

Can Automated Machine Translation Evaluation Metrics Be Used to Assess Students' Interpretation in the Language Learning Classroom?

Peer reviewed

Han, Chao; Lu, Xiaolei – Computer Assisted Language Learning, 2023

The use of translation and interpreting (T&I) in the language learning classroom is commonplace, serving various pedagogical and assessment purposes. Previous utilization of T&I exercises is driven largely by their potential to enhance language learning, whereas the latest trend has begun to underscore T&I as a crucial skill to be…

Descriptors: Translation, Computational Linguistics, Correlation, Language Processing

Assessing L2 English Speaking Using Automated Scoring Technology: Examining Automarker Reliability

Peer reviewed

Xu, Jing; Jones, Edmund; Laxton, Victoria; Galaczi, Evelina – Assessment in Education: Principles, Policy & Practice, 2021

Recent advances in machine learning have made automated scoring of learner speech widespread, and yet validation research that provides support for applying automated scoring technology to assessment is still in its infancy. Both the educational measurement and language assessment communities have called for greater transparency in describing…

Descriptors: Second Language Learning, Second Language Instruction, English (Second Language), Computer Software

The Processes of Rating L2 Speaking Performance Using an Analytic Rating Scale -- A Qualitative Exploration

Peer reviewed

Thai, Thuy; Sheehan, Susan – Language Education & Assessment, 2022

In language performance tests, raters are important as their scoring decisions determine which aspects of performance the scores represent; however, raters are considered as one of the potential sources contributing to unwanted variability in scores (Davis, 2012). Although a great number of studies have been conducted to unpack how rater…

Descriptors: Rating Scales, Speech Communication, Second Language Learning, Second Language Instruction

Rater Attitude towards Emerging Varieties of English: A New Rater Effect?

Peer reviewed

Hsu, Tammy Huei-Lien – Language Testing in Asia, 2019

Background: A strong interest in researching World Englishes (WE) in relation to language assessment has become an emerging theme in language assessment studies over the past two decades. While research on WE has highlighted the status, function, and legitimacy of varieties of English language, it remains unclear how raters respond to the results…

Descriptors: Language Attitudes, Language Variation, Language Tests, Second Language Learning

Rater Dominance in Discussion as a Resolution Method

Peer reviewed

Ahmadi, Alireza – Taiwan Journal of TESOL, 2020

Rater subjectivity has long been an intriguing topic. The use of discussion as a resolution method is a practical way to reduce this subjectivity. However, the efficacy of discussion depends on whether different raters get equally engaged in it or one rater tends to dominate others. This study investigated whether and how rater dominance occurs in…

Descriptors: Evaluators, Interrater Reliability, Discussion, Discourse Analysis

Source

Language Testing	13
ETS Research Report Series	9
ProQuest LLC	3
Applied Measurement in…	2
Language Assessment Quarterly	2
Language Testing in Asia	2
AERA Online Paper Repository	1
Assessment in Education:…	1
Computer Assisted Language…	1
English Language Teaching	1
Grantee Submission	1
Innovation in Language…	1
JALT CALL Journal	1
Journal of Pan-Pacific…	1
Language Education &…	1
Language Learning	1
Measurement:…	1
Research-publishing.net	1
SAGE Open	1
Taiwan Journal of TESOL	1
World Englishes	1

Author

Xi, Xiaoming	4
Bejar, Isaac I.	2
Bridgeman, Brent	2
Eckes, Thomas	2
Mollaun, Pam	2
Mollaun, Pamela	2
Winke, Paula	2
Abbasi, Abbas	1
Ahmadi Shirazi, Masoumeh	1
Ahmadi, Alireza	1
Allen, Laura K.	1
Arslan, Burcu	1
Attali, Yigal	1
Barkaoui, Khaled	1
Blanchard, Daniel	1
Breyer, F. Jay	1
Brooks, Rachel Lunde	1
Brown, Alan V.	1
Brown, Anne	1
Brunfaut, Tineke	1
Cahill, Aoife	1
Carey, Michael D.	1
Casabianca, Jodi M.	1
Chen, Gaowei	1
Chodorow, Martin	1