ERIC - Search Results

Publication Date

In 2026	0
Since 2025	3
Since 2022 (last 5 years)	22
Since 2017 (last 10 years)	61
Since 2007 (last 20 years)	104

Descriptor

Comparative Analysis	133
Second Language Learning	133
English (Second Language)	100
Second Language Instruction	71
Foreign Countries	66
Language Tests	61
Reliability	53
Test Reliability	44
Interrater Reliability	42
Language Proficiency	33
Scores	32
Correlation	27
Teaching Methods	26
Test Validity	24
College Students	21
Statistical Analysis	21
Validity	21
Evaluators	20
Computer Assisted Testing	19
Writing Evaluation	19
Pretests Posttests	17
Scoring	17
Computer Software	16
Essays	16
Language Teachers	16

Publication Type

Journal Articles	111
Reports - Research	104
Tests/Questionnaires	15
Reports - Evaluative	14
Information Analyses	8
Speeches/Meeting Papers	8
Reports - Descriptive	4
Books	3
Collected Works - General	2
Guides - Non-Classroom	2
Collected Works - Proceedings	1
Dissertations/Theses -…	1
Dissertations/Theses -…	1
Non-Print Media	1
Numerical/Quantitative Data	1
Reference Materials - General	1

Education Level

Higher Education	47
Postsecondary Education	39
Secondary Education	16
Elementary Education	13
High Schools	9
Elementary Secondary Education	5
Preschool Education	4
Early Childhood Education	3
Grade 8	3
Middle Schools	3
Adult Education	2
Grade 10	2
Grade 11	2
Grade 6	2
Intermediate Grades	2
Junior High Schools	2
Grade 12	1
Grade 4	1
Grade 7	1
Grade 9	1
Two Year Colleges	1

Audience

Teachers	3
Practitioners	2
Researchers	2
Administrators	1

Location

Iran	11
China	8
Turkey	7
Japan	4
Saudi Arabia	4
Canada	3
Hong Kong	3
Pakistan	3
Australia	2
Denmark	2
Egypt	2
Europe	2
Germany	2
Greece	2
Netherlands	2
Philippines	2
South Korea	2
Spain	2
Taiwan	2
Thailand	2
United States	2
Vietnam	2
Arizona	1
Asia	1
Austria	1

Laws, Policies, & Programs

No Child Left Behind Act 2001

Assessments and Surveys

Test of English as a Foreign…	8
International English…	3
SAT (College Admission Test)	2
ACTFL Oral Proficiency…	1
Constructivist Learning…	1
Dale Chall Readability Formula	1
English Proficiency Test	1
Expressive One Word Picture…	1
Flesch Kincaid Grade Level…	1
Flesch Reading Ease Formula	1
Fry Readability Formula	1
Graduate Management Admission…	1
Graduate Record Examinations	1
Mean Length of Utterance	1
Peabody Picture Vocabulary…	1
Strategy Inventory for…	1

Showing 1 to 15 of 133 results

Examining the Effect of Item Difficulty and Rater Leniency on Iranian Test Takers' Performance on WDCT and DSAT: A Comparative Study

Peer reviewed

Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025

The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…

Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction

Utilizing Large Language Models for EFL Essay Grading: An Examination of Reliability and Validity in Rubric-Based Assessments

Peer reviewed

Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025

This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics

Comparative Judgement for Evaluating Young Learners' EFL Writing Performances: Reliability and Teacher Perceptions of Holistic and Dimension-Based Judgements

Peer reviewed

Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025

Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

Effective Vocabulary Interventions for Young Emergent Bilinguals: A Best-Evidence Synthesis

Peer reviewed

Alain Bengochea; Sabrina F. Sembiante – Review of Education, 2024

This best-evidence synthesis appraises the design and outcome characteristics of vocabulary intervention studies conducted with preschool through 6th grade emergent bilingual (EB) children and spotlights rigorously designed studies for which effects could be better attributed to instructional features. Twenty-nine selected studies were analysed…

Descriptors: Bilingualism, Vocabulary Development, Intervention, Comparative Analysis

Estimating the Impact of Local Item Dependency in a Test of Second Language Reading Comprehension

Peer reviewed

Tim Stoeckel; Liang Ye Tan; Hung Tan Ha; Nam Thi Phuong Ho; Tomoko Ishii; Young Ae Kim; Chunmei Huang; Stuart McLean – Vocabulary Learning and Instruction, 2024

Local item dependency (LID) occurs when test-takers' responses to one test item are affected by their responses to another. It can be problematic if it causes inflated reliability estimates or distorted person and item measures. The cued-recall reading comprehension test in Hu and Nation's (2000) well-known and influential coverage--comprehension…

Descriptors: Reading Comprehension, English (Second Language), Second Language Instruction, Second Language Learning

The Intersection of AI and Language Assessment: A Study on the Reliability of ChatGPT in Grading IELTS Writing Task 2

Peer reviewed

Osama Koraishi – Language Teaching Research Quarterly, 2024

This study conducts a comprehensive quantitative evaluation of OpenAI's language model, ChatGPT 4, for grading Task 2 writing of the IELTS exam. The objective is to assess the alignment between ChatGPT's grading and that of official human raters. The analysis encompassed a multifaceted approach, including a comparison of means and reliability…

Descriptors: Second Language Learning, English (Second Language), Language Tests, Artificial Intelligence

Meta-Analysis of Inter-Rater Agreement and Discrepancy Between Human and Automated English Essay Scoring

Peer reviewed

Jiyeo Yun – English Teaching, 2023

Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…

Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring

Depth-Perception-Based Representation in Holistic Rating on ESL Essay Writing

Peer reviewed

Lian Li; Jiehui Hu; Yu Dai; Ping Zhou; Wanhong Zhang – Reading & Writing Quarterly, 2024

This paper proposes to use depth perception to represent raters' decision in holistic evaluation of ESL essays, as an alternative medium to conventional form of numerical scores. The researchers verified the new method's accuracy and inter/intra-rater reliability by inviting 24 ESL teachers to perform different representations when rating 60…

Descriptors: Essays, Holistic Approach, Writing Evaluation, Accuracy

Can Recall Data Be Trusted? Evaluating Reliability of Interview Data on Traditional Multilingualism in Highland Daghestan

Peer reviewed

Daniel, Michael; Koshevoy, Alexey; Schurov, Ilya; Dobrushina, Nina – Field Methods, 2022

In this article, we address the issue of reliability of quantitative data on multilingualism of the past obtained as recall data. More specifically, we investigate whether the interviewees' assessments of the language repertoires of their late relatives (indirect data) provide results that are quantitatively similar to those obtained from the…

Descriptors: Recall (Psychology), Multilingualism, Artificial Intelligence, Second Languages

Rubric Rating with MFRM versus Randomly Distributed Comparative Judgment: A Comparison of Two Approaches to Second-Language Writing Assessment

Peer reviewed

Sims, Maureen E.; Cox, Troy L.; Eckstein, Grant T.; Hartshorn, K. James; Wilcox, Matthew P.; Hart, Judson M. – Educational Measurement: Issues and Practice, 2020

The purpose of this study is to explore the reliability of a potentially more practical approach to direct writing assessment in the context of ESL writing. Traditional rubric rating (RR) is a common yet resource-intensive evaluation practice when performed reliably. This study compared the traditional rubric model of ESL writing assessment and…

Descriptors: Scoring Rubrics, Item Response Theory, Second Language Learning, English (Second Language)

Automated Assessment of Second Language Comprehensibility: Review, Training, Validation, and Generalization Studies

Peer reviewed

Saito, Kazuya; Macmillan, Konstantinos; Kachlicka, Magdalena; Kunihara, Takuya; Minematsu, Nobuaki – Studies in Second Language Acquisition, 2023

Whereas many scholars have emphasized the relative importance of "comprehensibility" as an ecologically valid goal for L2 speech training, testing, and development, eliciting listeners' judgments is time-consuming. Following calls for research on more efficient L2 speech rating methods in applied linguistics, and growing attention toward…

Descriptors: Second Language Learning, Second Language Instruction, Interrater Reliability, Speech Communication

The Effects of Multimodal Teaching on English Vocabulary Knowledge of Thai Primary School Students

Peer reviewed

Kasikarn Bansong; Somkiet Poopatwiboon; Apisak Sukying – Journal of Education and Learning, 2023

It is increasingly prevalent in digital learning and teaching strategies for discerning a global perspective on creating the student learning experience. Multimodality is an emergent phenomenon that may influence how digital learning is designed, especially during the COVID-19 pandemic in which immersive learning environments, such as a virtual…

Descriptors: Elementary School Students, English (Second Language), Second Language Learning, Second Language Instruction

Impacts of ChatGPT-Assisted Writing for EFL English Majors: Feasibility and Challenges

Peer reviewed

Chung-You Tsai; Yi-Ti Lin; Iain Kelsall Brown – Education and Information Technologies, 2024

To determine the impacts of using ChatGPT to assist English as a foreign language (EFL) English college majors in revising essays and the possibility of leading to higher scores and potentially causing unfairness. A prospective, double-blinded, paired-comparison study was conducted in Feb. 2023. A total of 44 students provided 44 original essays…

Descriptors: Artificial Intelligence, Computer Software, Technology Uses in Education, English (Second Language)

Can Didactic Audiovisual Translation Enhance Intercultural Learning through CALL? Validity and Reliability of a Students' Questionnaire

Peer reviewed

Pilar Rodríguez-Arancón; María Bobadilla-Pérez; Alberto Fernández-Costales – Journal for Multicultural Education, 2024

Purpose: This study aims to delve into the interplay between didactic audiovisual translation (DAT) and computer-assisted language learning (CALL), exploring their combined impact on the development of intercultural competence (IC) among learners of English as a foreign language (EFL). Design/methodology/approach: Using a quasi-experimental…

Descriptors: Translation, Teaching Methods, Second Language Learning, Second Language Instruction

Applying Generalizability Theory in Language Testing: Comparing Nested and Crossed Scoring Designs in the Assessment of Speaking Skills

Peer reviewed

Polat, Murat; Turhan, Nihan Sölpük – International Journal of Curriculum and Instruction, 2021

Scoring language learners' speaking skills is open to a number of measurement errors since raters' personal judgements could involve in the process. Different grading designs in which raters score a student's whole speaking skills or a specific dimension of the speaking performance could be settled to control and minimize the amount of the error…

Descriptors: Language Tests, Scoring, Speech Communication, State Universities

Source

Language Testing	14
English Language Teaching	7
ETS Research Report Series	4
Foreign Language Annals	4
Online Submission	4
Language Assessment Quarterly	3
Language Learning Journal	3
Assessing Writing	2
Cogent Education	2
English Teaching	2
International Journal of…	2
International Journal of…	2
International Journal of…	2
Journal of Language and…	2
Journal of Speech, Language,…	2
Language Testing in Asia	2
Language, Speech, and Hearing…	2
ReCALL	2
Research-publishing.net	2
Studies in Second Language…	2
System: An International…	2
TESOL International Journal	2
Working Papers in TESOL &…	2
Advances in Language and…	1
Arab World English Journal	1

Author

Coniam, David	3
Attali, Yigal	2
August, Diane	2
Cox, Troy L.	2
Kunnan, Antony John	2
Winke, Paula	2
Adams, R. J.	1
Ahmadi Shirazi, Masoumeh	1
Ahmadi, Alireza	1
Ahn, Jieun Irene	1
Ahour, Touran	1
Akinwamide, Timothy Kolade	1
Al-Hoorie, Ali H.	1
Alain Bengochea	1
Alberto Fernández-Costales	1
Alhaisoni, Eid	1
Alharthi, Saleh	1
Alsamadani, Hashem Ahmed	1
Alsree, Zubaida	1
Alt, Mary	1
Amin, Yadhi Nur	1
Apisak Sukying	1
Arani, Davood Khedmatkar	1
Ardasheva, Yuliya	1