ERIC - Search Results

Showing 1 to 15 of 20 results

A Comprehensive Review of Rasch Measurement in Language Assessment: Recommendations and Guidelines for Research

Peer reviewed

Aryadoust, Vahid; Ng, Li Ying; Sayama, Hiroki – Language Testing, 2021

Over the past decades, the application of Rasch measurement in language assessment has gradually increased. In the present study, we coded 215 papers using Rasch measurement published in 21 applied linguistics journals for multiple features. We found that seven Rasch models and 23 software packages were adopted in these papers, with many-facet…

Descriptors: Language Tests, Testing, Test Items, Network Analysis

Using Data Preprocessing Techniques and Machine Learning Algorithms to Explore Predictors of Word Difficulty in English Language Assessment

Mingying Zheng – ProQuest LLC, 2024

The digital transformation in educational assessment has led to the proliferation of large-scale data, offering unprecedented opportunities to enhance language learning and testing through machine learning (ML) techniques. Drawing on the extensive data generated by online English language assessments, this dissertation investigates the efficacy…

Descriptors: Artificial Intelligence, Computational Linguistics, Language Tests, English (Second Language)

Calibrated Parsing Items Evaluation: A Step towards Objectifying the Translation Assessment

Peer reviewed

Akbari, Alireza; Shahnazari, Mohammadtaghi – Language Testing in Asia, 2019

The present research paper introduces a translation evaluation method called Calibrated Parsing Items Evaluation (CPIE hereafter). This evaluation method maximizes translators' performance through identifying the parsing items with an optimal p-docimology and d-index (item discrimination). This method checks all the possible parses (annotations)…

Descriptors: Test Items, Translation, Computer Software, Evaluators

Evaluation of Automated Vocabulary Quiz Generation with VocQGen

Peer reviewed
PDF on ERIC

Qiao Wang; Ralph L. Rose; Ayaka Sugawara; Naho Orita – Vocabulary Learning and Instruction, 2025

VocQGen is an automated tool designed to generate multiple-choice cloze (MCC) questions for vocabulary assessment in second language learning contexts. It leverages several natural language processing (NLP) tools and OpenAI's GPT-4 model to produce MCC items quickly from user-specified word lists. To evaluate its effectiveness, we used the first…

Descriptors: Vocabulary Skills, Artificial Intelligence, Computer Software, Multiple Choice Tests

A Cognitive Diagnostic Assessment Study of the Reading Comprehension Section of the Preliminary English Test (PET)

Peer reviewed
PDF on ERIC

Mohammed, Aisha; Dawood, Abdul Kareem Shareef; Alghazali, Tawfeeq; Kadhim, Qasim Khlaif; Sabti, Ahmed Abdulateef; Sabit, Shaker Holh – International Journal of Language Testing, 2023

Cognitive diagnostic models (CDMs) have received much interest within the field of language testing over the last decade due to their great potential to provide diagnostic feedback to all stakeholders and ultimately improve language teaching and learning. A large number of studies have demonstrated the application of CDMs on advanced large-scale…

Descriptors: Reading Comprehension, Reading Tests, Language Tests, English (Second Language)

A Comparability Study of Text Difficulty and Task Characteristics of Parallel Academic IELTS Reading Tests

Peer reviewed
PDF on ERIC

Liao, Linyu – English Language Teaching, 2020

As a high-stakes standardized test, IELTS is expected to have comparable forms of test papers so that test takers from different test administrations on different dates receive comparable test scores. Therefore, this study examined the text difficulty and task characteristics of four parallel academic IELTS reading tests to reveal to what extent…

Descriptors: Second Language Learning, English (Second Language), Language Tests, High Stakes Tests

SARM: A Computer Program for Estimating Speed-Accuracy Response Models for Dichotomous Items. Research Report. ETS RR-18-15

Peer reviewed
PDF on ERIC

van Rijn, Peter W.; Ali, Usama S. – ETS Research Report Series, 2018

A computer program was developed to estimate speed-accuracy response models for dichotomous items. This report describes how the models are estimated and how to specify data and input files. An example using data from a listening section of an international language test is described to illustrate the modeling approach and features of the computer…

Descriptors: Computer Software, Computation, Reaction Time, Timed Tests

Assessing Rasch Measurement Estimation Methods across R Packages with Yes/No Vocabulary Test Data

Peer reviewed

Nicklin, Christopher; Vitta, Joseph P. – Language Testing, 2022

Instrument measurement conducted with Rasch analysis is a common process in language assessment research. A recent systematic review of 215 studies involving Rasch analysis in language testing and applied linguistics research reported that 23 different software packages had been utilized. However, none of the analyses were conducted with one of…

Descriptors: Programming Languages, Vocabulary Development, Language Tests, Computer Software

Challenges of Translating Neologisms Comparative Study: Human and Machine Translation

Peer reviewed
PDF on ERIC

Awadh, Awadh Nasser Munassar; Khan, Ansarullah Shafiull – Journal of Language and Linguistic Studies, 2020

This study aims at investigating the challenges that Yemeni translation students encounter when translating neologisms from English into Arabic. It also aims at comparing students' translation with outcomes of machine translation (MT). The authors follow the descriptive and comparative methods in conducting this study. To achieve the objective of…

Descriptors: Barriers, Translation, English (Second Language), Semitic Languages

Making Better Tests with the Rasch Measurement Model

Peer reviewed
PDF on ERIC

Karlin, Omar; Karlin, Sayaka – InSight: A Journal of Scholarly Teaching, 2018

This study had two aims. The first was to explain the process of using the Rasch measurement model to validate tests in an easy-to-understand way for those unfamiliar with the Rasch measurement model. The second was to validate two final exams with several shared items. The exams were given to two groups of students with slightly differing English…

Descriptors: Item Response Theory, Test Validity, Test Items, Accuracy

Content-Rich versus Content Deficient Video-Based Visuals in L2 Academic Listening Tests: Pilot Study

Peer reviewed

Lesnov, Roman Olegovich – International Journal of Computer-Assisted Language Learning and Teaching, 2018

This article compares second language test-takers' performance on an academic listening test in an audio-only mode versus an audio-video mode. A new method of classifying video-based visuals was developed and piloted, which used L2 expert opinions to place the video on a continuum from being content-deficient (not helpful for answering…

Descriptors: Second Language Learning, Second Language Instruction, Video Technology, Classification

Comparison of Word Recognition Strategies in EFL Adult Learners: Orthography vs. Phonology

Peer reviewed
PDF on ERIC

Sieh, Yu-cheng – Taiwan Journal of TESOL, 2016

In an attempt to compare how orthography and phonology interact in EFL learners with different reading abilities, online measures were administered in this study to two groups of university learners, indexed by their reading scores on the Test of English for International Communication (TOEIC). In terms of "accuracy," the less-skilled…

Descriptors: Comparative Analysis, Word Recognition, Phonology, English (Second Language)

How Accurately Can the Google Web Speech API Recognize and Transcribe Japanese L2 English Learners' Oral Production?

Peer reviewed
PDF on ERIC

Ashwell, Tim; Elam, Jesse R. – JALT CALL Journal, 2017

The ultimate aim of our research project was to use the Google Web Speech API to automate scoring of elicited imitation (EI) tests. However, in order to achieve this goal, we had to take a number of preparatory steps. We needed to assess how accurate this speech recognition tool is in recognizing native speakers' production of the test items; we…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Language Tests

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed
PDF on ERIC

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

The Relative Difficulty of Dialogic and Monologic Input in a Second-Language Listening Comprehension Test

Peer reviewed

Papageorgiou, Spiros; Stevens, Robin; Goodwin, Sarah – Language Assessment Quarterly, 2012

Listening comprehension tests typically include both monologic and dialogic input to measure listening ability. However, research as to which type of input is more challenging for examinees remains limited and has provided inconclusive results (Brindley & Slatyer, 2002; Read, 2002; Shohamy & Inbar, 1991). A better understanding of the…

Descriptors: Listening Comprehension Tests, Test Items, Content Analysis, Listening Comprehension
