ERIC - Search Results

Showing 1 to 15 of 19 results

A New Scoring Method for Item Response Theory Analysis of C-Tests

Peer reviewed

Effatpanah, Farshad; Baghaei, Purya; Tabatabaee-Yazdi, Mona; Babaii, Esmat – Language Testing, 2025

This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…

Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

Investigating the Impact of Self-Pacing on the L2 Listening Performance of Young Learner Candidates with Differing L1 Literacy Skills

Peer reviewed

Eberharter, Kathrin; Kormos, Judit; Guggenbichler, Elisa; Ebner, Viktoria S.; Suzuki, Shungo; Moser-Frötscher, Doris; Konrad, Eva; Kremmel, Benjamin – Language Testing, 2023

In online environments, listening involves being able to pause or replay the recording as needed. Previous research indicates that control over the listening input could improve the measurement accuracy of listening assessment. Self-pacing also supports the second language (L2) comprehension processes of test-takers with specific learning…

Descriptors: Literacy, Native Language, Second Language Learning, Second Language Instruction

Developing and Evaluating a Computerized Adaptive Testing Version of the Word Part Levels Test

Peer reviewed

Mizumoto, Atsushi; Sasao, Yosuke; Webb, Stuart A. – Language Testing, 2019

Knowledge of affixes plays a vital role in the development of word knowledge and vocabulary acquisition. A test that provides diagnostic information on the level of affix knowledge would help inform test users of what learners have gained or still lack in this integral component of vocabulary knowledge. This paper reports the…

Descriptors: Computer Assisted Testing, Adaptive Testing, College Students, English (Second Language)

Assessing Rasch Measurement Estimation Methods across R Packages with Yes/No Vocabulary Test Data

Peer reviewed

Nicklin, Christopher; Vitta, Joseph P. – Language Testing, 2022

Instrument measurement conducted with Rasch analysis is a common process in language assessment research. A recent systematic review of 215 studies involving Rasch analysis in language testing and applied linguistics research reported that 23 different software packages had been utilized. However, none of the analyses were conducted with one of…

Descriptors: Programming Languages, Vocabulary Development, Language Tests, Computer Software

IRT-Based Classification Analysis of an English Language Reading Proficiency Subtest

Peer reviewed

Kaya, Elif; O'Grady, Stefan; Kalender, Ilker – Language Testing, 2022

Language proficiency testing serves an important function of classifying examinees into different categories of ability. However, misclassification is to some extent inevitable and may have important consequences for stakeholders. Recent research suggests that classification efficacy may be enhanced substantially using computerized adaptive…

Descriptors: Item Response Theory, Test Items, Language Tests, Classification

National Reading Tests in Denmark, Norway, and Sweden: A Comparison of Construct Definitions, Cognitive Targets, and Response Formats

Peer reviewed

Tengberg, Michael – Language Testing, 2017

Reading comprehension tests are often assumed to measure the same, or at least similar, constructs. Yet, reading is not a single but a multidimensional form of processing, which means that variations in terms of reading material and item design may emphasize one aspect of the construct at the cost of another. The educational systems in Denmark,…

Descriptors: Foreign Countries, National Competency Tests, Reading Tests, Comparative Analysis

Patterns of Variation in the Interplay of Language Ability and General Reading Comprehension Ability in L2 Reading

Peer reviewed

Löwenadler, John – Language Testing, 2019

This study aims to investigate patterns of variation in the interplay of L2 language ability and general reading comprehension skills in L2 reading, by comparing item-level effects of test-takers' results on L1 and L2 reading comprehension tests. The material comes from more than 500,000 people tested on L1 (Swedish) and L2 (English) in the…

Descriptors: Swedish, English (Second Language), Second Language Learning, Second Language Instruction

Probing the Relative Importance of Different Attributes in L2 Reading and Listening Comprehension Items: An Application of Cognitive Diagnostic Models

Peer reviewed

Yi, Yeon-Sook – Language Testing, 2017

The present study examines the relative importance of attributes within and across items by applying four cognitive diagnostic assessment models. The current study utilizes the function of the models that can indicate inter-attribute relationships that reflect the response behaviors of examinees to analyze scored test-taker responses to four forms…

Descriptors: Second Language Learning, Reading Comprehension, Listening Comprehension, Language Tests

A Comparison of Three Test Formats to Assess Word Difficulty

Peer reviewed

Culligan, Brent – Language Testing, 2015

This study compared three common vocabulary test formats, the Yes/No test, the Vocabulary Knowledge Scale (VKS), and the Vocabulary Levels Test (VLT), as measures of vocabulary difficulty. Vocabulary difficulty was defined as the item difficulty estimated through Item Response Theory (IRT) analysis. Three tests were given to 165 Japanese students,…

Descriptors: Language Tests, Test Format, Comparative Analysis, Vocabulary

Determining Cloze Item Difficulty from Item and Passage Characteristics across Different Learner Backgrounds

Peer reviewed

Trace, Jonathan; Brown, James Dean; Janssen, Gerriet; Kozhevnikova, Liudmila – Language Testing, 2017

Cloze tests have been the subject of numerous studies regarding their function and use in both first language and second language contexts (e.g., Jonz & Oller, 1994; Watanabe & Koyama, 2008). From a validity standpoint, one area of investigation has been the extent to which cloze tests measure reading ability beyond the sentence level.…

Descriptors: Cloze Procedure, Language Tests, Test Items, Item Analysis

A Comparison of Video- and Audio-Mediated Listening Tests with Many-Facet Rasch Modeling and Differential Distractor Functioning

Peer reviewed

Batty, Aaron Olaf – Language Testing, 2015

The rise in the affordability of quality video production equipment has resulted in increased interest in video-mediated tests of foreign language listening comprehension. Although research on such tests has continued fairly steadily since the early 1980s, studies have relied on analyses of raw scores, despite the growing prevalence of item…

Descriptors: Listening Comprehension Tests, Comparative Analysis, Video Technology, Audio Equipment

Effects of L1 Definitions and Cognate Status of Test Items on the Vocabulary Size Test

Peer reviewed

Elgort, Irina – Language Testing, 2013

This study examines the development and evaluation of a bilingual Vocabulary Size Test (VST, Nation, 2006). A bilingual (English-Russian) test was developed and administered to 121 intermediate proficiency EFL learners (native speakers of Russian), alongside the original monolingual (English-only) version of the test. A comparison of the bilingual…

Descriptors: Test Construction, Vocabulary, Language Tests, English

Multiple Dichotomous-Scored Items in Second Language Testing: Investigating the Multiple True-False Item Type under Norm-Referenced Conditions

Peer reviewed

Dudley, Albert – Language Testing, 2006

This study examined the multiple true-false (MTF) test format in second language testing by comparing multiple-choice (MCQ) and multiple true-false (MTF) test formats in two language areas of general English: vocabulary and reading. Two counter-balanced experimental designs--one for each language area--were examined in terms of the number of MCQ…

Descriptors: Second Language Learning, Test Format, Validity, Testing

A Comparison of Three- and Four-Option English Tests for University Entrance Selection Purposes in Japan

Peer reviewed

Shizuka, Tetsuhito; Takeuchi, Osamu; Yashima, Tomoko; Yoshizawa, Kiyomi – Language Testing, 2006

The present study investigated the effects of reducing the number of options per item on psychometric characteristics of a Japanese EFL university entrance examination. A four-option multiple-choice reading test used for entrance screening at a university in Japan was later converted to a three-option version by eliminating the least frequently…

Descriptors: Foreign Countries, Psychometrics, Reading Tests, English (Second Language)

Crossvalidation of Item Response Curve Models Using TOEFL Data.

Peer reviewed

Boldt, Robert F. – Language Testing, 1992

The assumption called PIRC (proportional item response curve) was tested by using PIRC to predict item scores of selected examinees on selected items. Findings show approximate accuracies of prediction for PIRC, the three-parameter logistic model, and a modified Rasch model. (12 references) (Author/LB)

Descriptors: Comparative Analysis, English (Second Language), Factor Analysis, Item Response Theory
