Publication Date

| Publication Date | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 4 |
| Since 2022 (last 5 years) | 40 |
| Since 2017 (last 10 years) | 81 |
| Since 2007 (last 20 years) | 146 |
Source

| Source | Count |
| --- | --- |
| Language Testing | 233 |
Author

| Author | Count |
| --- | --- |
| Bachman, Lyle F. | 6 |
| Chapelle, Carol A. | 5 |
| Fulcher, Glenn | 5 |
| Henning, Grant | 5 |
| Yan, Xun | 5 |
| Davies, Alan | 4 |
| McNamara, Tim | 4 |
| Alderson, J. Charles | 3 |
| Aryadoust, Vahid | 3 |
| Cho, Yeonsuk | 3 |
| Davidson, Fred | 3 |
Publication Type

| Publication Type | Count |
| --- | --- |
| Journal Articles | 233 |
| Reports - Research | 140 |
| Reports - Evaluative | 51 |
| Reports - Descriptive | 21 |
| Opinion Papers | 20 |
| Information Analyses | 10 |
| Tests/Questionnaires | 6 |
| Speeches/Meeting Papers | 2 |
Audience

| Audience | Count |
| --- | --- |
| Researchers | 1 |
| Teachers | 1 |
Location

| Location | Count |
| --- | --- |
| China | 9 |
| Japan | 9 |
| United Kingdom | 7 |
| Australia | 6 |
| Netherlands | 5 |
| Brazil | 3 |
| California | 3 |
| South Korea | 3 |
| United Kingdom (England) | 3 |
| United States | 3 |
| Canada | 2 |
Laws, Policies, & Programs

| Law, Policy, or Program | Count |
| --- | --- |
| Race to the Top | 1 |
Chapelle, Carol A.; Cotos, Elena; Lee, Jooyoung – Language Testing, 2015
Two examples demonstrate an argument-based approach to validation of diagnostic assessment using automated writing evaluation (AWE). "Criterion"® was developed by Educational Testing Service to analyze students' papers grammatically, providing sentence-level error feedback. An interpretive argument was developed for its use as part of…
Descriptors: Diagnostic Tests, Writing Evaluation, Automation, Test Validity

Bouwer, Renske; Béguin, Anton; Sanders, Ted; van den Bergh, Huub – Language Testing, 2015
In the present study, aspects of the measurement of writing are disentangled in order to investigate the validity of inferences made on the basis of writing performance and to describe implications for the assessment of writing. To include genre as a facet in the measurement, we obtained writing scores of 12 texts in four different genres for each…
Descriptors: Writing Tests, Generalization, Scores, Writing Instruction

Trace, Jonathan; Brown, James Dean; Janssen, Gerriet; Kozhevnikova, Liudmila – Language Testing, 2017
Cloze tests have been the subject of numerous studies regarding their function and use in both first language and second language contexts (e.g., Jonz & Oller, 1994; Watanabe & Koyama, 2008). From a validity standpoint, one area of investigation has been the extent to which cloze tests measure reading ability beyond the sentence level.…
Descriptors: Cloze Procedure, Language Tests, Test Items, Item Analysis

Kyle, Kristopher; Crossley, Scott A.; McNamara, Danielle S. – Language Testing, 2016
This study explores the construct validity of speaking tasks included in the TOEFL iBT (e.g., integrated and independent speaking tasks). Specifically, advanced natural language processing (NLP) tools, MANOVA difference statistics, and discriminant function analyses (DFA) are used to assess the degree to which and in what ways responses to these…
Descriptors: Construct Validity, Natural Language Processing, Speech Skills, Speech Acts

Ilc, Gašper; Stopar, Andrej – Language Testing, 2015
The paper examines the results of the CEFR alignment project for the Slovenian national examinations in English. The authors aim to validate externally the standard-setting procedures by adopting a socio-cognitive model of validation (Khalifa & Weir, 2009; Weir, 2005) to analyse the scoring, context and cognitive validity of three reading…
Descriptors: Foreign Countries, English (Second Language), Second Language Instruction, Second Language Learning

Sasaki, Miyuki – Language Testing, 2012
The Modern Language Aptitude Test (Paper-and-Pencil Version, henceforth the MLAT) measures "an individual's ability to learn a foreign language." It targets English-speaking adults (over Grade 9) who are literate. The test has only one form, which has not changed since it was first published by the Psychological Corporation in 1959. The test can…
Descriptors: Aptitude Tests, Test Reviews, Rewards, Acoustics

Deygers, Bart; Van Gorp, Koen – Language Testing, 2015
Considering scoring validity as encompassing both reliable rating scale use and valid descriptor interpretation, this study reports on the validation of a CEFR-based scale that was co-constructed and used by novice raters. The research questions this paper wishes to answer are (a) whether it is possible to construct a CEFR-based rating scale with…
Descriptors: Rating Scales, Scoring, Validity, Interrater Reliability

Ling, Guangming; Mollaun, Pamela; Xi, Xiaoming – Language Testing, 2014
The scoring of constructed responses may introduce construct-irrelevant factors to a test score and affect its validity and fairness. Fatigue is one of the factors that could negatively affect human performance in general, yet little is known about its effects on a human rater's scoring quality on constructed responses. In this study, we compared…
Descriptors: Evaluators, Fatigue (Biology), Scoring, Performance

Jarvis, Scott – Language Testing, 2017
The present study discusses the relevance of measures of lexical diversity (LD) to the assessment of learner corpora. It also argues that existing measures of LD, many of which have become specialized for use with language corpora, are fundamentally measures of lexical repetition, are based on an etic perspective of language, and lack construct…
Descriptors: Computational Linguistics, English (Second Language), Second Language Learning, Native Speakers

Davies, Alan – Language Testing, 2010
This article presents the author's response to Xiaoming Xi's paper titled "How do we go about investigating test fairness?" In the paper, Xi offers "a means to fully integrate fairness investigations and practice". Given the current importance accorded to fairness in the language testing community, Xi makes a case for viewing fairness as an aspect…
Descriptors: Investigations, Testing, Language Tests, Validity

Davies, Alan – Language Testing, 2012
In this article, the author begins by discussing four challenges to the concept of validity. These challenges are: (1) the appeal to logic and syllogistic reasoning; (2) the claim of reliability; (3) the local and the universal; and (4) the unitary and the divisible. In language testing, validity cannot be achieved directly but only through a…
Descriptors: Language Tests, Test Validity, Test Reliability, Testing

Hsu, Tammy Huei-Lien – Language Testing, 2016
This study explores the attitudes of raters of English speaking tests towards the global spread of English and the challenges in rating speakers of Indian English in descriptive speaking tasks. The claims put forward by language attitude studies indicate a validity issue in English speaking tests: listeners tend to hold negative attitudes towards…
Descriptors: Evaluators, Language Tests, English (Second Language), Second Language Learning

Youn, Soo Jung – Language Testing, 2015
This study investigates the validity of assessing L2 pragmatics in interaction using mixed methods, focusing on the evaluation inference. Open role-plays that are meaningful and relevant to the stakeholders in an English for Academic Purposes context were developed for classroom assessment. For meaningful score interpretations and accurate…
Descriptors: Second Language Learning, Pragmatics, Validity, Mixed Methods Research

Coombe, Christine; Davidson, Peter – Language Testing, 2014
The Common Educational Proficiency Assessment (CEPA) is a large-scale, high-stakes English language proficiency/placement test administered in the United Arab Emirates to Emirati nationals in their final year of secondary education, or Grade 12. The purpose of the CEPA is to place students into English classes at the appropriate government…
Descriptors: Language Tests, High Stakes Tests, English (Second Language), Second Language Learning

Jin, Tan; Mak, Barley; Zhou, Pei – Language Testing, 2012
The fuzziness of assessing second language speaking performance raises two difficulties in scoring speaking performance: "indistinction between adjacent levels" and "overlap between scales". To address these two problems, this article proposes a new approach, "confidence scoring", to deal with such fuzziness, leading to "confidence" scores between…
Descriptors: Speech Communication, Scoring, Test Interpretation, Second Language Learning