ERIC - Search Results

Showing 91 to 105 of 120 results

A Vocabulary-Size Test of Controlled Productive Ability.

Peer reviewed

Laufer, Batia; Nation, Paul – Language Testing, 1999

Investigated the reliability, validity, and practicality of a controlled production measure of vocabulary, consisting of items from five frequency levels and using a completion-item format. Two equivalent test forms were compared. The test was found to be useful in distinguishing between different proficiency groups. (Author/MSE)

Descriptors: Difficulty Level, Language Tests, Second Languages, Test Construction

Better Theory for Better Tests?

Peer reviewed

Raatz, Ulrich – Language Testing, 1985

Argues that classical test theory cannot be used at the item level on "authentic" language tests. However, if the total score is derived by adding the scores of a number of different and independent parts, test reliability can be estimated. Suggests using the Classical Latent Additives model to examine test-part homogeneity. (Author/SED)

Descriptors: Item Analysis, Latent Trait Theory, Models, Second Language Learning

Statistical Aspects of Reliability in Language Testing.

Peer reviewed

Krzanowski, Wojtek J.; Woods, Anthony J. – Language Testing, 1984

Discusses the concept of reliability in language testing and considers several simple ANOVA (analysis of variance) models which can be used to define and estimate reliability coefficients. Summarizes the main statistical results associated with the commonly used measurements of reliability. Presents results likely to be of use to language testers.…

Descriptors: Analysis of Variance, Language Skills, Language Tests, Sampling

Short-Cut Estimators of Criterion-Referenced Test Consistency.

Peer reviewed

Brown, James Dean – Language Testing, 1990

Presents simplified methods for deriving estimates of the consistency of criterion-referenced, English-as-a-Second-Language tests, including (1) the threshold loss agreement approach using agreement or kappa coefficients, (2) the squared-error loss agreement approach using the phi(lambda) dependability approach, and (3) the domain score…

Descriptors: Criterion Referenced Tests, English (Second Language), Language Tests, Second Language Learning

The Assessment of Writing Ability: Expert Readers versus Lay Readers.

Peer reviewed

Schoonen, Rob; And Others – Language Testing, 1997

Reports on three studies conducted in the Netherlands about the reading reliability of lay and expert readers in rating content and language usage of students' writing performances in three kinds of writing assignments. Findings reveal that expert readers are more reliable in rating usage, whereas both lay and expert readers are reliable raters of…

Descriptors: Foreign Countries, Interrater Reliability, Language Usage, Models

Interviewer Variation and the Co-construction of Speaking Proficiency.

Peer reviewed

Brown, Annie – Language Testing, 2003

Examines the question of variation among interviewers of oral language proficiency interviews in the ways that they elicit demonstrations of communicative ability and the impact of this variation on candidate performance and raters' perceptions of candidate ability. A discourse analysis of two interviews involving the same candidate with two…

Descriptors: Discourse Analysis, Interrater Reliability, Interviews, Language Proficiency

Development and Validation of a Scale to Measure Test-Wiseness in EFL/ESL Reading Test Takers.

Peer reviewed

Allan, Alistair – Language Testing, 1992

The design of a valid and reliable test of test-wiseness is reported: a 33-item multiple-choice instrument with 4 subscales trialed with several groups of English-as-a-Second-Language students. Findings indicate differential skills in test-taking; some learner scores are influenced by skills that are not the focus of the test. (13 references)…

Descriptors: English (Second Language), Language Research, Language Tests, Multiple Choice Tests

Accounting for Nonsystematic Error in Performance Ratings.

Peer reviewed

Henning, Grant – Language Testing, 1996

Analyzes simulated performance ratings on a six-point scale by two independent raters to account for nonsystematic error in performance ratings. Results suggest that rater agreement or covariance is not always a dependable estimate of score reliability and that the practice of seeking additional raters for adjudication of discrepant ratings is not…

Descriptors: Correlation, Error Patterns, Interrater Reliability, Language Tests

"vocd": A Theoretical and Empirical Evaluation

Peer reviewed

McCarthy, Philip M.; Jarvis, Scott – Language Testing, 2007

A reliable index of lexical diversity (LD) has remained stubbornly elusive for over 60 years. Meanwhile, researchers in fields as varied as "stylistics," "neuropathology," "language acquisition," and even "forensics" continue to use flawed LD indices--often ignorant that their results are questionable and in…

Descriptors: Second Language Learning, English (Second Language), Foreign Countries, Adolescents

Developing a Pragmatics Test for Chinese EFL Learners

Peer reviewed

Liu, Jianda – Language Testing, 2007

Pragmatic proficiency has been incorporated in the EFL teaching and testing syllabi in China, but the corresponding tests still focus on linguistic competence. The gap between the teaching and testing is mainly due to the lack of generally accepted measures of communicative abilities such as pragmatic competence. This study developed a…

Descriptors: Linguistic Competence, Speech Acts, Testing, Foreign Countries

Evaluating Rater Responses to an Online Training Program for L2 Writing Assessment

Peer reviewed

Elder, Catherine; Barkhuizen, Gary; Knoch, Ute; von Randow, Janet – Language Testing, 2007

The use of online rater self-training is growing in popularity and has obvious practical benefits, facilitating access to training materials and rating samples and allowing raters to reorient themselves to the rating scale and self-monitor their behaviour at their own convenience. However, there has thus far been little research into rater…

Descriptors: Writing Evaluation, Writing Tests, Scoring Rubrics, Rating Scales

Dependability of Scores for a New ESL Speaking Assessment Consisting of Integrated and Independent Tasks

Peer reviewed

Lee, Yong-Won – Language Testing, 2006

A multitask speaking measure consisting of both integrated and independent tasks is expected to be an important component of a new version of the TOEFL test. This study considered two critical issues concerning score dependability of the new speaking measure: How much would the score dependability be impacted by (1) combining scores on different…

Descriptors: Language Tests, Second Language Learning, English (Second Language), Generalizability Theory

Comparative Analyses of English as a Second Language Reading Comprehension Data: Classical Test Theory and Latent Trait Measurement.

Peer reviewed

Perkins, Kyle; Miller, Leah D. – Language Testing, 1984

Describes a study which submitted data from a multiple-choice English as a second language reading comprehension test to classical test theory item analysis and latent trait measurement. The purpose was to identify weak items and to compare the number of weak items indicated by the two different approaches. (SED)

Descriptors: English (Second Language), Language Tests, Latent Trait Theory, Reading Comprehension

Using GENOVA and FACETS to Set Multiple Standards on Performance Assessment for Certification in Medical Translation from Japanese into English

Peer reviewed

Kozaki, Y. – Language Testing, 2004

This article presents a standard-setting procedure for performance assessment in a foreign language, through which some of the major problems facing performance assessment in criterion-referenced testing can be addressed. The procedure, which was geared to revealing and accommodating inter-judge variability, employed the synergy of multiple…

Descriptors: Data Analysis, Testing, Performance Tests, Generalizability Theory

Testing Methods, Testing Consequences: Are They Ethical? Are They Fair?

Peer reviewed

Shohamy, Elana – Language Testing, 1997

Argues that language tests employing methods not fair to all test takers are unethical. Ways of reducing sources of unfairness in language testing are discussed. (15 references) (Author/CK)

Descriptors: Academic Achievement, Change Strategies, Ethics, Language Proficiency
