ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	1
Since 2006 (last 20 years)	15

Descriptor

Program Effectiveness	16
Second Language Learning	16
English (Second Language)	11
Language Tests	10
Foreign Countries	6
Language Proficiency	5
Second Language Instruction	5
Comparative Analysis	4
Oral Language	4
Scores	4
Statistical Analysis	4
Writing Tests	4
Effect Size	3
Grammar	3
High Stakes Tests	3
Listening Comprehension Tests	3
Rating Scales	3
Reading Tests	3
Student Evaluation	3
Testing	3
Academic Achievement	2
College Students	2
Construct Validity	2
Experimental Groups	2
Factor Analysis	2

Source

Language Testing

Publication Type

Journal Articles	16
Reports - Evaluative	10
Reports - Research	6

Education Level

Higher Education	5
Elementary Secondary Education	2
Elementary Education	1
Grade 6	1

Location

California	2
Canada	2
Japan	2
South Korea	2
United Kingdom	2
Brazil	1
Burma	1
Cambodia	1
Cameroon	1
China (Shanghai)	1
Colombia	1
Europe	1
France	1
Indonesia	1
Iran	1
New Zealand	1
Poland	1
Singapore	1
Turkmenistan	1
United States	1

Laws, Policies, & Programs

Elementary and Secondary…	1
Lau v Nichols	1
No Child Left Behind Act 2001	1
Race to the Top	1

Showing 1 to 15 of 16 results

"How Do Raters Learn to Rate?" Many-Facet Rasch Modeling of Rater Performance over the Course of a Rater Certification Program

Peer reviewed

Yan, Xun; Chuang, Ping-Lin – Language Testing, 2023

This study employed a mixed-methods approach to examine how rater performance develops during a semester-long rater certification program for an English as a Second Language (ESL) writing placement test at a large US university. From 2016 to 2018, we tracked three groups of novice raters (n = 30) across four rounds in the certification program.…

Descriptors: Evaluators, Interrater Reliability, Item Response Theory, Certification

Distinguishing Features in Scoring L2 Chinese Speaking Performance: How Do They Work?

Peer reviewed

Jin, Tan; Mak, Barley – Language Testing, 2013

For Chinese as a second language (L2 Chinese), there has been little research into "distinguishing features" (Fulcher, 1996; Iwashita et al., 2008) used in scoring L2 Chinese speaking performance. The study reported here investigates the relationship between the distinguishing features of L2 Chinese spoken performances and the scores…

Descriptors: Second Languages, Second Language Learning, Chinese, Holistic Evaluation

Standards-Based Classroom Assessments of English Proficiency: A Review of Issues, Current Developments, and Future Directions for Research

Peer reviewed

Llosa, Lorena – Language Testing, 2011

With the United States' adoption of a standards-based approach to education, most attention has focused on the large-scale, high-stakes assessments intended to measure students' mastery of standards for accountability purposes. Less attention has been paid to the role of standards-based assessments in the classroom. The purpose of this paper is to…

Descriptors: Urban Schools, Student Evaluation, Language Tests, Second Language Learning

The Effects of Self-Assessment among Young Learners of English

Peer reviewed

Butler, Yuko Goto; Lee, Jiyoon – Language Testing, 2010

This study examined the effectiveness of self-assessment among 254 young learners of English as a foreign language. This study looked at 6th grade students in South Korea, who were asked to perform self-assessments on a regular basis for a semester during their English classes. The students improved their ability to self-assess their performance…

Descriptors: Second Language Learning, Program Effectiveness, Effect Size, Foreign Countries

A Meta-Analysis of Test Format Effects on Reading and Listening Test Performance: Focus on Multiple-Choice and Open-Ended Formats

Peer reviewed

In'nami, Yo; Koizumi, Rie – Language Testing, 2009

A meta-analysis was conducted on the effects of multiple-choice and open-ended formats on L1 reading, L2 reading, and L2 listening test performance. Fifty-six data sources located in an extensive search of the literature were the basis for the estimates of the mean effect sizes of test format effects. The results using the mixed effects model of…

Descriptors: Test Format, Listening Comprehension Tests, Multiple Choice Tests, Program Effectiveness

The Effect of the Use of Video Texts on ESL Listening Test-Taker Performance

Peer reviewed

Wagner, Elvis – Language Testing, 2010

Video is widely used in the teaching of L2 listening, and SLA researchers have argued that the visual components of spoken texts are useful for the listener in comprehending aural information. Yet video texts are rarely used on tests of L2 listening ability, perhaps in part due to the belief that including the visual channel involves assessing…

Descriptors: Experimental Groups, Control Groups, Listening Comprehension, Quasiexperimental Design

Test Review: BEST Plus Spoken Language Test

Peer reviewed

Van Moere, Alistair – Language Testing, 2009

The purpose of BEST Plus is to assess the ability to understand and use unprepared, conversational, everyday language within topic areas generally covered in adult education courses. It is one of several standardized assessments approved by the National Reporting System (NRS, 2008), which is the accountability system for federally funded ESL and…

Descriptors: Standardized Tests, Oral Language, Language Tests, Adult Education

Testing English Language Learners under No Child Left Behind

Peer reviewed

Bunch, Michael B. – Language Testing, 2011

Title III of Public Law 107-110 (No Child Left Behind; NCLB) provided for creation of assessments of English language learners (ELLs) and established, through the Enhanced Assessment Grant program, a platform from which four consortia of states developed ELL tests aligned to rigorous statewide content standards. Those four tests (ACCESS for ELLs,…

Descriptors: Test Items, Student Evaluation, Federal Legislation, Formative Evaluation

Washback of an Oral Assessment System in the EFL Classroom

Peer reviewed

Munoz, Ana P.; Alvarez, Marta E. – Language Testing, 2010

This article reports the results of a research study to determine the washback effect of an oral assessment system on some areas of the teaching and learning of English as a Foreign Language (EFL). The research combined quantitative and qualitative research methods within a comparative study between an experimental group and a comparison group.…

Descriptors: Experimental Groups, Qualitative Research, Student Surveys, Program Effectiveness

Investigating Differences in the Writing Performance of International and Generation 1.5 Students

Peer reviewed

di Gennaro, Kristen – Language Testing, 2009

Practitioners working closely with second language (L2) writers in the US recognize at least two types of L2 students: international (IL2) and Generation 1.5 (G1.5) students. Some argue that specific differences in each group's writing performance are evident (cf. Harklau, 2003; Reid, 2006); however, investigations into observable and measurable…

Descriptors: English (Second Language), Second Language Learning, Student Placement, Writing (Composition)

Rater Bias Patterns in an EFL Writing Assessment

Peer reviewed

Schaefer, Edward – Language Testing, 2008

The present study employed multi-faceted Rasch measurement (MFRM) to explore the rater bias patterns of native English-speaker (NES) raters when they rate EFL essays. Forty NES raters rated 40 essays written by female Japanese university students on a single topic adapted from the TOEFL Test of Written English (TWE). The essays were assessed using…

Descriptors: Writing Evaluation, Writing Tests, Program Effectiveness, Essays

Interacting in Pairs in a Test of Oral Proficiency: Co-Constructing a Better Performance

Peer reviewed

Brooks, Lindsay – Language Testing, 2009

This study, framed within sociocultural theory, examines the interaction of adult ESL test-takers in two tests of oral proficiency: one in which they interacted with an examiner (the individual format) and one in which they interacted with another student (the paired format). The data for the eight pairs in this study were drawn from a larger…

Descriptors: Testing, Rating Scales, Program Effectiveness, Interaction

Bilingual Dictionaries in Tests of L2 Writing Proficiency: Do They Make a Difference?

Peer reviewed

East, Martin – Language Testing, 2007

Whether test takers should be allowed access to dictionaries when taking L2 tests has been the subject of debate for a good number of years. Opinions differ according to how the test construct is understood and whether the underlying value system favours process-orientated assessment for learning, with its concern to elicit the test takers' best…

Descriptors: Writing Tests, Reading Tests, Program Effectiveness, Dictionaries

Validating a Standards-Based Classroom Assessment of English Proficiency: A Multitrait-Multimethod Approach

Peer reviewed

Llosa, Lorena – Language Testing, 2007

The use of standards-based classroom assessments to test English learners' language proficiency is increasingly prevalent in the United States and many other countries. In a large urban school district in California, for example, a classroom assessment is used to make high-stakes decisions about English learners' progress from one level to the…

Descriptors: Urban Schools, Multitrait Multimethod Techniques, Standardized Tests, Construct Validity

Construct Validation of Analytic Rating Scales in a Speaking Assessment: Reporting a Score Profile and a Composite

Peer reviewed

Sawaki, Yasuyo – Language Testing, 2007

This is a construct validation study of a second language speaking assessment that reported a language profile based on analytic rating scales and a composite score. The study addressed three key issues: score dependability, convergent/discriminant validity of analytic rating scales and the weighting of analytic ratings in the composite score.…

Descriptors: Generalizability Theory, Speech Communication, Student Placement, Construct Validity

Author

Llosa, Lorena	2
Alvarez, Marta E.	1
Brooks, Lindsay	1
Bunch, Michael B.	1
Butler, Yuko Goto	1
Chuang, Ping-Lin	1
East, Martin	1
Green, Rita	1
In'nami, Yo	1
Jin, Tan	1
Koizumi, Rie	1
Lee, Jiyoon	1
Mak, Barley	1
Munoz, Ana P.	1
Sawaki, Yasuyo	1
Schaefer, Edward	1
Van Moere, Alistair	1
Wagner, Elvis	1
Wall, Dianne	1
Yan, Xun	1
di Gennaro, Kristen	1