ERIC - Search Results

Showing all 15 results

Using Statistical Transformation Methods to Explore Speech Perception Scale Lengths

Peer reviewed

Kermad, Alyssa; Bogorevich, Valeria – Language Teaching Research Quarterly, 2022

The practice of second language (L2) speech perception has traditionally relied on equal-interval perceptual scales and novice listeners' (NLs) impressionistic judgments of constructs such as accentedness and comprehensibility (Munro & Derwing, 2011). However, issues have surfaced with respect to how well NLs can use these scales, whether they…

Descriptors: Speech Communication, Second Language Learning, Intelligibility, Rating Scales

Analysis of IELTS and TOEFL Reading and Listening Tests in Terms of Revised Bloom's Taxonomy

Peer reviewed

Baghaei, Samira; Bagheri, Mohammad Sadegh; Yamini, Mortaza – Cogent Education, 2020

The main purpose of this quantitative-qualitative content analysis study was to compare IELTS and TOEFL listening and reading tests based on the representation of the learning objectives of Revised Bloom's taxonomy. To this end, 12 Academic IELTS listening and reading tests and 12 TOEFL iBT listening and reading tests were analyzed qualitatively…

Descriptors: Second Language Learning, English (Second Language), Language Tests, Reading Tests

Developing and Validating Band Levels and Descriptors for Reporting Overall Examinee Performance

Peer reviewed

Papageorgiou, Spiros; Xi, Xiaoming; Morgan, Rick; So, Youngsoon – Language Assessment Quarterly, 2015

This study presents the development and empirical validation of score levels and descriptors specifically designed for reporting purposes to provide test takers with more than just a number on a score scale. In the context of a test primarily intended for 11- to 15-year-old students learning English as a second/foreign language, the study examined…

Descriptors: Scores, Validity, Scaling, Classification

Enhancing the Interpretability of the Overall Results of an International Test of English-Language Proficiency

Peer reviewed

Papageorgiou, Spiros; Morgan, Rick; Becker, Valerie – International Journal of Testing, 2015

The purpose of this study was to enhance the meaning of the scores of an English-language test by developing performance levels and descriptors for reporting overall test performance. The levels and descriptors were intended to accompany the total scale scores of TOEFL Junior® Standard, an international test of English as a second/foreign…

Descriptors: Language Proficiency, Language Tests, English (Second Language), Second Language Learning

Using Different Types of Dictionaries for Improving EFL Reading Comprehension and Vocabulary Learning

Peer reviewed

Alharbi, Majed A. – JALT CALL Journal, 2016

This study investigated the effects of monolingual book dictionaries, popup dictionaries, and type-in dictionaries on improving reading comprehension and vocabulary learning in an EFL program. An experimental design involving four groups and a post-test was chosen for the experiment: (1) pop-up dictionary (experimental group 1); (2) type-in…

Descriptors: English (Second Language), Reading Comprehension, Vocabulary Development, Dictionaries

The Psychometric Analysis of the Persian Version of the Strategy Inventory for Language Learning of Rebecca L. Oxford

Fazeli, Seyed Hossein – Online Submission, 2012

The current study aims to analyze the psychometric qualities of the Persian adapted version of Strategy Inventory for Language Learning (SILL) developed by Rebecca L. Oxford (1990). Three instruments were used: Persian adapted version of SILL, a Background Questionnaire, and Test of English as a Foreign Language. Two hundred and thirteen Iranian…

Descriptors: Psychometrics, Measures (Individuals), Indo European Languages, Females

Rater Expertise in a Second Language Speaking Assessment: The Influence of Training and Experience

Davis, Lawrence Edward – ProQuest LLC, 2012

Speaking performance tests typically employ raters to produce scores; accordingly, variability in raters' scoring decisions has important consequences for test reliability and validity. One such source of variability is the rater's level of expertise in scoring. Therefore, it is important to understand how raters' performance is influenced by…

Descriptors: Evaluators, Expertise, Scores, Second Language Learning

Toward Automated Multi-Trait Scoring of Essays: Investigating Links among Holistic, Analytic, and Text Feature Scores

Peer reviewed

Lee, Yong-Won; Gentile, Claudia; Kantor, Robert – Applied Linguistics, 2010

The main purpose of the study was to investigate the distinctness and reliability of analytic (or multi-trait) rating dimensions and their relationships to holistic scores and e-rater® essay feature variables in the context of the TOEFL® computer-based test (TOEFL CBT) writing assessment. Data analyzed in the study were holistic…

Descriptors: Writing Evaluation, Writing Tests, Scoring, Essays

Score Reliability as an Essential Prerequisite for Validating New Writing and Speaking Tasks for TOEFL.

Lee, Yong-Won; Kantor, Robert; Mollaun, Pam – 2002

This paper reports the results of generalizability theory (G) analyses done for new writing and speaking tasks for the Test of English as a Foreign Language (TOEFL). For writing, a special focus was placed on evaluating the impact on the reliability of the number of raters (or ratings) per essay (one or two) and the number of tasks (one, two, or…

Descriptors: English (Second Language), Generalizability Theory, Reliability, Scores

Language Tests and ESL Teaching. Examining Standardized Test Content: Some Advice for Teachers.

Peer reviewed

DeVincenzi, Felicia – TESOL Quarterly, 1995

Argues that teachers need to become "informed consumers" of standardized tests in order to influence decisions about test use and about ways to help students perform at their best. Six strategies for considering the content of a test form are presented. (LR)

Descriptors: Content Analysis, English (Second Language), Evaluation, Guidelines

Dependability of New ESL Writing Test Scores: Evaluating Prototype Tasks and Alternative Rating Schemes. TOEFL® Monograph Series. MS-31. ETS RR-05-14

Peer reviewed

Lee, Yong-Won; Kantor, Robert – ETS Research Report Series, 2005

Possible integrated and independent tasks were pilot tested for the writing section of a new generation of TOEFL® (Test of English as a Foreign Language™) examination. This study examines the impact of various rating designs as well as the impact of the number of tasks and raters on the reliability of writing scores based on integrated and…

Descriptors: Language Tests, English (Second Language), Second Language Learning, Writing Tests

Score Dependability of the Writing and Speaking Sections of New TOEFL.

Lee, Yong-Won; Kantor, Robert; Mollaun, Pam – 2002

This study examines the score dependability of writing and speaking assessments from the Test of English as a Foreign Language (TOEFL) from the perspectives of univariate and multivariate generalizability theory (G-theory) and presents the findings of three separate G-theory studies. For writing, the focus was on evaluating the impact on…

Descriptors: Ability, English (Second Language), Generalizability Theory, Item Bias

Confidence and Cognitive Test Performance. Research Report. ETS RR-07-03

Peer reviewed

Stankov, Lazar; Lee, Jihyun – ETS Research Report Series, 2007

This paper examines the nature of confidence in relation to cognitive abilities, personality traits, and metacognition. Confidence was measured as it was expressed in answers to each test item during the administration of reading and listening sections of the TOEFL® iBT. The confidence scores were correlated with the accuracy scores from the TOEFL…

Descriptors: English (Second Language), Grade Point Average, High Schools, Personality Traits

An Investigation of the Impact of Composition Medium on the Quality of TOEFL Writing Scores. TOEFL® Research Report. RR-72. ETS RR-04-29

Peer reviewed

Wolfe, Edward W.; Manalo, Jonathan R. – ETS Research Report Series, 2005

This study examined scores from 133,906 operationally scored Test of English as a Foreign Language™ (TOEFL®) essays to determine whether the choice of composition medium has any impact on score quality for subgroups of test-takers. Results of analyses demonstrate that (a) scores assigned to word-processed essays are slightly more reliable than…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Scores

Automated Essay Scoring with e-rater® v.2.0. Research Report. ETS RR-04-45

Peer reviewed

Attali, Yigal; Burstein, Jill – ETS Research Report Series, 2005

The e-rater® system has been used by ETS for automated essay scoring since 1999. This paper describes a new version of e-rater (v.2.0) that differs from the previous one (v.1.3) with regard to the feature set and model building approach. The paper describes the new version, compares the new and previous versions in terms of performance, and…

Descriptors: Essay Tests, Automation, Scoring, Comparative Analysis
