ERIC - Search Results

Publication Date

In 2026	0
Since 2025	0
Since 2022 (last 5 years)	1
Since 2017 (last 10 years)	5
Since 2007 (last 20 years)	18

Descriptor

Computer Assisted Testing	23
English (Second Language)	22
Language Tests	22
Second Language Learning	18
Interrater Reliability	12
Scores	12
Scoring	11
Evaluators	10
Test Reliability	10
Correlation	9
Oral Language	9
Test Validity	7
Writing Tests	6
Essay Tests	5
Factor Analysis	5
Language Proficiency	5
Statistical Analysis	5
Essays	4
Foreign Countries	4
Models	4
Test Format	4
Accuracy	3
Computer Software	3
Cues	3
Internet	3
More ▼

Source

ETS Research Report Series	9
Language Testing	4
Online Submission	2
Applied Linguistics	1
Education and Information…	1
Educational Testing Service	1
Language Assessment Quarterly	1
ProQuest LLC	1

Publication Type

Journal Articles	17
Reports - Research	16
Reports - Evaluative	3
Speeches/Meeting Papers	3
Dissertations/Theses -…	2
Reports - Descriptive	2
Tests/Questionnaires	1

Education Level

Higher Education	3
Postsecondary Education	2
Elementary Education	1
High Schools	1
Secondary Education	1

Audience

Researchers

Location

Germany	2
Canada	1
China	1
France	1
Italy	1
Switzerland	1

Laws, Policies, & Programs

Assessments and Surveys

Test of English as a Foreign…	23
Computer Attitude Scale	1
Graduate Record Examinations	1

What Works Clearinghouse Rating

Showing 1 to 15 of 23 results Save | Export

Computerized Testing in Reading Comprehension Skill: Investigating Score Interchangeability, Item Review, Age and Gender Stereotypes, ICT Literacy and Computer Attitudes

Peer reviewed

Direct link

Toroujeni, Seyyed Morteza Hashemi – Education and Information Technologies, 2022

Score interchangeability of Computerized Fixed-Length Linear Testing (henceforth CFLT) and Paper-and-Pencil-Based Testing (henceforth PPBT) has become a controversial issue over the last decade when technology has meaningfully restructured methods of the educational assessment. Given this controversy, various testing guidelines published on…

Descriptors: Computer Assisted Testing, Reading Tests, Reading Comprehension, Scoring

Automated Essay Scoring at Scale: A Case Study in Switzerland and Germany. TOEFL® Research Report. RR-86. ETS RR-19-12

Peer reviewed
PDF on ERIC

Download full text

Rupp, André A.; Casabianca, Jodi M.; Krüger, Maleika; Keller, Stefan; Köller, Olaf – ETS Research Report Series, 2019

In this research report, we describe the design and empirical findings for a large-scale study of essay writing ability with approximately 2,500 high school students in Germany and Switzerland on the basis of 2 tasks with 2 associated prompts, each from a standardized writing assessment whose scoring involved both human and automated components.…

Descriptors: Automation, Foreign Countries, English (Second Language), Language Tests

Do the TOEFL iBT® Section Scores Provide Value-Added Information to Stakeholders

Peer reviewed

Direct link

Sawaki, Yasuyo; Sinharay, Sandip – Language Testing, 2018

The present study examined the reliability of the reading, listening, speaking, and writing section scores for the TOEFL iBT® test and their interrelationship in order to collect empirical evidence to support, respectively, the "generalization" inference and the "explanation" inference in the TOEFL iBT validity argument…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Computer Assisted Testing

The Effect of Training and Rater Differences on Oral Proficiency Assessment

Peer reviewed

Direct link

Kang, Okim; Rubin, Don; Kermad, Alyssa – Language Testing, 2019

As a result of the fact that judgments of non-native speech are closely tied to social biases, oral proficiency ratings are susceptible to error because of rater background and social attitudes. In the present study we seek first to estimate the variance attributable to rater background and attitudinal variables on novice raters' assessments of L2…

Descriptors: Evaluators, Second Language Learning, Language Tests, English (Second Language)

The Influence of Training and Experience on Rater Performance in Scoring Spoken Language

Peer reviewed

Direct link

Davis, Larry – Language Testing, 2016

Two factors were investigated that are thought to contribute to consistency in rater scoring judgments: rater training and experience in scoring. Also considered were the relative effects of scoring rubrics and exemplars on rater performance. Experienced teachers of English (N = 20) scored recorded responses from the TOEFL iBT speaking test prior…

Descriptors: Evaluators, Oral Language, Scores, Language Tests

Age, Task Characteristics, and Acoustic Indicators of Engagement: Investigations into the Validity of a Technology-Enhanced Speaking Test for Young Language Learners

Download full text

Edward Paul Getman – Online Submission, 2020

Despite calls for engaging assessments targeting young language learners (YLLs) between 8 and 13 years old, what makes assessment tasks engaging and how such task characteristics affect measurement quality have not been well studied empirically. Furthermore, there has been a dearth of validity research about technology-enhanced speaking tests for…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Learner Engagement

Use of e-rater[R] in Scoring of the TOEFL iBT[R] Writing Test. Research Report. ETS RR-11-25

Download full text

Haberman, Shelby J. – Educational Testing Service, 2011

Alternative approaches are discussed for use of e-rater[R] to score the TOEFL iBT[R] Writing test. These approaches involve alternate criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…

Descriptors: Writing Tests, Scoring, Essays, Language Tests

The Role of Lexical Properties and Cohesive Devices in Text Integration and Their Effect on Human Ratings of Speaking Proficiency

Peer reviewed

Direct link

Crossley, Scott; Clevinger, Amanda; Kim, YouJin – Language Assessment Quarterly, 2014

There has been a growing interest in the use of integrated tasks in the field of second language testing to enhance the authenticity of language tests. However, the role of text integration in test takers' performance has not been widely investigated. The purpose of the current study is to examine the effects of text-based relational (i.e.,…

Descriptors: Language Proficiency, Connected Discourse, Language Tests, English (Second Language)

Investigating the Value of Section Scores for the "TOEFL iBT"® Test. "TOEFL iBT"® Research Report. TOEFL iBT-21. ETS Research Report RR-13-35

Peer reviewed
PDF on ERIC

Download full text

Sawaki, Yasuyo; Sinharay, Sandip – ETS Research Report Series, 2013

This study investigates the value of reporting the reading, listening, speaking, and writing section scores for the "TOEFL iBT"® test, focusing on 4 related aspects of the psychometric quality of the TOEFL iBT section scores: reliability of the section scores, dimensionality of the test, presence of distinct score profiles, and the…

Descriptors: Scores, Computer Assisted Testing, Factor Analysis, Correlation

Rater Expertise in a Second Language Speaking Assessment: The Influence of Training and Experience

Direct link

Davis, Lawrence Edward – ProQuest LLC, 2012

Speaking performance tests typically employ raters to produce scores; accordingly, variability in raters' scoring decisions has important consequences for test reliability and validity. One such source of variability is the rater's level of expertise in scoring. Therefore, it is important to understand how raters' performance is influenced by…

Descriptors: Evaluators, Expertise, Scores, Second Language Learning

Developing Analytic Rating Guides for "TOEFL iBT"® Integrated Speaking Tasks. "TOEFL iBT"® Research Report, TOEFL iBT-20. ETS Research Report. RR-13-13

Peer reviewed
PDF on ERIC

Download full text

Jamieson, Joan; Poonpon, Kornwipa – ETS Research Report Series, 2013

Research and development of a new type of scoring rubric for the integrated speaking tasks of "TOEFL iBT"® are described. These "analytic rating guides" could be helpful if tasks modeled after those in TOEFL iBT were used for formative assessment, a purpose which is different from TOEFL iBT's primary use for admission…

Descriptors: Oral Language, Language Proficiency, Scaling, Scores

Toward Automated Multi-Trait Scoring of Essays: Investigating Links among Holistic, Analytic, and Text Feature Scores

Peer reviewed

Direct link

Lee, Yong-Won; Gentile, Claudia; Kantor, Robert – Applied Linguistics, 2010

The main purpose of the study was to investigate the distinctness and reliability of analytic (or multi-trait) rating dimensions and their relationships to holistic scores and "e-rater"[R] essay feature variables in the context of the TOEFL[R] computer-based test (TOEFL CBT) writing assessment. Data analyzed in the study were holistic…

Descriptors: Writing Evaluation, Writing Tests, Scoring, Essays

Practice on Assessing Grammar and Vocabulary: The Case of the TOEFL

Download full text

Zhuang, Xin – Online Submission, 2008

The Test of English as a Foreign Language (TOEFL) brings tremendous influence to EFL (English as a Foreign Language) learners worldwide. TOEFL 2000 project claims that TOEFL, as a more reflective of communicative model, could provide more information about international students' language ability that it is supposed to measure. However, after…

Descriptors: Second Languages, Computer Assisted Testing, Construct Validity, Test Reliability

Test Review: Test of English as a Foreign Language[TM]--Internet-Based Test (TOEFL iBT[R])

Peer reviewed

Direct link

Alderson, J. Charles – Language Testing, 2009

In this article, the author reviews the TOEFL iBT which is the latest version of the TOEFL, whose history stretches back to 1961. The TOEFL iBT was introduced in the USA, Canada, France, Germany and Italy in late 2005. Currently the TOEFL test is offered in two testing formats: (1) Internet-based testing (iBT); and (2) paper-based testing (PBT).…

Descriptors: Oral Language, Writing Tests, Listening Comprehension Tests, Test Reviews

Analytic Scoring of TOEFL® CBT Essays: Scores from Humans and "E-rater"®. TOEFL® Research Reports. RR-81. ETS RR-08-01

Peer reviewed
PDF on ERIC

Download full text

Lee, Yong-Won; Gentile, Claudia; Kantor, Robert – ETS Research Report Series, 2008

The main purpose of the study was to investigate the distinctness and reliability of analytic (or multitrait) rating dimensions and their relationships to holistic scores and "e-rater"® essay feature variables in the context of the TOEFL® computer-based test (CBT) writing assessment. Data analyzed in the study were analytic and holistic…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Scoring

Previous Page | Next Page »

Pages: 1 | 2

Lee, Yong-Won	3
Gentile, Claudia	2
Kantor, Robert	2
Manalo, Jonathan R.	2
Sawaki, Yasuyo	2
Sinharay, Sandip	2
Wolfe, Edward W.	2
Alderson, J. Charles	1
Attali, Yigal	1
Bejar, Isaac I.	1
Camp, Roberta	1
Carlson, Sybil B.	1
Casabianca, Jodi M.	1
Clevinger, Amanda	1
Crossley, Scott	1
Davis, Larry	1
Davis, Lawrence Edward	1
Edward Paul Getman	1
Haberman, Shelby J.	1
Hemat, Ramin	1
Jamieson, Joan	1
Kang, Okim	1
Keller, Stefan	1
Kermad, Alyssa	1
Kim, YouJin	1
More ▼