Sara T. Cushing – ETS Research Report Series, 2025
This report provides an in-depth comparison of TOEFL iBT® and the Duolingo English Test (DET) in terms of the degree to which both tests assess academic language proficiency in listening, reading, writing, and speaking. The analysis is based on publicly available documentation on both tests, including sample test questions available on the test…
Descriptors: Language Tests, English (Second Language), Second Language Learning, Academic Language
Peng, Yue; Yan, Wei; Cheng, Liying – Language Testing, 2021
This test review focuses on the current version (2009) of [Chinese characters omitted] (Hanyu Shuiping Kaoshi), literally translated as the Chinese Language Proficiency Test and abbreviated as HSK. Tailored to non-native speakers of the Chinese language, this test consists of six proficiency levels (Levels 1 and 2 as beginners, Levels 3 and 4 as…
Descriptors: Language Proficiency, Language Tests, Chinese, Decision Making
Papageorgiou, Spiros; Davis, Larry; Norris, John M.; Garcia Gomez, Pablo; Manna, Venessa F.; Monfils, Lora – Educational Testing Service, 2021
The TOEFL® Essentials™ test is a new English language proficiency test in the TOEFL® family of assessments. It measures foundational language skills and communication abilities in academic and general (daily life) contexts. The test covers the four language skills of reading, listening, writing, and speaking and is intended…
Descriptors: Language Tests, English (Second Language), Second Language Learning, Language Proficiency
Ji-young Shin – ProQuest LLC, 2021
The present dissertation investigated the impact of scales/scoring methods and prompt linguistic features on the measurement quality of L2 English elicited imitation (EI). Scales/scoring methods are an important feature for the validity and reliability of L2 EI tests, but little is known about them (Yan et al., 2016). Prompt linguistic features are also known…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Semantics
Collier, Jo-Kate; Huang, Becky – Language Assessment Quarterly, 2020
This article presents a critical review of the Texas English Language Proficiency Assessment System (TELPAS), a large scale standardized English language proficiency (ELP) assessment developed by the Texas Education Agency (TEA) and administered since 2004. TELPAS is used as an annual summative assessment for all English Learners (ELs) in grades…
Descriptors: English (Second Language), Language Proficiency, Language Tests, Standardized Tests
Coniam, David; Lee, Tony; Milanovic, Michael; Pike, Nigel; Zhao, Wen – Language Education & Assessment, 2022
The calibration of test materials generally involves the interaction between empirical analysis and expert judgement. This paper explores the extent to which scale familiarity might affect expert judgement as a component of test validation in the calibration process. It forms part of a larger study that investigates the alignment of the…
Descriptors: Specialists, Language Tests, Test Validity, College Faculty
Davis, Larry; Norris, John – ETS Research Report Series, 2021
The elicited imitation task (EIT), in which language learners listen to a series of spoken sentences and repeat each one verbatim, is a commonly used measure of language proficiency in second language acquisition research. The TOEFL® Essentials™ test includes an EIT as a holistic measure of speaking proficiency, referred to as the…
Descriptors: Task Analysis, Language Proficiency, Speech Communication, Language Tests
Faulkner-Bond, Molly; Sireci, Stephen G. – International Journal of Testing, 2015
Throughout the world, tests are administered to some examinees who are not fully proficient in the language in which they are being tested. It has long been acknowledged that proficiency in the language in which a test is administered often affects examinees' performance on a test. Depending on the context and intended uses for a particular…
Descriptors: Language Minorities, Test Validity, Language Proficiency, Test Construction
Montroy, Janelle J.; Zucker, Tricia A.; Assel, Michael M.; Landry, Susan H.; Anthony, Jason L.; Williams, Jeffrey M.; Hsu, Hsien-Yuan; Crawford, April; Johnson, Ursula Y.; Carlo, Maria S.; Taylor, Heather B. – Early Education and Development, 2020
There is a significant need for kindergarten entry assessments (KEA) that meet state education agency (SEA) requirements and are psychometrically sound measures of a broad range of school readiness domains such as language, literacy, math, science, executive function, and social-emotional skills. Research Findings: In this paper, we describe five…
Descriptors: Kindergarten, School Readiness, Student Evaluation, Test Construction
Ackerman, Debra L. – ETS Research Report Series, 2018
In this report I share the results of a document-based, comparative case study aimed at increasing our understanding about the potential utility of state kindergarten entry assessments (KEAs) to provide evidence of English learner (EL) kindergartners' knowledge and skills and, in turn, inform kindergarten teachers' instruction. Using a sample of 9…
Descriptors: English Language Learners, Kindergarten, School Readiness, Knowledge Level
Aviad-Levitzky, Tami; Laufer, Batia; Goldstein, Zahava – Language Assessment Quarterly, 2019
This article describes the development and validation of the new CATSS (Computer Adaptive Test of Size and Strength), which measures vocabulary knowledge in four modalities -- productive recall, receptive recall, productive recognition, and receptive recognition. In the first part of the paper we present the assumptions that underlie the test --…
Descriptors: Foreign Countries, Test Construction, Test Validity, Test Reliability
Chen, Jing; Zhang, Mo; Bejar, Isaac I. – ETS Research Report Series, 2017
Automated essay scoring (AES) generally computes essay scores as a function of macrofeatures derived from a set of microfeatures extracted from the text using natural language processing (NLP). In the e-rater® automated scoring engine, developed at Educational Testing Service (ETS) for the automated scoring of essays, each…
Descriptors: Computer Assisted Testing, Scoring, Automation, Essay Tests
Elicited Imitation as a Measure of Second Language Proficiency: A Narrative Review and Meta-Analysis
Yan, Xun; Maeda, Yukiko; Lv, Jing; Ginther, April – Language Testing, 2016
Elicited imitation (EI) has been widely used to examine second language (L2) proficiency and development and was an especially popular method in the 1970s and early 1980s. However, as the field embraced more communicative approaches to both instruction and assessment, the use of EI diminished, and the construct-related validity of EI scores as a…
Descriptors: Second Language Learning, Language Proficiency, Meta Analysis, Effect Size
Crosthwaite, Peter Robert; Raquel, Michelle – Language Assessment Quarterly, 2019
This study determines the fine-grained bottom-up linguistic features involved in successful second language (L2) English academic group oral tutorial discussion through the use of a spoken learner corpus composed of more than 20 hrs of L2 production. Student performances were graded by teacher-raters using a can-do rating scale, which assessed…
Descriptors: Computational Linguistics, Second Language Learning, Second Language Instruction, Error Patterns
Bogorevich, Valeriia – ProQuest LLC, 2018
Rater variation in performance assessment can impact test-takers' scores and compromise assessments' fairness and validity (Crooks, Kane, & Cohen, 1996); therefore, it is important to investigate raters' scoring patterns in order to inform rater training. Substantial work has…
Descriptors: Pronunciation, Familiarity, English (Second Language), Second Language Learning

