Showing all 7 results
Peer reviewed
Polat, Murat – International Online Journal of Education and Teaching, 2022
Foreign language testing is a multi-dimensional phenomenon and obtaining objective and error-free scores on learners' language skills is often problematic. While assessing foreign language performance on high-stakes tests, using different testing approaches including Classical Test Theory (CTT), Generalizability Theory (GT) and/or Item Response…
Descriptors: Second Language Learning, Second Language Instruction, Item Response Theory, Language Tests
Peer reviewed
Longabach, Tanya; Peyton, Vicki – Language Testing, 2018
K-12 English language proficiency tests that assess multiple content domains (e.g., listening, speaking, reading, writing) often have subsections based on these content domains; scores assigned to these subsections are commonly known as subscores. Testing programs face increasing customer demands for the reporting of subscores in addition to the…
Descriptors: Comparative Analysis, Test Reliability, Second Language Learning, Language Proficiency
Peer reviewed
Zhao, Ping; Ji, Xiaoli – RELC Journal: A Journal of Language Teaching and Research, 2018
This article provides preliminary validity evidence for the shorter Mandarin version of the Vocabulary Size Test (VST) under the content aspect, technical quality, substantive and generalizability aspect of Messick's (1995) construct validity framework. The shorter version with 177 Chinese university students in three proficiency levels indicates…
Descriptors: Language Tests, Test Validity, Mandarin Chinese, Second Language Learning
Peer reviewed
Retnawati, Heri – Turkish Online Journal of Educational Technology - TOJET, 2015
This study aimed to compare the accuracy of the test scores as results of Test of English Proficiency (TOEP) based on paper and pencil test (PPT) versus computer-based test (CBT). Using the participants' responses to the PPT documented from 2008-2010 and data of CBT TOEP documented in 2013-2014 on the sets of 1A, 2A, and 3A for the Listening and…
Descriptors: Scores, Accuracy, Computer Assisted Testing, English (Second Language)
Ellis, David P. – ProQuest LLC, 2011
The current version of the International Language Testing Association (ILTA) Guidelines for Practice requires language testers to pretest items before including them on an exam, or when pretesting is not possible, to conduct post-hoc item analysis to ensure any malfunctioning items are excluded from scoring. However, the guidelines are devoid of…
Descriptors: Item Response Theory, High Stakes Tests, College Entrance Examinations, Item Analysis
Peer reviewed
Salmani-Nodoushan, Mohammad Ali – Journal on Educational Psychology, 2009
A good test is one that has at least three qualities: reliability, or the precision with which a test measures what it is supposed to measure; validity, i.e., if the test really measures what it is supposed to measure; and practicality, or if the test, no matter how sound theoretically, is practicable in reality. These are the sine qua non for any…
Descriptors: Generalizability Theory, Testing, Language Tests, Item Response Theory
Salmani-Nodoushan, Mohammad Ali – Online Submission, 2009
A good test is one that has at least three qualities: reliability, or the precision with which a test measures what it is supposed to measure; validity, i.e., if the test really measures what it is supposed to measure; and practicality, or if the test, no matter how sound theoretically, is practicable in reality. These are the sine qua non for…
Descriptors: Generalizability Theory, Testing, Language Tests, Item Response Theory