Kim, Peter – Language Teaching Research Quarterly, 2021
Foreign language aptitude is defined as one's potential to learn a second language. A language learner with higher aptitude is predicted to learn more, faster, and reach a higher level of proficiency. If this is the case, one way to validate the construct of aptitude and its measure is to conduct a validation study in which measures of aptitude is…
Descriptors: Morphology (Languages), Syntax, Second Language Learning, Second Language Instruction
Retnawati, Heri – Turkish Online Journal of Educational Technology - TOJET, 2015
This study aimed to compare the accuracy of test scores from the Test of English Proficiency (TOEP) administered as a paper-and-pencil test (PPT) versus a computer-based test (CBT). Using the participants' responses to the PPT documented from 2008 to 2010 and CBT TOEP data documented in 2013-2014 on the sets of 1A, 2A, and 3A for the Listening and…
Descriptors: Scores, Accuracy, Computer Assisted Testing, English (Second Language)
Haberman, Shelby J. – Educational Testing Service, 2011
Alternative approaches are discussed for use of e-rater® to score the TOEFL iBT® Writing test. These approaches involve alternate criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…
Descriptors: Writing Tests, Scoring, Essays, Language Tests
Salmani-Nodoushan, Mohammad Ali – Journal on Educational Psychology, 2009
A good test is one that has at least three qualities: reliability, or the precision with which a test measures what it is supposed to measure; validity, i.e., if the test really measures what it is supposed to measure; and practicality, or if the test, no matter how sound theoretically, is practicable in reality. These are the sine qua non for any…
Descriptors: Generalizability Theory, Testing, Language Tests, Item Response Theory
Salmani-Nodoushan, Mohammad Ali – Online Submission, 2009
A good test is one that has at least three qualities: reliability, or the precision with which a test measures what it is supposed to measure; validity, i.e., if the test really measures what it is supposed to measure; and practicality, or if the test, no matter how sound theoretically, is practicable in reality. These are the sine qua non for…
Descriptors: Generalizability Theory, Testing, Language Tests, Item Response Theory
Brown, James Dean; Ross, Jacqueline A. – 1993
This study investigates the Test of English as a Foreign Language (TOEFL), in particular the relative contributions to score dependability (analogous to classical theory reliability) of various numbers of items and subtests as well as the decision dependability at different cut points. Research questions that apply to the overall TOEFL battery and…
Descriptors: English (Second Language), Language Tests, Statistical Analysis, Test Reliability
Moy, Raymond – 1982
Score equating requires that the forms to be equated are functionally parallel. That is, the two test forms should rank order examinees in a similar fashion. In language proficiency testing situations, this assumption is often put into doubt because of the numerous tests that have been proposed as measures of language proficiency and the…
Descriptors: Equated Scores, Language Proficiency, Language Tests, Latent Trait Theory

Davidson, Fred – System, 2000
Statistical analysis tools in language testing are described, chiefly classical test theory and item response theory. Computer software for statistical analysis is briefly reviewed and divided into three tiers: commonly available; statistical packages; and specialty software. (Author/VWL)
Descriptors: Computer Software, Language Tests, Second Language Learning, Statistical Analysis
de Jong, John H. A. L. – Taaltoetsen: Toegepaste taalwetenschap in artikelen 31, 1988
The one-parameter psychometric model known as the Rasch model is described and examined. The basic principles underlying the model and the concepts of unidimensionality, local stochastic independence, and additivity are explained in non-mathematical terms. The requirements of measurement procedures, the measurement of latent traits, the control on…
Descriptors: English (Second Language), French, Language Tests, Listening Comprehension Tests

Perkins, Kyle; Miller, Leah D. – Language Testing, 1984
Describes a study which submitted data from a multiple-choice English as a second language reading comprehension test to classical test theory item analysis and latent trait measurement. The purpose was to identify weak items and to compare the number of weak items indicated by the two different approaches. (SED)
Descriptors: English (Second Language), Language Tests, Latent Trait Theory, Reading Comprehension
Ross, Steven; Hua, Te-Fang – 1994
A general issue related to language program development involves the empirical rationalization of cut score decisions in criterion-referenced language tests. Cut score dependability focuses on the consistency of the decisions in repeated testing or the assessment of language learner performances. In this case, the issue is to determine the optimal…
Descriptors: Achievement Gains, Criterion Referenced Tests, English (Second Language), Higher Education

Douglas, Dan – Annual Review of Applied Linguistics, 1995
Reviews recent theoretical, methodological, and analytical developments in language testing, focusing on more refined models of language ability, reliability and validity, performance testing, innovative test formats, and new applications of Item Response Theory and Generalizability Theory to test performance. An annotated bibliography discusses seven…
Descriptors: Annotated Bibliographies, Evaluation Methods, Language Proficiency, Language Tests
Bernknopf, Stanley; Bashaw, W. L. – 1976
The present study was designed to examine whether or not traditional procedures concerning item selection and reliability are both applicable and appropriate for criterion-referenced (CR) tests. It was also designed to examine traditional procedures and those designed especially for CR testing in relation to test variance and item homogeneity.…
Descriptors: Career Development, Comparative Analysis, Criterion Referenced Tests, Item Analysis
Bachman, Lyle F.; And Others – 1993
This paper outlines the development of a performance assessment measure of language speaking ability, the Language Ability Assessment System (LAAS), which is highly reliable and can be examined for reliability through modern measurement theories, such as generalizability theory (G-theory) and the many-facet Rasch theory. LAAS was developed to…
Descriptors: College Students, Higher Education, Interrater Reliability, Language Proficiency