ERIC - Search Results

Showing all 14 results

Concurrent Validity of LLAMA_F: Measure of Language Analytic Ability as a Predictor of Morphosyntax Knowledge

Peer reviewed

Kim, Peter – Language Teaching Research Quarterly, 2021

Foreign language aptitude is defined as one's potential to learn a second language. A language learner with higher aptitude is predicted to learn more, faster, and reach a higher level of proficiency. If this is the case, one way to validate the construct of aptitude and its measure is to conduct a validation study in which measures of aptitude is…

Descriptors: Morphology (Languages), Syntax, Second Language Learning, Second Language Instruction

The Comparison of Accuracy Scores on the Paper and Pencil Testing vs. Computer-Based Testing

Peer reviewed

Retnawati, Heri – Turkish Online Journal of Educational Technology - TOJET, 2015

This study aimed to compare the accuracy of the test scores as results of Test of English Proficiency (TOEP) based on paper and pencil test (PPT) versus computer-based test (CBT). Using the participants' responses to the PPT documented from 2008-2010 and data of CBT TOEP documented in 2013-2014 on the sets of 1A, 2A, and 3A for the Listening and…

Descriptors: Scores, Accuracy, Computer Assisted Testing, English (Second Language)

Use of e-rater® in Scoring of the TOEFL iBT® Writing Test. Research Report. ETS RR-11-25

Haberman, Shelby J. – Educational Testing Service, 2011

Alternative approaches are discussed for use of e-rater® to score the TOEFL iBT® Writing test. These approaches involve alternate criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…

Descriptors: Writing Tests, Scoring, Essays, Language Tests

Measurement Theory in Language Testing: Past Traditions and Current Trends

Peer reviewed

Salmani-Nodoushan, Mohammad Ali – Journal on Educational Psychology, 2009

A good test is one that has at least three qualities: reliability, or the precision with which a test measures what it is supposed to measure; validity, i.e., if the test really measures what it is supposed to measure; and practicality, or if the test, no matter how sound theoretically, is practicable in reality. These are the sine qua non for any…

Descriptors: Generalizability Theory, Testing, Language Tests, Item Response Theory

Measurement Theory in Language Testing: Past Traditions and Current Trends

Salmani-Nodoushan, Mohammad Ali – Online Submission, 2009

A good test is one that has at least three qualities: reliability, or the precision with which a test measures what it is supposed to measure; validity, i.e., if the test really measures what it is supposed to measure; and practicality, or if the test, no matter how sound theoretically, is practicable in reality. These are the sine qua non for…

Descriptors: Generalizability Theory, Testing, Language Tests, Item Response Theory

Decision Dependability of Subtests, Tests, and the Overall TOEFL Test Battery.

Brown, James Dean; Ross, Jacqueline A. – 1993

This study investigates the Test of English as a Foreign Language (TOEFL), in particular the relative contributions to score dependability (analogous to classical theory reliability) of various numbers of items and subtests as well as the decision dependability at different cut points. Research questions that apply to the overall TOEFL battery and…

Descriptors: English (Second Language), Language Tests, Statistical Analysis, Test Reliability

Score Equating and Nominally Parallel Language Tests.

Moy, Raymond – 1982

Score equating requires that the forms to be equated are functionally parallel. That is, the two test forms should rank order examinees in a similar fashion. In language proficiency testing situations, this assumption is often put into doubt because of the numerous tests that have been proposed as measures of language proficiency and the…

Descriptors: Equated Scores, Language Proficiency, Language Tests, Latent Trait Theory

The Language Tester's Statistical Toolbox.

Peer reviewed

Davidson, Fred – System, 2000

Statistical analysis tools in language testing are described, chiefly classical test theory and item response theory. Computer software for statistical analysis is briefly reviewed and divided into three tiers: commonly available; statistical packages; and specialty software. (Author/VWL)

Descriptors: Computer Software, Language Tests, Second Language Learning, Statistical Analysis

Le Modele de Rasch: les principes sous-jacents et son application a la validation de tests (The Rasch Model: Underlying Principles and Application to Test Validation).

de Jong, John H. A. L. – Taaltoetsen: Toegepaste taalwetenschap in artikelen 31, 1988

The one-parameter psychometric model known as the Rasch model is described and examined. The basic principles underlying the model and the concepts of unidimensionality, local stochastic independence, and additivity are explained in non-mathematical terms. The requirements of measurement procedures, the measurement of latent traits, the control on…

Descriptors: English (Second Language), French, Language Tests, Listening Comprehension Tests

Comparative Analyses of English as a Second Language Reading Comprehension Data: Classical Test Theory and Latent Trait Measurement.

Peer reviewed

Perkins, Kyle; Miller, Leah D. – Language Testing, 1984

Describes a study which submitted data from a multiple-choice English as a second language reading comprehension test to classical test theory item analysis and latent trait measurement. The purpose was to identify weak items and to compare the number of weak items indicated by the two different approaches. (SED)

Descriptors: English (Second Language), Language Tests, Latent Trait Theory, Reading Comprehension

An Approach to Gain Score Dependability and Validity for Criterion-Referenced Language Tests.

Ross, Steven; Hua, Te-Fang – 1994

A general issue related to language program development involves the empirical rationalization of cut score decisions in criterion-referenced language tests. Cut score dependability focuses on the consistency of the decisions in repeated testing or the assessment of language learner performances. In this case, the issue is to determine the optimal…

Descriptors: Achievement Gains, Criterion Referenced Tests, English (Second Language), Higher Education

Developments in Language Testing.

Peer reviewed

Douglas, Dan – Annual Review of Applied Linguistics, 1995

Reviews recent theoretical, methodological, and analytical developments in language testing, focusing on more refined models of language ability, reliability and validity, performance testing, innovative test formats, new applications of Item Response Theory and Generalizability Theory to test performance. An annotated bibliography discusses seven…

Descriptors: Annotated Bibliographies, Evaluation Methods, Language Proficiency, Language Tests

An Investigation of Criterion-Referenced Tests Under Different Conditions of Sample Variability and Item Homogeneity.

Bernknopf, Stanley; Bashaw, W. L. – 1976

The present study was designed to examine whether or not traditional procedures concerning item selection and reliability are both applicable and appropriate for criterion-referenced (CR) tests. It was also designed to examine traditional procedures and those designed especially for CR testing in relation to test variance and item homogeneity.…

Descriptors: Career Development, Comparative Analysis, Criterion Referenced Tests, Item Analysis

Investigating Variability in Tasks and Rater Judgments in a Performance Test of Foreign Language Speaking.

Bachman, Lyle F.; And Others – 1993

This paper outlines the development of a performance assessment measure of language speaking ability, the Language Ability Assessment System (LAAS), which is highly reliable and can be examined for reliability through modern measurement theories, such as generalizability theory (G-theory) and the many-facet Rasch theory. LAAS was developed to…

Descriptors: College Students, Higher Education, Interrater Reliability, Language Proficiency