Hakyung Sung; Sooyeon Cho; Kristopher Kyle – Language Assessment Quarterly, 2024
Lexical diversity (LD) is an important indicator of second language lexical development. Much research has investigated LD indices, with a focus on learners of English. However, further research is needed in languages that are typologically distinct from English, such as Korean. In this study, we evaluated the reliability and validity of LD…
Descriptors: Second Language Learning, Korean, Persuasive Discourse, Language Tests
Jeff Allen; Ty Cruce – ACT Education Corp., 2025
This report summarizes some of the evidence supporting interpretations of scores from the enhanced ACT, focusing on reliability, concurrent validity, predictive validity, and score comparability. The authors argue that the evidence presented in this report supports the interpretation of scores from the enhanced ACT as measures of high school…
Descriptors: College Entrance Examinations, Testing, Change, Scores
Ying Xu; Xiaodong Li; Jin Chen – Language Testing, 2025
This article provides a detailed review of the Computer-based English Listening Speaking Test (CELST) used in Guangdong, China, as part of the National Matriculation English Test (NMET) to assess students' English proficiency. The CELST measures listening and speaking skills as outlined in the "English Curriculum for Senior Middle…
Descriptors: Computer Assisted Testing, English (Second Language), Language Tests, Listening Comprehension Tests
Isbell, Daniel R.; Kremmel, Benjamin – Language Testing, 2020
Administration of high-stakes language proficiency tests has been disrupted in many parts of the world as a result of the 2019 novel coronavirus pandemic. Institutions that rely on test scores have been forced to adapt, and in many cases this means using scores from a different test, or a new online version of an existing test, that can be taken…
Descriptors: Language Tests, High Stakes Tests, Language Proficiency, Second Language Learning
Norris, John; Drackert, Anastasia – Language Testing, 2018
The Test of German as a Foreign Language (TestDaF) plays a critical role as a standardized test of German language proficiency. Developed and administered by the Society for Academic Study Preparation and Test Development (g.a.s.t.), TestDaF was launched in 2001 and has experienced persistent annual growth, with more than 44,000 test takers in…
Descriptors: German, Second Language Learning, Language Tests, Language Proficiency
Steinkamp, Susan Christa – ProQuest LLC, 2017
For test scores that rely on the accurate estimation of ability via an IRT model, their use and interpretation is dependent upon the assumption that the IRT model fits the data. Examinees who do not put forth full effort in answering test questions, have prior knowledge of test content, or do not approach a test with the intent of answering…
Descriptors: Test Items, Item Response Theory, Scores, Test Wiseness
Sabatini, J.; O'Reilly, T.; Halderman, L.; Bruce, K. – Grantee Submission, 2014
Existing reading comprehension assessments have been criticized by researchers, educators, and policy makers, especially regarding their coverage, utility, and authenticity. The purpose of the current study was to evaluate a new assessment of reading comprehension that was designed to broaden the construct of reading. In light of these issues, we…
Descriptors: Reading Comprehension, Vignettes, Reading Tests, Elementary School Students

Christiansen, Neil D.; And Others – Educational and Psychological Measurement, 1996
The usefulness of examining the structural validity of scores on multidimensional measures using nested hierarchical model comparisons was evaluated in 2 studies using the Social Problem Solving Inventory (SPSI) with samples of 464 and 216 undergraduates. Results support the conceptual model of the SPSI. (SLD)
Descriptors: Comparative Analysis, Construct Validity, Higher Education, Interpersonal Relationship

Kipps, Debi; Hanson, Dave – School Psychology Review, 1983
The Peabody Picture Vocabulary Test-Revised (Dunn and Dunn) is described as a convenient, quick test, possessing improvements over the original. It measures a subject's receptive (hearing) vocabulary for Standard American English. However, the validity information for the test is less than adequate, since no validity studies are presented for it.…
Descriptors: Auditory Tests, Individual Testing, Scores, Test Length
Wise, Steven L. – Applied Measurement in Education, 2006
In low-stakes testing, the motivation levels of examinees are often a matter of concern to test givers because a lack of examinee effort represents a direct threat to the validity of the test data. This study investigated the use of response time to assess the amount of examinee effort received by individual test items. In 2 studies, it was found…
Descriptors: Computer Assisted Testing, Motivation, Test Validity, Item Response Theory
Campbell, Todd; And Others – 1995
In the early 1970s A. Constantinople wrote a seminal article that led to the development of the construct of psychological androgyny. The Bem Sex-Role Inventory is a popular measure of the construct, but the measure remains controversial. The construct validity of scores from the measure was explored using confirmatory factor analysis on data from…
Descriptors: Androgyny, College Students, Construct Validity, Factor Structure
Hambleton, Ronald K. – 1986
The problem of determining optimal test lengths with fixed total testing time has proved to be a difficult one for criterion-referenced test developers. An algorithm is needed which can be used by test developers to allocate available testing time to maximize the validity of their total criterion-referenced tests or testing programs. To be…
Descriptors: Algorithms, Criterion Referenced Tests, Elementary Secondary Education, Psychometrics

Linn, Robert L.; Hambleton, Ronald K. – Applied Measurement in Education, 1991
Four main approaches to customized testing are described, and their resulting scores' valid uses and interpretations are discussed. Customized testing can yield valid normative and curriculum-specific information, although cautious application is needed to avoid misleading inferences about student achievement. (SLD)
Descriptors: Academic Achievement, Accountability, Criterion Referenced Tests, Curriculum
Jones, Brett D.; Egley, Robert J. – ERS Spectrum, 2005
The purpose of this paper is to discuss Florida teachers' recommendations for improving the Florida Comprehensive Assessment Test (FCAT) and to compare their recommendations with those of Florida administrators. Although teachers' suggestions varied as to the types and extent of remedies needed to improve the FCAT, some common themes emerged. The…
Descriptors: Test Results, Core Curriculum, Student Evaluation, Accountability
Ragosta, Marjorie; Nelson, Catherine – 1986
The Test of English as a Foreign Language (TOEFL) was administered to 26 hearing impaired college students, in order to test the assumption that the English-language deficiencies of hearing impaired students are similar to those of foreign students. The students were attending Gallaudet College's School of Preparatory Studies and were identified…
Descriptors: American Sign Language, College Entrance Examinations, Deaf Interpreting, Deafness