Waters, L. K.; And Others – Perceptual and Motor Skills, 1982 (peer reviewed)
Multitrait-multimethod analysis was performed on instructors' ratings from behaviorally anchored rating scales, graphic rating scales, and mixed-standard scales. Two samples of 100 undergraduate students were distinguished on the basis of whether the statements on the mixed-standard scale were behaviorally specific or more generic descriptions of…
Descriptors: Behavior Rating Scales, Discriminant Analysis, Higher Education, Interrater Reliability
Kupermintz, Haggai; Snow, Richard E. – American Educational Research Journal, 1997 (peer reviewed)
This study, third in a series using data from the National Education Longitudinal Study of 1988 (NELS:88), demonstrates the usefulness of multidimensional representation of mathematics achievement as it extends analyses through grade 12. Findings support a distinction between mathematical reasoning and knowledge for two of the three test forms.…
Descriptors: Achievement Tests, Educational Assessment, High Schools, Knowledge Level
Woods, Caroline; Neather, Ted – Language Learning Journal, 1994 (peer reviewed)
This article discusses issues involved in second language testing at the secondary school level in Britain. It focuses on the rationale for second language testing and the need to address criticisms of language tests, such as test validity and authenticity of tasks. (nine references) (MDM)
Descriptors: Foreign Countries, Language Tests, Second Language Instruction, Second Language Programs
Trigwell, Keith; Sleet, Ray – Assessment and Evaluation in Higher Education, 1990 (peer reviewed)
A study compared university chemistry students' (n=19) performances in creativity exercises, concept mapping, and traditional examinations. Performance in all three correlated with deep study strategy, but low correlations between the three methods suggest that they test different aspects of chemistry knowledge. Use and integration of all three…
Descriptors: Chemistry, Cognitive Structures, College Students, Comparative Analysis
Schriesheim, Chester A.; And Others – Educational and Psychological Measurement, 1991 (peer reviewed)
Effects of item wording on questionnaire reliability and validity were studied, using 280 undergraduate business students who completed a questionnaire comprising 4 item types: (1) regular; (2) polar opposite; (3) negated polar opposite; and (4) negated regular. Implications of results favoring regular and negated regular items are discussed. (SLD)
Descriptors: Business Education, Comparative Testing, Higher Education, Negative Forms (Language)
Charak, David A.; Stella, Jennifer L. – Assessment for Effective Intervention, 2002 (peer reviewed)
This article provides in-depth information regarding the most commonly used instruments for the screening or diagnosis of autistic spectrum disorders. Reliability, validity, format, and target population are presented to help clinicians select appropriate diagnostic measures. Future directions in the development of new instruments are discussed.…
Descriptors: Adolescents, Adults, Autism, Children
Chakwera, Elias; Khembo, Dafter; Sireci, Stephen G. – Education Policy Analysis Archives, 2004
In the United States, tests are held to high standards of quality. In developing countries such as Malawi, psychometricians must deal with these same high standards as well as several additional pressures such as widespread cheating, test administration difficulties due to challenging landscapes and poor resources, difficulties in reliably scoring…
Descriptors: Testing Programs, Testing, High Stakes Tests, Measurement
Kopriva, Rebecca J.; Wiley, David E.; Emick, Jessica – Online Submission, 2007
The goal of the current study was to examine the influence of providing more optimal testing conditions and evaluate the effect this has on the validity of the score inferences across ELL students with different needs, strengths, and levels of language proficiency. It was expected that the validity of the score inferences would be similar for 3rd…
Descriptors: Grade 5, Test Format, Inferences, Test Validity
Wise, Lauress – 1993
As high-stakes use of tests increases, it becomes vital that test developers and test users communicate clearly about the accuracy and limitations of the scores generated by a test after it is assembled and used. A procedure is described for portraying the accuracy of test scores. It can be used in setting accuracy targets during form construction…
Descriptors: Classification, High Stakes Tests, Item Response Theory, Military Personnel
Assessing the Effects of Computer Administration on Scores and Parameter Estimates Using IRT Models.
Sykes, Robert C.; And Others – 1991
To investigate the psychometric feasibility of replacing a paper-and-pencil licensing examination with a computer-administered test, a validity study was conducted. The computer-administered test (Cadm) was a common set of items for all test takers, distinct from computerized adaptive testing, in which test takers receive items appropriate to…
Descriptors: Adults, Certification, Comparative Testing, Computer Assisted Testing
Miller, Samuel D.; Smith, Donald E. P. – 1984
To test the assumption that questions measuring literal comprehension and those measuring inferential comprehension are equally valid indices for both oral and silent reading tests at all skill levels, questions from the Analytic Reading Inventory were classified as either literal or inferential. Subjects, 94 children in grades two to five, read…
Descriptors: Differences, Elementary Education, Oral Reading, Reading Ability
Sherman, Lawrence W.; And Others – 1988
A series of six papers and an introduction which present the results and tentative analyses of studies investigating such constructs as self-esteem, perceptions of control, and competence are included in this document. These papers are: (1) "Multiple Dimensions of Locus of Control and Their Relationship To Standardized Achievement Scores in Fifth…
Descriptors: Adolescents, Data Interpretation, Factor Analysis, Factor Structure
Knapp, Deirdre J.; Pliske, Rebecca M. – 1986
A study was conducted to validate the Army's Computerized Adaptive Screening Test (CAST), using data from 2,240 applicants from 60 Army recruiting stations across the nation. CAST is a computer-assisted adaptive test used to predict performance on the Armed Forces Qualification Test (AFQT). AFQT scores are computed by adding four subtest scores of…
Descriptors: Adaptive Testing, Adults, Aptitude Tests, Comparative Testing
Henk, William A. – 1983
The specific performance characteristics of eight alternative cloze test formats were examined at the fourth and sixth grade levels. At each grade, 64 subjects were randomly assigned to one of four basic treatments (every-fifth/standard, every-fifth/cued, total random/standard, and total random/cued) and tested. Responses on each of the cloze…
Descriptors: Cloze Procedure, Comparative Analysis, Grade 4, Grade 6
Milton, Ohmer – 1982
Educators are called upon to improve the quality of classroom tests to enhance the learning of content. Less faculty concern for tests than for other features of instruction, compounded by a lack of knowledge of how to assess different levels of learning with test questions that measure complex processes, appears to generate poor-quality classroom…
Descriptors: Educational Testing, Evaluation Methods, Higher Education, Learning Activities


