Braun, Carl; And Others – 1986
Eight fifth grade students participated in a study that investigated the effects of test context on test performance and subjects' perceptions of tests in varying contexts. In the first three contexts, reading of words in isolation was tested using materials drawn from three different informal reading inventories equivalent in difficulty level.…
Descriptors: Context Clues, Evaluation Methods, Grade 5, Intermediate Grades
Hambleton, Ronald K. – 1986
The problem of determining optimal test lengths with fixed total testing time has proved to be a difficult one for criterion-referenced test developers. An algorithm is needed which can be used by test developers to allocate available testing time to maximize the validity of their total criterion-referenced tests or testing programs. To be…
Descriptors: Algorithms, Criterion Referenced Tests, Elementary Secondary Education, Psychometrics
Federico, Pat-Anthony – 1989
To determine the relative reliabilities and validities of paper-based and computer-based measurement procedures, 83 male student pilots and radar intercept officers were administered computer and paper-based tests of aircraft recognition. The subject matter consisted of line drawings of front, side, and top silhouettes of aircraft. Reliabilities…
Descriptors: Armed Forces, Comparative Analysis, Computer Assisted Testing, Discriminant Analysis
North Carolina State Dept. of Public Instruction, Raleigh. Div. of Research. – 1988
The North Carolina Test of Algebra II was developed for use as an achievement test following the completion of the Algebra II course of study. Its design serves two purposes: (1) a normative measure of student achievement; and (2) an objective-based measurement of curriculum coverage. The test's curricular validity, content validity, instructional…
Descriptors: Achievement Tests, Algebra, Curriculum Evaluation, Mathematics Achievement
O'Reilly, Patrick A.; And Others – 1980
After introductory material examining student skepticism toward testing and outlining the major problems encountered by instructors in test construction and grading, this monograph explores the optimum use of testing and performance evaluations in competency-based vocational education programs. Background information is then presented, including…
Descriptors: Community Colleges, Competency Based Education, Evaluation Methods, Grades (Scholastic)
Wiener, Harvey S. – Writing Program Administration Journal, 1986
Suggests that the current tests for writers form a paradigm for how other liberal studies faculties can shape assessment programs that reflect the important tenets of their disciplines, because these tests have grown from sound academic principles, rather than from the expediencies of budget offices, campus testing services, or legislative…
Descriptors: Criterion Referenced Tests, Elementary Secondary Education, Liberal Arts, Postsecondary Education
Peer reviewed: Kinicki, Angelo J.; And Others – Educational and Psychological Measurement, 1985
Using both the Behaviorally Anchored Rating Scales (BARS) and the Purdue University Scales, 727 undergraduates rated 32 instructors. The BARS had less halo effect, more leniency error, and lower interrater reliability. Both formats were valid. The two tests did not differ in ratee discrimination or susceptibility to rating bias. (Author/GDC)
Descriptors: Behavior Rating Scales, College Faculty, Comparative Testing, Higher Education
Sireci, Stephen G.; Foster, David F.; Robin, Frederic; Olsen, James – 1997
Evaluating the comparability of a test administered in different languages is a difficult, if not impossible, task. Comparisons are problematic because observed differences in test performance between groups who take different language versions of a test could be due to a difference in difficulty between the tests, to cultural differences in test…
Descriptors: Adaptive Testing, Adults, Certification, Comparative Analysis
Kopriva, Rebecca; Sexton, Ursula M. – 1999
To date, little work has been done to ensure limited English proficient (LEP) students are accurately assessed on a large scale. The purpose of this guide is to help scorers in high volume situations to be able to effectively evaluate the open-ended responses of this population. Section one of this guide presents a brief overview of the State…
Descriptors: English (Second Language), Examiners, Factor Analysis, Limited English Speaking
Peer reviewed: Aikenhead, Glen S. – Journal of Research in Science Teaching, 1988
Explores the sources of beliefs about science-technology-society topics and investigates the degree of ambiguity harbored by four different response modes: Likert-type, written paragraph, semistructured interview, and empirically developed multiple choice. Finds that mass media have a greater impact than science classes. (Author/YP)
Descriptors: Attitude Measures, Beliefs, Foreign Countries, Science and Society
Bruno, Rachelle M.; Walker, Stephen C. – Diagnostique, 1999
This article describes the Comprehensive Test of Phonological Processing, an instrument designed to assess the phonological processing skills of individuals between the ages of 5 years and 24 years 11 months and to identify those with phonological processing difficulties. Its administration, standardization, reliability, and validity are discussed.…
Descriptors: Adolescents, Children, Disability Identification, Language Impairments
Bachor, Dan G. – Diagnostique, 1999
This article reviews the KeyMath Revised-Normative Update, in its American and Canadian editions, and a draft version of the pending 1999 revision. The assessment instrument is intended to test basic mathematical concepts, operations, and applications of students in grades K-12. Its administration, standardization, reliability, and validity are…
Descriptors: Disabilities, Elementary Secondary Education, Foreign Countries, Mathematical Concepts
Embretson, Susan E. – Measurement: Interdisciplinary Research and Perspectives, 2004
The last century was marked by dazzling changes in many areas, such as technology and communications. Predictions into the second century of testing are seemingly difficult in such a context. Yet, looking back to the turn of the last century, Kirkpatrick (1900), in his American Psychological Association presidential address, presented fundamental…
Descriptors: Ability, Testing, Futures (of Society), Psychometrics
Schoenfeld, Alan H. – Measurement: Interdisciplinary Research and Perspectives, 2007
The authors of this volume's stimulus papers have taken on the challenge of developing measures of teachers' mathematical knowledge for teaching (MKT). This task involves multiple decisions and considerations, including: (1) How does one specify the body of knowledge being assessed? What warrants are offered for those choices?; (2) How does one…
Descriptors: Test Validity, Psychometrics, Test Construction, Evaluation Research
Larson, Jerry W. – 1985
A study at Brigham Young University (Utah) investigated the feasibility of computer-assisted language placement testing in higher education. Benefits and problems of this approach for test administration, individualization of item selection, and recordkeeping were examined. Four steps were followed in production of a test for Spanish placement:…
Descriptors: College Second Language Programs, Computer Assisted Testing, Higher Education, Language Tests

