Peer reviewed: Donders, Jacques – Psychological Assessment, 1997
Eight subtests were selected from the Wechsler Intelligence Scale for Children--Third Edition (WISC-III) to make a short form for clinical use. Results from the 2,200 children in the WISC-III standardization sample indicated that the short form has adequate reliability and validity. (SLD)
Descriptors: Children, Clinical Diagnosis, Intelligence Tests, Test Format
Peer reviewed: Axelrod, Bradley N.; And Others – Psychological Assessment, 1996
The reliabilities that D. Schretlen, R. H. B. Benedict, and J. H. Bobholz (1994) calculated for a short form of the Wechsler Adult Intelligence Scale--Revised (WAIS-R) were consistently overestimated. More accurate values are provided for the WAIS-R and a seven-subtest short form. (SLD)
Descriptors: Error Correction, Error of Measurement, Estimation (Mathematics), Intelligence Tests
Lundervold, Duane A.; Dunlap, Angel L. – International Journal of Behavioral Consultation and Therapy, 2006
Alternate-forms reliability of the Behavioral Relaxation Scale (BRS; Poppen, 1998), a direct observation measure of relaxed behavior, was examined. A single BRS score, based on a long (5-minute) observation, has been found to be a valid measure of relaxation and is correlated with self-report and some physiological measures. Recently,…
Descriptors: Test Format, Intervals, Observation, Measures (Individuals)
Schuldberg, David – 1988
Indices were constructed to measure individual differences in the effects of the automated testing format and repeated testing on Minnesota Multiphasic Personality Inventory (MMPI) responses. Two types of instability measures were studied within a data set from the responses of 150 undergraduate students who took a computer-administered and…
Descriptors: College Students, Computer Assisted Testing, Higher Education, Individual Differences
Fishman, Judith – Writing Program Administration, 1984
Examines the CUNY-WAT program and questions many aspects of it, especially the choice and phrasing of topics. (FL)
Descriptors: Essay Tests, Higher Education, Test Format, Test Items
Peer reviewed: Stiggins, Richard J. – Research in the Teaching of English, 1982
Compares direct and indirect writing assessment strategies and contrasts them in terms of the relationship each has to specific classroom decision-making situations, the components of writing assessed, practical testing matters, characteristics of test exercises, test scoring procedures, and procedures for determining test quality. (HOD)
Descriptors: Comparative Analysis, Decision Making, Educational Assessment, Test Format
Peer reviewed: Berk, Ronald A. – Journal of Educational Measurement, 1980
A dozen different approaches that yield 13 reliability indices for criterion-referenced tests were identified and grouped into three categories: threshold loss function, squared-error loss function, and domain score estimation. Indices were evaluated within each category. (Author/RL)
Descriptors: Classification, Criterion Referenced Tests, Cutting Scores, Evaluation Methods
Peer reviewed: Wainer, Howard; Lukhele, Robert – Educational and Psychological Measurement, 1997
The reliability of scores from four forms of the Test of English as a Foreign Language (TOEFL) was estimated using a hybrid item response theory model. Overall reliability differed very little between the case in which the testlet items were assumed to be independent and the case in which their dependence was modeled. (Author/SLD)
Descriptors: English (Second Language), Item Response Theory, Scores, Second Language Learning
Peer reviewed: Melancon, Janet G.; Thompson, Bruce – Psychology in the Schools, 1989
Investigated the measurement characteristics of both forms of the Finding Embedded Figures Test (FEFT). College students (N=302) completed either both forms of the FEFT or one form of the FEFT and the Group Embedded Figures Test. Results suggest that the FEFT forms provide reasonably reliable and valid data. (Author/NB)
Descriptors: College Students, Field Dependence Independence, Higher Education, Multiple Choice Tests
Perez, Christina – Journal of College Admission, 2002
Spurred in part by University of California (UC) President Richard Atkinson's February 2001 proposal to drop the SAT I for UC applicants, more attention is being paid to other tests such as the SAT II and ACT. Proponents of these alternative exams argue that the SAT I is primarily an aptitude test measuring some vague concept of "inherent…
Descriptors: College Entrance Examinations, Test Reliability, Academic Achievement, Prediction
Guess, Pamela – Journal of Psychoeducational Assessment, 2006
The OMNI Personality Inventory (OMNI) is a self-report questionnaire designed for use with adolescents and adults between 18 and 74 years of age. The questionnaire is not based on a particular theory, consistent with current trends in test development, according to the author. An abbreviated form of the OMNI, the OMNI-IV Personality Disorder…
Descriptors: Personality Measures, Questionnaires, Adolescents, Adults
Boldt, R. F. – 1992
The Test of Spoken English (TSE) is an internationally administered instrument for assessing nonnative speakers' proficiency in speaking English. The research foundation of the TSE examination described in its manual refers to two sources of variation other than the achievement being measured: interrater reliability and internal consistency.…
Descriptors: Adults, Analysis of Variance, Interrater Reliability, Language Proficiency
Lowe, Pardee, Jr.; Liskin-Gasparro, Judith E. – 1986
The oral interview (OI) is a testing procedure that measures a wide range of speaking abilities in a foreign language. Although somewhat different versions are used in different testing situations, the OI always consists of a structured, face-to-face conversation on a variety of topics between a student and one or two testers. The resulting speech…
Descriptors: Interviews, Language Proficiency, Language Tests, Oral Language
Ebel, Robert L. – 1981
An alternate-choice test item is a simple declarative sentence, one portion of which is given with two different wordings. For example, "Foundations like Ford and Carnegie tend to be (1) eager (2) hesitant to support innovative solutions to educational problems." The examinee's task is to choose the alternative that makes the sentence…
Descriptors: Comparative Testing, Difficulty Level, Guessing (Tests), Multiple Choice Tests
Peer reviewed: Smith, Gudmund J. W.; Carlsson, Ingegerd – Journal of Creative Behavior, 1987
A new creativity test is described that is based on the percept-genetic theory, which presumes that percepts are built by ultra-short, mostly preconscious processes. The test uses a still-life with two main structures, which is shown in two series of presentations differing in length. Results of experiments using the test are described. (Author/KM)
Descriptors: Creative Thinking, Creativity, Creativity Research, Creativity Tests

