Peer reviewed: Fleming, Dan B. – Peabody Journal of Education, 1977
Descriptors: Accountability, Evaluation Methods, Social Studies, Standardized Tests
Peer reviewed: Sawin, Enoch I. – Studies in Educational Evaluation, 1976
Problems associated with current expertise in evaluation are discussed. Since evaluators are not always able to reliably achieve all levels of an evaluation project, these tasks are categorized into five levels of complexity. The author suggests a more accurate label for evaluators, "descriptive inquiry specialists," and includes guidelines for…
Descriptors: Curriculum Evaluation, Elementary Secondary Education, Evaluation Criteria, Evaluation Methods
Peer reviewed: Hanna, Gerald S. – Journal of Educational Measurement, 1977
The effects of providing total and partial immediate feedback to pupils in multiple choice testing were investigated with fifth and sixth grade pupils. The split-half reliability was higher with total feedback than with no feedback. Concurrent validity with a completion test showed all three settings to be nearly identical. (Author/JKS)
Descriptors: Elementary Education, Elementary School Students, Feedback, Forced Choice Technique
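As a rough illustration of the split-half reliability mentioned in the abstract above, the sketch below correlates odd-item and even-item half scores and then applies the Spearman-Brown correction to estimate full-length reliability. The odd/even split, the function names, and the sample data are assumptions for illustration only; the abstract does not say how the halves were formed.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(responses):
    """responses: one list of 0/1 item scores per examinee (hypothetical layout)."""
    odd_half = [sum(r[0::2]) for r in responses]   # items 1, 3, 5, ...
    even_half = [sum(r[1::2]) for r in responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson(odd_half, even_half))


# Made-up data: three examinees, four items each.
sample = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]
print(split_half_reliability(sample))
```

The correction matters because each half is only half as long as the full test, so the raw half-to-half correlation understates the reliability of the whole instrument.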
Peer reviewed: Lord, Frederic M. – Journal of Educational Measurement, 1977
Two approaches for determining the optimal number of choices for a test item, presently in the literature, are compared with two new approaches. (Author)
Descriptors: Forced Choice Technique, Latent Trait Theory, Multiple Choice Tests, Test Items
Peer reviewed: Lord, Frederic M. – Journal of Educational Measurement, 1977
A variety of practical applications of item characteristic curve test theory are discussed. Among these applications are tailored testing, two stage testing, determining whether two tests measure the same latent trait, and measuring item bias towards minority or other groups. (Author/JKS)
Descriptors: Computer Programs, Latent Trait Theory, Mastery Tests, Measurement
Peer reviewed: Burns, Edward – Journal of School Psychology, 1977
Studied the degree to which skewed score distributions can affect the interpretation of the Illinois Test of Psycholinguistic Abilities (ITPA). Results suggest that indices of score variability, such as average deviation and standard scores, must be interpreted with extreme caution when skewness is a significant factor. (Author)
Descriptors: Diagnostic Tests, Individual Psychology, Perception Tests, Psycholinguistics
Peer reviewed: Arndt, William B. – Journal of Speech and Hearing Disorders, 1977
In evaluating the Northwestern Syntax Screening Test (a test for assessing expressive and receptive grammar in preschool and primary age children), the author points out problems with the test norms, reliability, and validity. (SBH)
Descriptors: Early Childhood Education, Grammar, Language Tests, Screening Tests
Peer reviewed: Byrne, Margaret C. – Journal of Speech and Hearing Disorders, 1977
The author responds to W. Arndt's criticisms of the Northwestern Syntax Screening Test, a test for assessing receptive and expressive grammar in young children. (SBH)
Descriptors: Early Childhood Education, Grammar, Language Tests, Screening Tests
Peer reviewed: Cairns, E. – British Journal of Educational Psychology, 1977
It would appear that there is a lack of convincing evidence especially regarding the reliability of the Matching Familiar Figures test over short intervals and with older children. As the test is now being used for diagnostic purposes in education, more information is required, and here the MFF is examined in older children using a split-half…
Descriptors: Cognitive Ability, Educational Psychology, Elementary School Students, Information Processing
Peer reviewed: Hawthorne, Linda White; Larsen, Stephen C. – Journal of Learning Disabilities, 1977
Descriptors: Exceptional Child Research, Kindergarten, Learning Disabilities, Prediction
Peer reviewed: Brown, Eleese V. – Perceptual and Motor Skills, 1977
Descriptors: Early Childhood Education, Elementary Education, Freehand Drawing, General Education
Peer reviewed: Sroufe, L. Alan – Child Development, 1977
This article reviews the literature on infants' reactions to strangers, focusing on issues of assessment, reliability, and stability. (JMB)
Descriptors: Developmental Stages, Infant Behavior, Infants, Literature Reviews
Peer reviewed: Kavale, Kenneth; Hirshoren, Alfred – Reading Horizons, 1977
Examines why profile analysis "adds spurious specificity and misarticulated authority to quasi-diagnostic statements." Concludes that the difficulties inherent in diagnostic reading tests suggest that they are best utilized in comparing a child's performance with a norm group. (JM)
Descriptors: Diagnostic Tests, Profiles, Reading Diagnosis, Reading Tests
Peer reviewed: Pyrczak, Fred – Journal of Reading, 1977
Students know more than they think they know, so guessing gives better scores even when there's a penalty for errors. (JM)
Descriptors: Guessing (Tests), Multiple Choice Tests, Reading Research, Reading Tests
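The arithmetic behind the claim above can be sketched with the conventional correction-for-guessing formula, score = right - wrong / (k - 1) for k-choice items. Whether this is the exact penalty the article considered is an assumption, as are the function names and the numbers used here. Under this formula a blind guess has an expected gain of zero, while a guess made after ruling out even one option has a positive expected gain.

```python
def formula_score(right, wrong, n_choices):
    """Conventional correction for guessing: omitted items neither add nor subtract."""
    return right - wrong / (n_choices - 1)


def expected_gain_per_guess(p_correct, n_choices):
    """Expected change in formula score from answering one item rather than omitting it."""
    return p_correct - (1 - p_correct) / (n_choices - 1)


# Four-choice items (illustrative numbers only).
blind = expected_gain_per_guess(1 / 4, 4)     # no knowledge: pure chance
informed = expected_gain_per_guess(1 / 3, 4)  # one distractor ruled out

print(formula_score(right=10, wrong=4, n_choices=5))
print(blind, informed)
```

If students hold partial knowledge they do not trust, their real chance of success on a "guess" exceeds 1/k, which is consistent with the abstract's point that guessing pays off despite the penalty.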
Peer reviewed: Tollefson, Nona – Educational and Psychological Measurement, 1987
This study compared the item difficulty, item discrimination, and test reliability of three forms of multiple-choice items: (1) one correct answer; (2) "none of the above" as a foil; and (3) "none of the above" as the correct answer. Twelve items in the three formats were administered in a college statistics examination. (BS)
Descriptors: Difficulty Level, Higher Education, Item Analysis, Multiple Choice Tests
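For readers unfamiliar with the item statistics named in the abstract above, here is a minimal sketch of two of them: item difficulty as the proportion of examinees answering correctly, and item discrimination as the correlation between an item and the rest of the test. The study's own computations are not described in the abstract, so the item-rest correlation, the function names, and the sample data are all assumptions.

```python
from statistics import mean


def item_difficulty(responses, item):
    """Proportion of examinees who answered the item correctly (the p-value)."""
    return mean([r[item] for r in responses])


def item_discrimination(responses, item):
    """Correlation between the item score and the rest score (total minus that item)."""
    x = [r[item] for r in responses]
    y = [sum(r) - r[item] for r in responses]
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when everyone scores the same
    return cov / (var_x * var_y) ** 0.5


# Made-up data: four examinees, three items scored 0/1.
sample = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
for i in range(3):
    print(i, item_difficulty(sample, i), item_discrimination(sample, i))
```

Using the rest score instead of the total score keeps the item from being correlated with itself, which would otherwise inflate the discrimination estimate on short tests.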


