Thrash, Susan K.; Porter, Andrew C. – 1974
The purpose of this paper is to prove that one currently recommended method of obtaining the reliability of an instrument defined on a population of aggregate units is invalid. This method randomly splits the aggregate into two halves, correlates the two half unit scores by a Pearson product moment correlation coefficient, and corrects the…
Descriptors: Comparative Analysis, Correlation, Measurement Techniques, Sampling
Radocy, Rudolf E. – 1974
A need for measurement exists not only in academic events but also in the affective domain. The author presents a procedure for quantification of affective behavior. The procedure contains three stages. One is the conceptualization stage in which personal meanings of elements in the affective domain are examined. Another is the crucial…
Descriptors: Affective Behavior, Affective Measures, Measurement Techniques, Test Construction
Fitzgibbon, Thomas J. – 1975
In this speech, given at the 1976 annual breakfast of the National Council on Measurement in Education, Dr. Thomas J. Fitzgibbon, outgoing president of NCME, responds to critics of standardized testing and outlines its correct uses. He believes that many criticisms of standardized testing are due to misunderstanding or a lack of information.…
Descriptors: Achievement Tests, Standardized Tests, Test Reliability, Test Validity
DeGracie, James S.; Vicino, Frank L. – Educational Technology, 1977
Discusses categories of questionnaire response sets and the ability to interpret response differences when soliciting student attitudes. (DAG)
Descriptors: Questioning Techniques, Questionnaires, Response Style (Tests), Student Attitudes
McVey, P. J. – Assessment in Higher Education, 1976
The results of 16 pairs of "equivalent papers" were used to estimate the reliability of the papers and the extent to which each paper correlated with the year's average test grade. Estimates were also made of the worth of the grade for each paper as a predictor of true subject grades. It is shown that a "profile" of grades would mislead.…
Descriptors: Grades (Scholastic), Higher Education, Profiles, Reliability
Peer reviewed: Wishart, Jennifer G. – American Journal of Mental Deficiency, 1987
Twelve children with Down's syndrome (ages 3-5 years) were tested six times over 2.5 months on three Piagetian infant search tasks. Results suggested that cognitive ability of this population may be poorly measured by single-session testing and that caution is necessary when using tests designed for and validated on younger, nonretarded subjects.…
Descriptors: Cognitive Measurement, Downs Syndrome, Test Reliability, Testing Problems
Peer reviewed: Phillips, S. E.; Clarizio, Harvey F. – Educational Measurement: Issues and Practice, 1988
Two major problems related to the identification of learning disabilities with individually administered achievement tests are discussed: (1) the appropriateness of standard versus developmental scores for determining the severity of discrepancy; and (2) the limitations of existing developmental score scales. Characteristics of the developmental…
Descriptors: Achievement Tests, Diagnostic Tests, Learning Disabilities, Scores
Peer reviewed: Masters, Geofferey N. – Journal of Educational Measurement, 1988
High item discrimination can indicate a special kind of measurement disturbance via an item that gives high-ability persons a special advantage. The measurement disturbance is described, which occurs when an item is sensitive to individual differences on a second, undesired dimension that is correlated with the variable intended to be measured.…
Descriptors: Academically Gifted, Item Analysis, Test Bias, Test Wiseness
Peer reviewed: Osgood, Robert L. – Learning Disability Quarterly, 1984
The article reviews the origins of the intelligence testing movement in the U.S., discusses the difficulties inherent in measuring intelligence, and considers alternatives to current LD identification procedures. (CL)
Descriptors: Disability Identification, History, Intelligence, Intelligence Tests
Peer reviewed: Schulte, Ann; Borich, Gary D. – Journal of School Psychology, 1984
Presents reliability and standard error of measurement figures for several combinations of ability and achievement measures. Discusses the rates and types of errors that occur when such scores are used to classify children as learning-disabled. Three recommendations for using difference scores are given. (BH)
Descriptors: Children, Educational Diagnosis, Elementary Secondary Education, Learning Disabilities
Peer reviewed: Wilson, Margo E.; Byrne, Margaret C. – Journal of Communication Disorders, 1984
A reward was hidden under the stimulus picture representing the correct answer to a comprehension task. Accuracy and test-retest reliability of the responses of 34 two-year-olds were measured. Effects of the rewarded search procedure varied, depending on sex and the language structure tested. (Author/CL)
Descriptors: Comprehension, Language Acquisition, Reinforcement, Test Reliability
Peer reviewed: Evans, William – Journal of Experimental Education, 1984
The capacity of examinees to develop cue-using strategies was examined, and the results suggest that students profit from knowledge of a particular test constructor's idiosyncrasies. The findings also lend weight to the argument that performance on test wiseness items is cue-specific. (Author/BW)
Descriptors: Adults, Cues, Test Construction, Test Items
Peer reviewed: Gaffney, Richard F.; Maguire, Thomas O. – Journal of Educational Measurement, 1971
Descriptors: Elementary School Students, Scores, Test Validity, Test Wiseness
Peer reviewed: Sherrill, David; And Others – American Educational Research Journal, 1971
Descriptors: Cheating, College Students, Perception, Student Attitudes
Peer reviewed: Lusk, Edward J.; Wright, Haviland – Perceptual and Motor Skills, 1981
Results are presented which suggest that the learning occurring between two sections of the Group Embedded Figures Test is independent of the order in which the sections are worked. (Author/GK)
Descriptors: Comparative Analysis, Higher Education, Learning, Scores


