Publication Date
| In 2026 | 3 |
| Since 2025 | 656 |
| Since 2022 (last 5 years) | 3157 |
| Since 2017 (last 10 years) | 7398 |
| Since 2007 (last 20 years) | 15036 |
Descriptor
| Test Reliability | 15028 |
| Test Validity | 10265 |
| Reliability | 9757 |
| Foreign Countries | 7137 |
| Test Construction | 4821 |
| Validity | 4191 |
| Measures (Individuals) | 3876 |
| Factor Analysis | 3822 |
| Psychometrics | 3520 |
| Interrater Reliability | 3124 |
| Correlation | 3039 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1326 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 251 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 216 |
| California | 214 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Peer reviewed. Russon, Craig; Koehly, Laura M. – Evaluation and Program Planning, 1995
A scale was developed for measuring the persuasive impact of qualitative and quantitative evaluation reports on decision makers. Using two exploratory (n=192 graduate and undergraduate students) and two confirmatory (n=200 administrators) samples, researchers developed a 28-item Likert-type scale that demonstrated high reliability and validity.…
Descriptors: Administrators, Attention, College Students, Comprehension
Peer reviewed. Linn, Robert L.; Kiplinger, Vonda L. – Applied Measurement in Education, 1995
The adequacy of linking statewide standardized test results to the National Assessment of Educational Progress by using equipercentile equating procedures was investigated using statewide mathematics data from four states. Results suggest that the linkings are not sufficiently trustworthy to make comparisons based on the tails of the distribution.…
Descriptors: Comparative Analysis, Educational Assessment, Equated Scores, Mathematics Tests
Peer reviewed. Frisbie, David A. – Educational Measurement: Issues and Practice, 1992
Literature related to the multiple true-false (MTF) item format is reviewed. Each answer cluster of an MTF item may have several true items, and the correctness of each is judged independently. MTF tests appear efficient and reliable, although they are a bit harder than multiple-choice items for examinees. (SLD)
Descriptors: Achievement Tests, Difficulty Level, Literature Reviews, Multiple Choice Tests
Hoover, John H.; And Others – Education and Training in Mental Retardation, 1992
The development of a structured interview designed to assess leisure satisfaction in persons with mental retardation is described along with initial reliability, validity, and leisure satisfaction findings with 40 individuals with developmental disabilities. Also considered are the rationale for measuring leisure satisfaction based on quality of…
Descriptors: Adolescents, Adults, Interviews, Leisure Time
Peer reviewed. Smith, Dwight L. – Journal of Higher Education, 1992
A study analyzed validity and reliability of grades and credits earned by college students in five departments, as indicators of student learning. Results indicate positive, strong correlation between faculty-assigned grades and student performance on external criterion measures. Validity of credits was not as clear. Strong and consistent evidence…
Descriptors: Academic Achievement, College Credits, College Faculty, Comparative Analysis
Peer reviewed. Carver, Ronald P. – Educational and Psychological Measurement, 1992
Reliability and validity of a new measure of cognitive speed, the Speed of Thinking Test (SST), were investigated with 129 college students, who also completed a vocabulary test, a test of reading speed, and a test of reading comprehension. The SST appears to be a reliable and valid measure. (SLD)
Descriptors: Cognitive Ability, Cognitive Tests, College Students, Comparative Testing
Peer reviewed. Trevisan, Michael S.; And Others – Educational and Psychological Measurement, 1994
The reliabilities of 2-, 3-, 4-, and 5-choice tests were compared through an incremental-option model on a test taken by 154 high school seniors. Creating the test forms incrementally more closely approximates actual test construction. The nonsignificant differences among the option choices support the three-option item. (SLD)
Descriptors: Distractors (Tests), Estimation (Mathematics), High School Students, High Schools
Peer reviewed. Irvin, Larry K.; Walker, Hill M. – Exceptional Children, 1994
This article reviews the content and procedural requirements of social competence assessment for children with disabilities and presents information on multiperspective prototype assessments using a videodisc and a microcomputer with a "touch screen." Preliminary psychometric data on sensitivity, reliability, and construct validity are…
Descriptors: Computer Assisted Testing, Disabilities, Educational Technology, Elementary Secondary Education
Peer reviewed. Kostoff, Ronald N. – Journal of the American Society for Information Science, 1994
Describes the practice of federal evaluation of research impact through three approaches: retrospective methods; qualitative methods, including peer review; and quantitative methods. Recommended areas for study in federal research impact assessment are suggested, including predictive reliability, comparative studies, time and cost estimates,…
Descriptors: Bibliometrics, Comparative Analysis, Costs, Evaluation Methods
Peer reviewed. Collins, Angelo – Journal of Personnel Evaluation in Education, 1991
The research of the Biology component of the Teacher Assessment Project (BioTAP) of Stanford (California) University is described. BioTAP uses portfolio development as an important aspect of teacher assessment. Advantages and drawbacks of teacher portfolios are discussed, including issues of validity and reliability. (SLD)
Descriptors: Assessment Centers (Personnel), Biology, Evaluation Methods, High Schools
Peer reviewed. Alexander, Cheryl S.; And Others – Journal of Youth and Adolescence, 1990
The development and preliminary testing of a six-item scale to assess risk taking among young adolescents are described. Test construction was based on information provided by eighth graders. The measure, used in a longitudinal study of 758 eighth through tenth graders from 3 rural counties in Maryland, showed good reliability. (SLD)
Descriptors: Adolescents, Attitude Measures, Grade 8, Longitudinal Studies
Peer reviewed. Jaeger, Richard M. – Educational Measurement: Issues and Practice, 1991
Issues concerning the selection of judges for standard setting are discussed. Determining the consistency of judges' recommendations, or their congruity with other expert recommendations, would help in selection. Enough judges must be chosen to allow estimation of recommendations by an entire population of judges. (SLD)
Descriptors: Cutting Scores, Evaluation Methods, Evaluators, Examiners
Peer reviewed. Reid, Jerry B. – Educational Measurement: Issues and Practice, 1991
Training judges to generate item ratings in standard setting once the reference group has been defined is discussed. It is proposed that sensitivity to the factors that determine difficulty can be improved through training. Three criteria for determining when training is sufficient are offered. (SLD)
Descriptors: Computer Assisted Instruction, Difficulty Level, Evaluators, Interrater Reliability
Peer reviewed. Harvill, Leo M. – Educational Measurement: Issues and Practice, 1991
This paper discusses standard error of measurement (SEM), the amount of variation or spread in the measurement errors for a test, and gives information needed to interpret test scores using SEMs. SEMs at various score levels should be used in calculating score bands rather than a single SEM value. (SLD)
Descriptors: Definitions, Equations (Mathematics), Error of Measurement, Estimation (Mathematics)
Peer reviewed. Byrne, Brian; Fielding-Barnsley, Ruth – Journal of Educational Psychology, 1990
Results of 6 experiments with 109 Australian preschool children favor training in phoneme identity over segmentation as a component of initial reading instruction because it is easier to implement and its relation to alphabetic insight is stronger. Implications for the initial reading curriculum are discussed. (SLD)
Descriptors: Alphabets, Beginning Reading, Curriculum Development, Foreign Countries


