Publication Date
| In 2026 | 6 |
| Since 2025 | 934 |
| Since 2022 (last 5 years) | 4543 |
| Since 2017 (last 10 years) | 10476 |
| Since 2007 (last 20 years) | 21939 |
Descriptor
| Test Validity | 21771 |
| Validity | 13783 |
| Test Reliability | 10853 |
| Foreign Countries | 9876 |
| Test Construction | 6891 |
| Factor Analysis | 5760 |
| Measures (Individuals) | 5627 |
| Predictive Validity | 5021 |
| Psychometrics | 4813 |
| Reliability | 4634 |
| Correlation | 4375 |
Audience
| Researchers | 1169 |
| Practitioners | 629 |
| Teachers | 336 |
| Administrators | 165 |
| Policymakers | 110 |
| Counselors | 63 |
| Students | 63 |
| Parents | 15 |
| Community | 12 |
| Media Staff | 10 |
| Support Staff | 8 |
Location
| Turkey | 1395 |
| Australia | 705 |
| Canada | 626 |
| China | 528 |
| United States | 439 |
| Indonesia | 389 |
| United Kingdom | 363 |
| California | 338 |
| Germany | 338 |
| Netherlands | 335 |
| Spain | 310 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 7 |
| Meets WWC Standards with or without Reservations | 12 |
| Does not meet standards | 10 |
Peer reviewed: Forsyth, Robert A.; And Others – Applied Measurement in Education, 1992
Two criteria defined in previous research that can be used to evaluate the validity of normative data provided for customized tests are discussed. Results of an exploratory investigation of the validity of such data for about 2,500 fifth graders in a 1989 study are reported. (SLD)
Descriptors: Adaptive Testing, Elementary School Students, Evaluation Criteria, Evaluation Methods
Alessi, Stephen M.; Johnson, Lynn A. – Simulation/Games for Learning, 1992
Discussion of the use of simulations for licensure testing highlights the Dental Interactive Simulations Corporation (DISC) project that uses interactive video patient simulations for dental education and licensure. Topics addressed include reliability, validity, test administration issues, effects of fidelity on reliability and validity, and…
Descriptors: Computer Assisted Instruction, Computer Simulation, Dental Students, Dentistry
Peer reviewed: Tirre, William C.; Pena, Carmen M. – Journal of Educational Psychology, 1992
Two experiments with approximately 377 newly enlisted Air Force personnel and 182 college students investigated the validity of a reading span test combining a knowledge verification task with a word memorization task. Results support the hypothesis that word recall reflects the amount of working memory functional in reading. (SLD)
Descriptors: College Students, Comparative Testing, Higher Education, Knowledge Level
Worthen, Blaine R. – Phi Delta Kappan, 1993
Describes how alternative assessment differs from more traditional forms and outlines the forces causing the recent fascination with alternative assessment (demands for accountability, negative consequences of high-stakes testing, and increasing criticisms of standardized tests). Identifies some major issues involving alternative assessment,…
Descriptors: Accountability, Alternative Assessment, Competency Based Education, Elementary Secondary Education
Peer reviewed: Fedoruk, Genevieve M.; Norman, Charles A. – Exceptional Children, 1991
The study evaluated how 21 first grade teachers differed in preferences, requirements, and expectations of students. Teachers ranked 86 student descriptors on a continuum of contributing to either student success or failure. Teachers were found to vary considerably in descriptor rankings, suggesting that teacher variations may be a factor in the…
Descriptors: Grade 1, Individual Differences, Kindergarten, Predictive Measurement
Peer reviewed: Albanese, Mark A. – Academic Medicine, 1991
A study compared student and trained observer ratings of 15 high-rated and 15 low-rated lecturers in a multi-instructor medical course to identify distinguishing delivery characteristics. Student ratings were stable over three years; trained observers discriminated between students' highest- and lowest-rated lecturers. Voice presentation was the…
Descriptors: Faculty Evaluation, Higher Education, Interrater Reliability, Medical Education
Peer reviewed: Karras, Ray W. – OAH Magazine of History, 1991
Comments that multiple-choice tests are objective, test some knowledge, and are easy to grade, but often ask for little more than rote recall. Offers a structure for multiple-choice questions that require evaluative thinking skills as well as knowledge of the facts. Includes discussion of objectivity, preparation, and memorization. (DK)
Descriptors: Elementary Secondary Education, Evaluative Thinking, History Instruction, Memorization
Peer reviewed: McNair, Jeff; Rusch, Frank R. – Career Development for Exceptional Individuals, 1992
The Co-worker Involvement Instrument measures involvement with supported employees with disabilities, via items pertaining to physical integration, social integration, vocational integration, training, associating frequency, associating nature, befriending, advocating, evaluating, and information giving. The instrument was found to be a reliable…
Descriptors: Adults, Behavior Rating Scales, Disabilities, Employees
Hay, Tina M. – Currents, 1992
Although higher education institutions dislike rankings published in the mass media, they like the attention the rankings create and prefer to be included rather than excluded. Common criticisms of the methodology include emphasis on inappropriate criteria, unfair comparison of private and public institutions, faulty assumptions, inaccurate data,…
Descriptors: Comparative Analysis, Evaluation Criteria, Higher Education, Mass Media
Peer reviewed: Morrison, Judith A.; Shriberg, Lawrence D. – Journal of Speech and Hearing Research, 1992
Speech analyses performed on data from 61 speech-delayed children (ages 3-6) found that, in comparison to the validity of conversational speech samples for integrated speech, language, and prosodic analyses, articulation tests appear to yield neither typical nor optimal measures of speech performance. (Author/JDD)
Descriptors: Articulation (Speech), Communicative Competence (Languages), Evaluation Methods, Measures (Individuals)
Peer reviewed: Ball, Martin J.; And Others – Journal of Communication Disorders, 1991
This study investigated two pragmatic profiles (the Pragmatic Profile and the Profile of Communicative Appropriateness) used to assess the language of two aphasic patients. The study examined interscorer reliability, scoring sensitivity, and diagnostic accuracy. Findings indicate that training in scoring these profiles must be uniform, and greater…
Descriptors: Adults, Aphasia, Behavior Rating Scales, Communication Disorders
Peer reviewed: Corbeil, Giselle – Canadian Modern Language Review, 1992
A scale of processes found most constructive for second-language learning was validated with two groups of adult French students. Instruction of one group included use of the scale. Pre- and posttests of language learning and use of processes suggest that unsuccessful language learners can be taught to improve performance. (31 references)…
Descriptors: Adult Students, Classroom Communication, Classroom Techniques, Evaluation Methods
Peer reviewed: Pinchas, Tamir; Frankl, Dida – International Journal of Science Education, 1991
The course of study, target student population, structure of the matriculation examination, student outcomes, and student attitudes toward various components of this examination are described. It was found that this examination not only offers a fair, educationally valid assessment but also provides support and direction which result in more effective…
Descriptors: Admission (School), Biology, Course Descriptions, Cultural Influences
Peer reviewed: Carey, Martha Ann; Smith, Mickey W. – Evaluation and the Health Professions, 1992
The use of qualitative data in the refinement of a research program in human immunodeficiency virus studies in a military population is described. Three mechanisms of patient participation (a protocol advisor, a participant advisory panel, and focus groups) provided important feedback for adapting the research process. (SLD)
Descriptors: Acquired Immune Deficiency Syndrome, Advisory Committees, Feedback, Military Personnel
Peer reviewed: Kim, Seock-Ho; Cohen, Allan S. – Journal of Educational Measurement, 1992
Effects of the following methods for linking metrics on detection of differential item functioning (DIF) were compared: (1) test characteristic curve method (TCC); (2) weighted mean and sigma method; and (3) minimum chi-square method. With large samples, results were essentially the same. With small samples, TCC was most accurate. (SLD)
Descriptors: Chi Square, Comparative Analysis, Equations (Mathematics), Estimation (Mathematics)


