Publication Date
| In 2026 | 7 |
| Since 2025 | 690 |
| Since 2022 (last 5 years) | 3191 |
| Since 2017 (last 10 years) | 7432 |
| Since 2007 (last 20 years) | 15070 |
Descriptor
| Test Reliability | 15055 |
| Test Validity | 10290 |
| Reliability | 9763 |
| Foreign Countries | 7150 |
| Test Construction | 4828 |
| Validity | 4192 |
| Measures (Individuals) | 3880 |
| Factor Analysis | 3826 |
| Psychometrics | 3532 |
| Interrater Reliability | 3126 |
| Correlation | 3040 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1329 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 253 |
| Taiwan | 234 |
| Netherlands | 224 |
| Spain | 218 |
| California | 215 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Westbrook, Bert W. – Measurement and Evaluation in Guidance, 1976
The purpose of this study was to determine whether ninth-grade pupils making appropriate vocational choices attained higher scores on measures of vocational maturity than pupils making inappropriate vocational choices. Analyses of the results showed that the group making appropriate vocational choices attained significantly higher scores than the…
Descriptors: Career Choice, High School Students, Measurement Techniques, Rating Scales
Lauer, Patricia A. – Mid-continent Research for Education and Learning (McREL), 2004
The goal of this primer is to help policymakers and other interested individuals answer three big questions: (1) What does the research say? (2) Is the research trustworthy? (3) How can the research be used to guide policy? Answering these questions will help policymakers: (1) make evidence-based decisions about education policies; (2) gain a…
Descriptors: Scientific Research, Research Methodology, Research Utilization, Educational Research
Gardner, John, Ed. – Paul Chapman Publishing, 2006
In most developed countries, the pursuit of reliable and valid means of assessing people's learning generates high volumes of published discourse and, not infrequently, dissent; the documentation on the various assessment policies, practices and theories could conceivably fill whole libraries. Some of the discourse and much of the dissent relate…
Descriptors: Learning, Student Evaluation, Formative Evaluation, Summative Evaluation
Ramlo, Susan – 2002
The Force and Motion Conceptual Evaluation (FMCE) is a multiple-choice test that has been used to evaluate physics instruction. However, the validity and reliability estimates have not been determined in a way a social scientist would expect. Few psychometric data were used to estimate the validity and reliability of the FMCE instrument. This…
Descriptors: College Students, Concept Teaching, Force, Higher Education
Howley, Caitlin; Riffle, Joy – 2002
A pilot version of a School Capacity Assessment (SCA) was developed in 2002 to assess the degree to which schools possess the potential to become high-performing learning communities. The SCA was part of AEL's School Capacity Development project. The pilot version of the SCA was intended to be administered to K-12 professional staff to assist them…
Descriptors: Educational Change, Institutional Characteristics, Low Achievement, Pilot Projects
Lee, Yong-Won; Golub-Smith, Marna; Payton, Carmen; Carey, Jill – 2001
This study investigated the validity of the current reliability estimation procedure for the Test of Spoken English (TSE), a tape-mediated semi-performance test of 12 speaking tasks, from the perspective of generalizability theory and examined the feasibility of shortening the test without compromising the psychometric quality of the test. Data…
Descriptors: Adults, English (Second Language), Estimation (Mathematics), Generalizability Theory
Bastick, Tony – 2001
The research literature on student evaluation of teaching (SET) is filled with criticisms of the process, its applications, and the student feedback questionnaire it uses. SETs are still used, however, because there has seemed to be no economical, valid, and reliable alternative. This paper reports on an alternative alignment process for…
Descriptors: College Faculty, Criteria, Higher Education, Learning
Breyer, F. Jay; Lewis, Charles – 1994
A single-administration classification reliability index is described that estimates the probability of consistently classifying examinees to mastery or nonmastery states as if those examinees had been tested with two alternate forms. The procedure is applicable to any test used for classification purposes, subdividing that test into two…
Descriptors: Classification, Cutting Scores, Objective Tests, Pass Fail Grading
Ediger, Marlow – 2001
To assure the fair and honest grading of student achievement, validity and reliability are key to writing test items. Clarity in writing each item is essential. Multiple procedures of assessing the achievement of university students should be implemented, and instructors and professors should be held accountable for the fair and honest grading of…
Descriptors: Academic Achievement, College Students, Educational Technology, Grades (Scholastic)
Piburn, Michael; Sawada, Daiyo – 2000
The Reformed Teaching Observation Protocol (RTOP) was created by the Evaluation Group of the Arizona Collaborative for Excellence in the Preparation of Teachers (ACEPT) as an observational instrument designed to measure "reformed" teaching. This document is a guide to its use. The theoretical concepts that guided the development of the…
Descriptors: Classroom Observation Techniques, Educational Change, Elementary Secondary Education, Measures (Individuals)
Mertler, Craig A. – 1999
This study examined processes and techniques teachers used to ensure that their assessments were valid and reliable, noting the extent to which they engaged in these processes. A sample of 625 elementary and secondary teachers received mailed copies of the Ohio Teacher Assessment Practices Survey, which asked about steps that they followed and the…
Descriptors: Elementary Secondary Education, Evaluation Methods, Student Evaluation, Teacher Attitudes
Olsina, L.; Rossi, G. – 1999
This paper identifies World Wide Web site characteristics and attributes and groups them in a hierarchy. The primary goal is to classify the elements that might be part of a quantitative evaluation and comparison process. In order to effectively select quality characteristics, different users' needs and behaviors are considered. Following an…
Descriptors: Classification, Comparative Analysis, Efficiency, Evaluation Criteria
Mayton, Daniel M., II; Richel, Timothy W.; Susnjic, Silvia; Majdanac, Maja – 2002
The Teenage Nonviolence Test (TNT) has previously been established as a generally reliable and valid measure of nonviolence in adolescents. This study examined the extent to which the TNT's reliability and validity could be extended to college students aged 18-22 years. Five of the six subscales of the TNT were found to be reliable. The…
Descriptors: Affective Measures, College Students, Concurrent Validity, Higher Education
Meehan, Merrill L.; Cowley, Kimberly S.; Wiersma, William; Orletsky, Sandra R.; Sattes, Beth D.; Walsh, Jackie A. – 2002
As part of its school improvement effort, AEL, a regional education laboratory, developed the Continuous School Improvement Questionnaire (AEL CSIQ). Staff from the AEL Quest schools program drafted a 65-item questionnaire to help measure and assess the efforts of the project team in their work with the 18 schools in the Quest network. These items…
Descriptors: Educational Improvement, Elementary Secondary Education, Measurement Techniques, Reliability
Wainer, Howard – 1994
This study examined the Law School Admission Test (LSAT) through the use of testlet methods to model its inherent, locally dependent structure. Precision, measured by reliability, and fairness, measured by the comparability of performance across all identified subgroups of examinees, were the focus of the study. The polytomous item response theory…
Descriptors: College Entrance Examinations, Item Response Theory, Reading Comprehension, Reading Tests


