Publication Date
| In 2026 | 3 |
| Since 2025 | 636 |
| Since 2022 (last 5 years) | 3137 |
| Since 2017 (last 10 years) | 7378 |
| Since 2007 (last 20 years) | 15016 |
Descriptor
| Test Reliability | 15015 |
| Test Validity | 10252 |
| Reliability | 9751 |
| Foreign Countries | 7126 |
| Test Construction | 4811 |
| Validity | 4189 |
| Measures (Individuals) | 3875 |
| Factor Analysis | 3821 |
| Psychometrics | 3515 |
| Interrater Reliability | 3122 |
| Correlation | 3037 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1320 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 251 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 216 |
| California | 214 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Peer reviewed: Tyson, LeaAnn; Silverman, Stephen – Journal of Personnel Evaluation in Education, 1994
Differences in the Texas Teacher Appraisal System scores of teacher subgroups over 2 years were examined for 2,366 teachers. The analysis covered scores on individual domains, sums of scores on the first four domains, and overall summary performance scores, as well as appraiser differences. Implications for teacher evaluation are discussed. (SLD)
Descriptors: Educational Assessment, Elementary Secondary Education, Evaluation Methods, Evaluators
Peer reviewed: Gross, Leon J. – Evaluation and the Health Professions, 1994
Whether adequate levels of interrater reliability could be obtained on a national, standardized examination using one examiner per observation was studied with 101 paired candidate observations on an examination for optometry. Results indicate that psychometrically sound judgments can be obtained with one examiner. (SLD)
Descriptors: Educational Assessment, Error of Measurement, Evaluation Methods, Evaluators
Peer reviewed: Goodwin, David A. J.; And Others – Psychological Assessment, 1994
Development of a parent report measure for assessing the quality of life of children with cancer is described. The Pediatric Oncology Quality of Life Scale assesses physical function and role restriction, emotional distress, and reaction to current medical treatment. Reliability and validity assessments provide preliminary support for the…
Descriptors: Cancer, Children, Emotional Problems, Evaluation Methods
Peer reviewed: Adams, Cheryll M.; Callahan, Carolyn M. – Gifted Child Quarterly, 1995
The Diet Cola Test was designed as a process assessment of science aptitude in intermediate grade students. Investigations of the instrument's reliability and validity indicated that data did not support use of the instrument for identifying individual students' aptitude. However, results suggested the test's appropriateness for evaluating…
Descriptors: Academically Gifted, Aptitude Tests, Cognitive Processes, Decision Making
Microcomputers for Information Management, 1995
Discusses the National Information Infrastructure and the role of the government. Topics include private sector investment; universal service; technological innovation; user orientation; information security and network reliability; management of the radio frequency spectrum; intellectual property rights; coordination with other levels of…
Descriptors: Access to Information, Computer Networks, Government Role, Information Networks
Peer reviewed: Wigglesworth, Gillian – Australian Review of Applied Linguistics, 1994
Multifaceted Rasch analysis was used to determine whether bias was evident in the way a group of raters graded two different versions of an oral interaction test, undertaken by the same candidates. Results indicate that certain raters consistently rated the tape version of the test more harshly while others rated the live one more harshly. (10…
Descriptors: Data Collection, Foreign Countries, Graphs, Interaction Process Analysis
Bushweller, Kevin – Executive Educator, 1995
Describes a rural Vermont K-12 school's experimentation with electronic portfolio assessment. Although electronic portfolios are clearly superior to paper portfolios in evaluating young readers, problems can arise concerning assessment reliability, missing files, student forgetfulness, passwords, and crashed systems. Teachers value this technology…
Descriptors: Computer Uses in Education, Educational Benefits, Elementary Secondary Education, Evaluation Methods
Peer reviewed: Klein, Stephen P.; And Others – Applied Measurement in Education, 1995
Portfolios are the centerpiece of Vermont's statewide assessment program in mathematics. Portfolio scores in the first two years were not reliable enough to permit the reporting of student-level results, but increasing the number of readers or the number of portfolio pieces is not operationally feasible. (SLD)
Descriptors: Educational Assessment, Elementary Secondary Education, Mathematics Tests, Performance Based Assessment
Peer reviewed: Brooke, Stephanie L. – Measurement and Evaluation in Counseling and Development, 1995
Provides evaluation of Cliffs' GRE StudyWare package (Bobrow, 1992). Discusses the educational implications of using Cliffs' approach, in addition to focusing on software considerations. Makes recommendations concerning Cliffs' method for Graduate Record Examination (GRE) preparation. (Author/LKS)
Descriptors: Achievement Tests, Computer Assisted Instruction, Computer Software Reviews, Computer Uses in Education
Peer reviewed: Einfeld, Stewart L.; Tonge, Bruce J. – Journal of Autism and Developmental Disorders, 1995
This article describes the development and validation of the Developmental Behavior Checklist for children with emotional and behavior problems along with mental retardation. The article discusses generating and refining the checklist items, results of a principal components analysis, establishing reliability and construct and criterion validity,…
Descriptors: Behavior Development, Behavior Disorders, Behavior Rating Scales, Check Lists
Peer reviewed: Russon, Craig; Koehly, Laura M. – Evaluation and Program Planning, 1995
A scale was developed for measuring the persuasive impact of qualitative and quantitative evaluation reports on decision makers. Using two exploratory (n=192 graduate and undergraduate students) and two confirmatory (n=200 administrators) samples, researchers developed a 28-item Likert-type scale that demonstrated high reliability and validity.…
Descriptors: Administrators, Attention, College Students, Comprehension
Peer reviewed: Linn, Robert L.; Kiplinger, Vonda L. – Applied Measurement in Education, 1995
The adequacy of linking statewide standardized test results to the National Assessment of Educational Progress by using equipercentile equating procedures was investigated using statewide mathematics data from four states. Results suggest that the linkings are not sufficiently trustworthy to make comparisons based on the tails of the distribution.…
Descriptors: Comparative Analysis, Educational Assessment, Equated Scores, Mathematics Tests
Peer reviewed: Frisbie, David A. – Educational Measurement: Issues and Practice, 1992
Literature related to the multiple true-false (MTF) item format is reviewed. Each answer cluster of a MTF item may have several true items and the correctness of each is judged independently. MTF tests appear efficient and reliable, although they are a bit harder than multiple choice items for examinees. (SLD)
Descriptors: Achievement Tests, Difficulty Level, Literature Reviews, Multiple Choice Tests
Hoover, John H.; And Others – Education and Training in Mental Retardation, 1992
The development of a structured interview designed to assess leisure satisfaction in persons with mental retardation is described along with initial reliability, validity, and leisure satisfaction findings with 40 individuals with developmental disabilities. Also considered are the rationale for measuring leisure satisfaction based on quality of…
Descriptors: Adolescents, Adults, Interviews, Leisure Time
Peer reviewed: Smith, Dwight L. – Journal of Higher Education, 1992
A study analyzed validity and reliability of grades and credits earned by college students in five departments, as indicators of student learning. Results indicate positive, strong correlation between faculty-assigned grades and student performance on external criterion measures. Validity of credits was not as clear. Strong and consistent evidence…
Descriptors: Academic Achievement, College Credits, College Faculty, Comparative Analysis


