A Zero-One Programming Approach to Gulliksen's Matched Random Subtests Method. Research Report 86-4.
van der Linden, Wim J.; Boekkooi-Timminga, Ellen – 1986
In order to estimate the classical coefficient of test reliability, parallel measurements are needed. H. Gulliksen's matched random subtests method, which is a graphical method for splitting a test into parallel test halves, has practical relevance because it maximizes the alpha coefficient as a lower bound of the classical test reliability…
Descriptors: Algorithms, Computer Assisted Testing, Computer Software, Difficulty Level
Ferguson, Harold L.; Enger, John M. – 1985
The purpose of this study was to: (1) assess the anticipated ratings of teacher performance by principals using the Missouri Performance Based Teacher Evaluation (PBTE) prior to the first cycle of its implementation; (2) determine whether or not elementary and secondary principals, using the same instrument, would be consistent in perceived…
Descriptors: Competence, Elementary Secondary Education, Interrater Reliability, Job Performance
Speth, Carol A.; Plake, Barbara S. – 1985
While earlier, more blatant forms of sex discrimination may have declined, some researchers have suggested the existence of more subtle forms of bias, based less on gender than on gender-related attributes. The investigation of bias related to either gender or gender-related attributes requires a scale to address both the gender-relatedness of…
Descriptors: Attribution Theory, College Students, Employment Potential, Higher Education
Owston, Ronald D.; Dudley-Marling, Curt – 1986
The overall poor quality of educational software on the market suggests that educators must continue efforts to evaluate available packages and to disseminate their findings. In this paper, weaknesses in published evaluation procedures are identified, and an alternative model, the York Educational Software Evaluation Scale (YESES), is described.…
Descriptors: Computer Software, Correlation, Elementary Secondary Education, Evaluation Criteria
Bricker, Diane; Bailey, Earletta – 1983
The study examined psychometric properties of the Comprehensive Early Evaluation and Programming System (CEEPS), a criterion-referenced instrument designed for handicapped children birth to 3 years old. The instrument was intended to provide specific information to develop program objectives across a range of developmental areas and to assess…
Descriptors: Criterion Referenced Tests, Disabilities, Early Childhood Education, Evaluation Methods
McCarthy, Jean – 1987
The fundamental purposes of this study were to develop mastery tests in the cognitive and psychomotor domains for skin and scuba diving and to establish validity and reliability for the tests. A table of specifications was developed for each domain, and a pilot study refined the initial test batteries into their final form. In the main study,…
Descriptors: Cutting Scores, Higher Education, Knowledge Level, Mastery Tests
Lucas, Margaretha S.; Epperson, Douglas L. – 1986
Many studies which have investigated the differences between decided and undecided subjects have assumed homogeneity of both subsets, but results of these studies do not justify such an assumption. This study attempted to identify, multidimensionally, types of vocationally undecided college students. Data on 11 variables from 276 undecided…
Descriptors: Career Choice, Cluster Analysis, College Students, Decision Making
Atkinson, Dianne; Murray, Mary – 1987
Noting that improvement in rater reliability means eliminating differences among raters, this paper discusses ways to assess writing evaluator reliability and methods for achieving higher levels of interrater reliability. After showing that reliability can be improved two ways--by increasing the number of raters or measurements made, and by…
Descriptors: Evaluation Methods, Holistic Evaluation, Interrater Reliability, Measurement Techniques
Hogan, Thomas P.; Mishler, Carol – 1982
This literature review summarizes what is currently known about the agreement among six measures of writing skills. Three of these methods involve the application of human judgment in scoring or rating a piece of writing: holistic, analytical, and primary trait scoring. Two methods involve anatomical or taxonomic analysis of a piece of writing:…
Descriptors: Comparative Testing, Criterion Referenced Tests, Measurement Techniques, Scoring
Arndt, Stephan – 1981
The problem of change scores' correlation with initial status and the problem of low reliability in the measurement of change are addressed. By treating the correlation between initial status and change as a design problem rather than a statistical issue, research questions can be formulated in terms of changes in the shapes of growth curves…
Descriptors: Achievement Gains, Analysis of Covariance, Change, Correlation
Psychological Corp., New York, NY. – 1981
The Certificate in Data Processing (CDP) Examination conducted by the Institute for Certification of Computer Professionals (ICCP) is one of the qualifications for the CDP. The May 1981 administration tested 3,601 candidates at 149 international test sites. Half of the candidates were taking the examination for the first time and were taking all…
Descriptors: Administration, Certification, Computer Science, Data Processing
Zimmerman, Irla L.; Woo-Sam, James M. – 1982
Two kinds of WISC-R short forms, item reduction and subtest reduction, are reviewed in terms of their ability to meet these criteria of adequacy: a significant correlation between the full scale IQ and the short form IQ, a non-significant difference between the full and short form mean IQ, a low percentage of IQ classification changes resulting…
Descriptors: Intelligence Tests, Test Interpretation, Test Items, Test Reliability
Moy, Raymond – 1982
Score equating requires that the forms to be equated are functionally parallel. That is, the two test forms should rank order examinees in a similar fashion. In language proficiency testing situations, this assumption is often put into doubt because of the numerous tests that have been proposed as measures of language proficiency and the…
Descriptors: Equated Scores, Language Proficiency, Language Tests, Latent Trait Theory
Kelley, Kathryn – 1985
Self-destructiveness can be viewed in two ways: as performing an act which one knows cognitively is not conducive to one's welfare but nonetheless leads to some pleasurable affect (e.g., overeating, smoking); or not performing an act one knows one should perform but which has some negative affective consequences (e.g., dental checkups, saving…
Descriptors: Adults, Affective Behavior, Behavior Patterns, Locus of Control
Smith, John K.; Heshusius, Lous – 1985
Educational researchers have claimed that the quantitative and qualitative approaches to educational inquiry are, indeed, compatible. However, it would be unfortunate to discontinue this debate. The quantitative-qualitative debate began with the interpretive approach to social inquiry. Dilthey argued that since cultural/moral sciences differ from…
Descriptors: Educational Research, Educational Researchers, Experimenter Characteristics, Literature Reviews


