Publication Date
| In 2026 | 3 |
| Since 2025 | 675 |
| Since 2022 (last 5 years) | 3176 |
| Since 2017 (last 10 years) | 7417 |
| Since 2007 (last 20 years) | 15055 |
Descriptor
| Test Reliability | 15043 |
| Test Validity | 10279 |
| Reliability | 9761 |
| Foreign Countries | 7144 |
| Test Construction | 4825 |
| Validity | 4191 |
| Measures (Individuals) | 3877 |
| Factor Analysis | 3825 |
| Psychometrics | 3526 |
| Interrater Reliability | 3124 |
| Correlation | 3040 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1328 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 253 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 217 |
| California | 215 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Sanders, Steven G. – Journal of College Science Teaching, 1975 (peer reviewed)
Several techniques to use in evaluation and grading are presented. Some grading problems are discussed briefly. (PEB)
Descriptors: Error of Measurement, Evaluation, Evaluation Methods, Grading
Kohen, Andrew I.; Breinich, Susan C. – Journal of Vocational Behavior, 1975 (peer reviewed)
The study evaluates a test of occupational information administered to a national sample of 5000 young men in 1966, as part of the National Longitudinal Surveys of employment behavior. The measurement instrument is judged to exhibit desirable characteristics in terms of internal consistency reliability, discriminatory power, and level of…
Descriptors: Career Choice, Career Counseling, Males, Measurement Instruments
Faschingbauer, Thomas R. – Journal of Consulting and Clinical Psychology, 1974 (peer reviewed)
The Faschingbauer Abbreviated Minnesota Multiphasic Personality Inventory (FAM) was developed using cluster analysis and was compared to the Minnesota Multiphasic Personality Inventory (MMPI) and other short forms. On code-type correspondence, configural classifications, profile validities, and scale elevations, the FAM compared favorably to a…
Descriptors: Comparative Analysis, Evaluation, Personality Measures, Psychological Testing
Partington, J. A. – Modern Languages, 1974
This is a discussion of the results of the Pimsleur Language Aptitude Battery used to determine the potential language-learning ability of secondary school students. (CK)
Descriptors: Aptitude, Aptitude Tests, Language Skills, Language Tests
Head, Mary K.; And Others – Educational and Psychological Measurement, 1974 (peer reviewed)
Presents details of construction and initial validation of a Likert scale for assessing attitude toward six categories of school life with particular emphasis on school lunch. (Author/RC)
Descriptors: Elementary Secondary Education, Item Analysis, Lunch Programs, Rating Scales
Haberman, Shelby J. – ETS Research Report Series, 2005
In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean-squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…
Descriptors: Scores, Test Items, Error of Measurement, Computation
Haberman, Shelby J.; Sinharay, Sadip; Puhan, Gautam – ETS Research Report Series, 2006
Recently, there has been an increasing level of interest in reporting subscores. This paper examines the issue of reporting subscores at an aggregate level, especially at the level of institutions that the examinees belong to. A series of statistical analyses is suggested to determine when subscores at the institutional level have any added value…
Descriptors: Scores, Statistical Analysis, Error of Measurement, Reliability
Bolton, Brian; Roessler, Richard – 1986
The manual introduces the Work Personality Profile (WPP), an observational work behavior rating instrument for use in situational assessment in work centers, comprehensive facilities, and employment settings. The WPP assesses such abilities as work attitudes, values, habits, and behaviors that are essential to achievement and maintenance of…
Descriptors: Adults, Behavior Rating Scales, Diagnostic Tests, Disabilities
Olivarez, Arturo, Jr.; And Others – 1990
The purposes of the present investigation were to illustrate the applicability of categorization methodology for several empirical situations and to draw implications regarding the use of such methodology in examining categorical data. In using three tasks--two designed to measure cognitive dimensions (e.g., categorizing countries and categorizing…
Descriptors: Classification, Cognitive Tests, Education Majors, Higher Education
Zwick, Rebecca – 1986
Most currently used measures of inter-rater agreement for the nominal case incorporate a correction for "chance agreement." The definition of chance agreement is not the same for all coefficients, however. Three chance-corrected coefficients are Cohen's Kappa; Scott's Pi; and the S index of Bennett, Goldstein, and Alpert, which has…
Descriptors: Error of Measurement, Interrater Reliability, Mathematical Models, Measurement Techniques
Yuker, Harold E.; Block, J. R. – 1986
The monograph provides a review of research studies over the past 25 years which have made use of the Attitude Toward Disabled Persons (ATDP) Scale. The report focuses on pertinent information about the scales, their psychometric properties, and the multitude of ways they have been used. An introductory chapter looks briefly at the history of the…
Descriptors: Attitude Change, Attitude Measures, Attitudes, Attitudes toward Disabilities
Fagot, Beverly I.; Hagan, Richard – 1985
Covert checks of observational methodology reveal declines in reliability of observations. This appears to be particularly true when complex codes are used to track social interaction. The present study was undertaken to see whether reliability could be maintained through a combination of technological advancements and the development of improved…
Descriptors: Automation, Classroom Observation Techniques, Data Collection, Reliability
David, Jane L. – 1985
Three goals must be met in order for the National Center for Education Statistics (NCES) to improve the quality and utility of its data collection: (1) the choice of what to collect must be driven by the questions of interest to decisionmakers and the public; (2) procedures must ensure validity and reliability of the data; and (3) the data must be…
Descriptors: Data Collection, Data Interpretation, Educational Research, Elementary Secondary Education
Woodruff, David J.; Sawyer, Richard L. – 1988
Two methods for estimating measures of pass-fail reliability are derived, by which both theta and kappa may be estimated from a single test administration. The methods are computationally simple, and both are based on the Spearman-Brown formula for estimating stepped-up reliability. The non-distributional…
Descriptors: Estimation (Mathematics), Licensing Examinations (Professions), Pass Fail Grading, Scores
Stelmachers, Zigfrids T.; Sherman, Robert E. – 1988
The clinical usefulness of various empirically derived suicide potential rating scales has been questioned by several suicidologists. This study used actual case histories in an attempt to anchor suicide risk ratings. Thirty-three brief case histories of suicidal patients were given to 19 experienced crisis workers for seven-point ratings of…
Descriptors: Clinical Diagnosis, Evaluation Criteria, Evaluation Methods, High Risk Persons


