Publication Date
| In 2026 | 7 |
| Since 2025 | 690 |
| Since 2022 (last 5 years) | 3191 |
| Since 2017 (last 10 years) | 7432 |
| Since 2007 (last 20 years) | 15070 |
Descriptor
| Test Reliability | 15055 |
| Test Validity | 10290 |
| Reliability | 9763 |
| Foreign Countries | 7150 |
| Test Construction | 4828 |
| Validity | 4192 |
| Measures (Individuals) | 3880 |
| Factor Analysis | 3826 |
| Psychometrics | 3532 |
| Interrater Reliability | 3126 |
| Correlation | 3040 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1329 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 253 |
| Taiwan | 234 |
| Netherlands | 224 |
| Spain | 218 |
| California | 215 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Peer reviewed. Feldt, Leonard S.; Qualls, Audrey L. – Applied Measurement in Education, 1998
Two relatively simple methods for estimating the conditional standard error of measurement (SEM) for nonlinearly derived score scales are proposed. Applications indicate that these two procedures produce fairly consistent estimates that tend to peak near the high end of the scale and reach a minimum in the middle of the raw score scale. (SLD)
Descriptors: Error of Measurement, Estimation (Mathematics), Raw Scores, Reliability
Peer reviewed. Klein, Sheryl; Magill-Evans, Joyce – Canadian Journal of Occupational Therapy, 1998
In a sample of 24 children with motor/language delays, the Pictorial Scale of Perceived Competence and Social Acceptance for Young Children (PS) and All about Me (AAM) had moderate to good reliability in measuring self-perceptions of competence. PS subscales other than cognitive competence and competence factor had lower reliability. (SK)
Descriptors: Childhood Attitudes, Competence, Self Concept, Test Reliability
Peer reviewed. Arnault, E. Jane; Gordon, Louis; Joines, Douglas H.; Phillips, G. Michael – Industrial and Labor Relations Review, 2001
Three commercial job evaluation firms rated the same set of 27 jobs. Statistical analysis indicated that evaluators differed in which job traits they used to evaluate inherent job worth. Comparable worth may thus be sensitive to the choice of evaluator. (Contains 24 references.) (Author/SK)
Descriptors: Comparable Worth, Evaluation Problems, Evaluators, Interrater Reliability
Peer reviewed. Patterson, Brian R.; And Others – Western Journal of Communication, 1996
States that conversation analysis has enjoyed recent acceptance in mainstream communication research. Points out that one criticism is that conversation analysts have not felt obligated to demonstrate "intertranscriber" reliability for the use of transcription notation. Finds that multiple transcribers are capable of producing similar…
Descriptors: College Students, Communication Research, Higher Education, Reliability
Peer reviewed. Pritchard, David A.; Livingston, Ronald B.; Reynolds, Cecil R.; Moses, James A., Jr. – School Psychology Quarterly, 2000
Presents a normative typology for classifying the Wechsler Intelligence Scale for Children-Third Edition (WISC-III) factor index profiles according to profile shape. Current analyses indicate that overall profile level accounted for a majority of the variance in WISC-III index scores, but a considerable proportion of the variance was because of…
Descriptors: Children, Classification, Profiles, Psychological Testing
Peer reviewed. Simco, Greg – Internet and Higher Education, 2001
Discussion of middleware focuses on the Internet 2 Middleware Initiative. Topics include a description of middleware as a layer of software in a distributed system; middleware characteristics, including transparency, portability, reliability, scalability, and interoperability; and the Internet 2 Middleware Initiative which is focused on research…
Descriptors: Computer Software, Computer Software Development, Reliability, Research and Development
Peer reviewed. Jones, Terry; Cason, Carolyn L.; Mancini, Mary E. – Journal of Professional Nursing, 2002
Registered nurses (n=368) participated in a skills recredentialing program in which competencies were assessed by a knowledge test and performance test under simulated conditions and evaluator ratings in actual patient-care situations. No significant differences in results between the simulated and actual conditions support the validity of the…
Descriptors: Competence, Credentials, Interrater Reliability, Nurses
Peer reviewed. Aucone, Ernest J.; Raphael, Alan J.; Golden, Charles J.; Espe-Pfeifer, Patricia; Seldon, Jen; Pospisil, Tanya; Dornheim, Liane; Proctor-Weber, Zoe; Calabria, Michael – Assessment, 1999
Assessed the interrater reliability of the revised Advanced Psychodiagnostic Interpretation (API) (A. Raphael and C. Golden, 1998) scoring system for the Bender Gestalt Test (L. Bender, 1938). Agreement across nine raters exceeded 90% for each of three clinical protocols, and kappa statistics indicated good interrater reliability. (SLD)
Descriptors: Diagnostic Tests, Interrater Reliability, Psychological Testing, Scoring
Peer reviewed. Nijhof, Wim J.; Jager, Anne – International Journal of Training and Development, 1999
Reliability and validity of scores from management seminars using a multirater feedback system were tested. Ratings of participants affected interrater reliability negatively, suggesting that formative and summative stages be held separately. (SK)
Descriptors: Feedback, Interrater Reliability, Management Development, Psychometrics
Peer reviewed. Moskal, Barbara M.; Leydens, Jon A. – Practical Assessment, Research & Evaluation, 2000
Provides clear definitions of the terms "validity" and "reliability" in the context of developing scoring rubrics and illustrates these definitions through examples. Also clarifies how validity and reliability may be addressed in the development of scoring rubrics, defined as descriptive scoring schemes developed to guide the analysis of the…
Descriptors: Grading, Reliability, Scoring Rubrics, Student Evaluation
Peer reviewed. Plake, Barbara S.; Impara, James C. – Educational Assessment, 2001
Examined the reliability and accuracy of item performance estimates from an Angoff standard setting application with 29 panelists in one year and 30 in the next year. Results provide evidence that item performance estimates were both reasonable and reliable. Discusses factors that might have influenced the results. (SLD)
Descriptors: Estimation (Mathematics), Evaluators, Performance Factors, Reliability
Peer reviewed. Henson, Robin K.; Kogan, Lori R.; Vacha-Haase, Tammi – Educational and Psychological Measurement, 2001
Studied sources of measurement error variance in the Teacher Efficacy Scale (TES) (Gibson and Dembo, 1984). Used reliability generalization to characterize the typical score reliability for the TES and potential sources of measurement error variance across 43 studies. Also examined related instruments for measurement integrity. (SLD)
Descriptors: Error of Measurement, Generalization, Meta Analysis, Psychometrics
Peer reviewed. Bronson, Margaret Rogers; Bundy, Anita C. – Occupational Therapy Journal of Research, 2001
A study compared correlation coefficients for a sample of 109 children with disabilities with those of 51 typically developing children. A positive, significant correlation was found between playfulness and environmental supportiveness. The magnitude of the relationship was greater for typically developing children than for those with…
Descriptors: Child Development, Children, Disabilities, Environment
Peer reviewed. Hansson, Bo – Journal of European Industrial Training, 2001
Three studies assessed the accuracy of individuals' self-estimates of competence, using a concept of relative competence to account for rater variations. Results indicate that self-estimates can be accurate and the individual's perception of how important specific competencies are to the performance of certain jobs should be taken into account.…
Descriptors: Competence, Interrater Reliability, Job Skills, Models
Kabel, Suzanne; De Hoog, Robert; Wielinga, Bob; Anjewierden, Anjo – Journal of Educational Multimedia and Hypermedia, 2004
In addition to the LOM standard and instructional design specifications, as well as domain specific indexing vocabularies, a structured indexing vocabulary for the more elementary learning objects is advisable in order to support retrieval tasks of developers. Furthermore, because semantic indexing is seen as a difficult task, three issues…
Descriptors: Indexing, Instructional Materials, Reliability, Vocabulary

