Publication Date
| In 2026 | 0 |
| Since 2025 | 58 |
| Since 2022 (last 5 years) | 284 |
| Since 2017 (last 10 years) | 780 |
| Since 2007 (last 20 years) | 2042 |
Descriptor
| Interrater Reliability | 3124 |
| Foreign Countries | 655 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 25 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Sorcinelli, Andrea; Shaw, Lynn; Freeman, Andrew; Cooper, Kim – Canadian Journal on Aging, 2007
Purpose: The purpose of this study was to evaluate the utility and reliability of a home hazard checklist published in Health Canada, "The Safe Living Guide: A Guide to Home Safety for Seniors" (2003). Methods: 76 community-dwelling seniors evaluated the guide, and inter-rater reliability was determined through comparison of ratings of…
Descriptors: Foreign Countries, Check Lists, Caregivers, Independent Living
Cordier, Deborah – ProQuest LLC, 2009
A renewed focus on foreign language (FL) learning and speech for communication has resulted in computer-assisted language learning (CALL) software developed with Automatic Speech Recognition (ASR). ASR features for FL pronunciation (Lafford, 2004) are functional components of CALL designs used for FL teaching and learning. The ASR features…
Descriptors: Feedback (Response), Computer Assisted Instruction, Validity, Computer Software
Garside, Sarah; Levinson, Anthony; Kuziora, Sophie; Bay, Michael; Norman, Geoffrey – Electronic Journal of e-Learning, 2009
Background: Every physician in Ontario needs to know how to fill out a Form 1 in order to legally hold a person against their will for a psychiatric assessment. These forms are frequently inaccurately filled out, which could constitute wrongful confinement and, in extreme circumstances, could lead to fines as large as $25,000. Training people to…
Descriptors: Electronic Learning, Medical Education, Medical Students, Intervention
A Multi-Component Model for Assessing Learning Objects: The Learning Object Evaluation Metric (LOEM)
Kay, Robin H.; Knaack, Liesel – Australasian Journal of Educational Technology, 2008
While discussion of the criteria needed to assess learning objects has been extensive, a formal, systematic model for evaluation has yet to be thoroughly tested. The purpose of the following study was to develop and assess a multi-component model for evaluating learning objects. The Learning Object Evaluation Metric (LOEM) was developed from a…
Descriptors: Foreign Countries, Models, Measurement Techniques, Evaluation Criteria
Carr, Edward G.; Ladd, Mara V.; Schulte, Christine F. – Journal of Positive Behavior Interventions, 2008
Problem behavior is a major barrier to successful community integration for people with developmental disabilities. Recently, there has been increased interest in identifying contextual factors involving setting events and discriminative stimuli that impact the display of problem behavior. The authors previously developed the "Contextual…
Descriptors: Behavior Problems, Intervention, Developmental Disabilities, Predictive Validity
Schaefer, Edward – Language Testing, 2008
The present study employed multi-faceted Rasch measurement (MFRM) to explore the rater bias patterns of native English-speaker (NES) raters when they rate EFL essays. Forty NES raters rated 40 essays written by female Japanese university students on a single topic adapted from the TOEFL Test of Written English (TWE). The essays were assessed using…
Descriptors: Writing Evaluation, Writing Tests, Program Effectiveness, Essays
Pennington, Rebecca E. – ProQuest LLC, 2010
This study was designed to determine whether teacher portfolios can be validly and reliably assessed, to investigate the effect of an instructional tool on increasing the level of reflective thinking in elementary preservice teachers' portfolios, and to find whether electronic portfolios designed and assessed in optimal conditions represent…
Descriptors: Portfolios (Background Materials), Control Groups, Preservice Teachers, Intervention
Evmenova, Anna S.; Graff, Heidi J.; Jerome, Marci Kinas; Behrmann, Michael M. – Learning Disabilities Research & Practice, 2010
This investigation examined the effects of currently available word prediction software programs that support phonetic/inventive spelling on the quality of journal writing by six students with severe writing and/or spelling difficulties in grades three through six during a month-long summer writing program. A changing conditions single-subject…
Descriptors: Writing Difficulties, Journal Writing, Computer Software Evaluation, Phonetics
Messana, Susan Melissa – 1984
Coding systems have become popular methods of cataloging the verbal and nonverbal interaction occurring during marital and family therapy. One such system, Pinsof's (1981) Family Therapist Coding System (FTCS), was the first designed explicitly to identify and differentiate specific verbal behaviors of family therapists independent of their…
Descriptors: Counselors, Interaction Process Analysis, Interrater Reliability, Marriage Counseling
Chartrand, Judy M.; And Others – Journal of Vocational Behavior, 1987 (peer reviewed)
Drew from the occupational prestige literature in describing the development of prestige estimates for occupations contained in the Minnesota Occupational Classification System III (MOCS III). Results provided estimates for 60 occupations and a comparison of prestige scores for eight benchmark occupations. Devised equation for estimating…
Descriptors: Career Counseling, College Students, Interrater Reliability, Occupations
Serlin, Ronald C.; Marascuilo, Leonard A. – Journal of Educational Statistics, 1983 (peer reviewed)
Two alternatives to the problems of conducting planned and post hoc comparisons in tests of concordance and discordance for G groups of judges are examined. The two models are illustrated using existing data. (Author/JKS)
Descriptors: Attitude Measures, Comparative Analysis, Interrater Reliability, Mathematical Models
Harvey, Robert J.; Hayes, Theodore L. – Personnel Psychology, 1986 (peer reviewed)
Showed that reliabilities in the .50 range can be obtained when raters rule out only 15-20% of the items on the Position Analysis Questionnaire as "Does Not Apply" and respond randomly to the remainder. (Author/ABB)
Descriptors: Interrater Reliability, Job Analysis, Monte Carlo Methods, Occupational Information
Conger, Rand D.; And Others – Journal of Marriage and the Family, 1986 (peer reviewed)
Examined the comparability of three techniques that are used to assess the dependability of family observational measures: analyses of observer agreement, reliability, and generalizability. Results indicated no single evaluative technique will always be most conservative in estimating the quality of observations. Suggests that multiple assessments…
Descriptors: Family Involvement, Generalization, Interrater Reliability, Measurement Techniques
O'Sullivan, Sean; And Others – Journal of Marital and Family Therapy, 1984 (peer reviewed)
Explores the reliability of the categories used to describe family structure in structural family therapy. Five clinicians independently rated three initial conjoint family interviews. Results are discussed in terms of their demonstration of the utility of the structural nomenclature, some conceptual problems in the structural nomenclature, and…
Descriptors: Cocounseling, Family Counseling, Family Problems, Family Structure
Morris, Woodrow W.; Boutelle, Sandra – Gerontologist, 1985 (peer reviewed)
Examines the feasibility of making multidimensional functional assessments among 22 older persons by using a questionnaire. Analysis of ratings and objective scores suggests that among relatively independent, well elderly individuals, self-administered assessment should be the mode of choice. Clinical and survey research applications are…
Descriptors: Interrater Reliability, Older Adults, Research Methodology, Scoring

