Publication Date
| In 2026 | 0 |
| Since 2025 | 58 |
| Since 2022 (last 5 years) | 284 |
| Since 2017 (last 10 years) | 780 |
| Since 2007 (last 20 years) | 2042 |
Descriptor
| Interrater Reliability | 3124 |
| Foreign Countries | 655 |
| Test Reliability | 503 |
| Evaluation Methods | 502 |
| Test Validity | 410 |
| Correlation | 401 |
| Scoring | 347 |
| Comparative Analysis | 327 |
| Scores | 324 |
| Validity | 310 |
| Student Evaluation | 308 |
Audience
| Researchers | 130 |
| Practitioners | 42 |
| Teachers | 22 |
| Administrators | 11 |
| Counselors | 3 |
| Policymakers | 2 |
Location
| Australia | 56 |
| Turkey | 53 |
| United Kingdom | 46 |
| Canada | 45 |
| Netherlands | 40 |
| China | 38 |
| California | 37 |
| United States | 30 |
| United Kingdom (England) | 25 |
| Taiwan | 23 |
| Germany | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 3 |
| Meets WWC Standards with or without Reservations | 3 |
| Does not meet standards | 3 |
Aleong, Chandra – Journal of College Teaching & Learning, 2007
This paper discusses whether there are differences in performance based on differences in strategy. First, an attempt was made to determine whether the institution had a strategy, and if so, did it follow a particular model. Major models of strategy are the industry analysis approach, the resource based view or the RBV model and the more recent,…
Descriptors: Strategic Planning, Higher Education, Institutional Evaluation, Case Studies
Bartels, Meike; Boomsma, Dorret I.; Hudziak, James J.; van Beijsterveldt, Toos C. E. M.; van den Oord, Edwin J. C. G. – Psychological Methods, 2007
Genetically informative data can be used to address fundamental questions concerning the measurement of behavior in children. The authors illustrate this with longitudinal multiple-rater data on internalizing problems in twins. Valid information on the behavior of a child is obtained for behavior that multiple raters agree upon and for…
Descriptors: Twins, Behavior Problems, Genetics, Error of Measurement
Hmelo-Silver, Cindy E.; Marathe, Surabhi; Liu, Lei – Journal of the Learning Sciences, 2007
Understanding complex systems is fundamental to understanding science. The complexity of such systems makes them very difficult to understand because they are composed of multiple interrelated levels that interact in dynamic ways. The goal of this study was to understand how experts and novices differed in their understanding of two complex…
Descriptors: Ecology, Anatomy, Physiology, Knowledge Representation
Olver, Mark E.; Wong, Stephen C. P.; Nicholaichuk, Terry; Gordon, Audrey – Psychological Assessment, 2007
The Violence Risk Scale-Sexual Offender version (VRS-SO) is a rating scale designed to assess risk and predict sexual recidivism, to measure and link treatment changes to sexual recidivism, and to inform the delivery of sexual offender treatment. The VRS-SO comprises 7 static and 17 dynamic items empirically or conceptually linked to sexual…
Descriptors: Validity, Rating Scales, Recidivism, Interrater Reliability
de Villiers, Jessica; Fine, Jonathan; Ginsberg, Gary; Vaccarella, Liezanne; Szatmari, Peter – Journal of Autism and Developmental Disorders, 2007
There are few well-standardized measures of conversational breakdown in Autism Spectrum Disorders (ASD). The study's objective was to develop a scale for measuring pragmatic impairments in conversations of individuals with ASD. We analyzed 46 semi-structured conversations of children and adolescents with high-functioning ASD using a functional…
Descriptors: Measures (Individuals), Speech Communication, Semantics, Pragmatics
Guskey, Thomas R. – Educational Measurement: Issues and Practice, 2007
This study compared different stakeholders' perceived validity of various indicators of student learning used to judge the quality of students' academic performance. Data were gathered from the questionnaire responses of 314 educators in three states that have implemented comprehensive state-wide assessment programs with high-stakes consequences…
Descriptors: Academic Achievement, Educational Indicators, State Surveys, Participation
Lu, Zhihong; Hou, Leijuan; Huang, Xiaohui – International Journal of Education and Development using Information and Communication Technology, 2010
The development and application of Information and Communication Technologies (ICT) in the field of Foreign Language Teaching (FLT) have had a considerable impact on the teaching methodologies in China. With an increasing emphasis on strengthening students' learning initiative and adopting a "student-centred" teaching concept in FLT,…
Descriptors: Foreign Countries, English (Second Language), Second Language Instruction, Second Language Learning
OECD Publishing (NJ1), 2009
The Organisation for Economic Cooperation and Development's (OECD's) Programme for International Student Assessment (PISA) surveys, which take place every three years, have been designed to collect information about 15-year-old students in participating countries. PISA examines how well students are prepared to meet the challenges of the future,…
Descriptors: Policy Formation, Scaling, Academic Achievement, Interrater Reliability
Horng, Eileen Lai; Klasik, Daniel; Loeb, Susanna – National Center for Analysis of Longitudinal Data in Education Research, 2009
School principals have complex jobs. To better understand the work lives of principals, this study uses observational time-use data for all high school principals in Miami-Dade County Public Schools. This paper examines the relationship between the time principals spent on different types of activities and school outcomes including student…
Descriptors: School Effectiveness, Principals, High Schools, Time Management
Lunz, Mary E.; O'Neill, Thomas R. – 1997
This retrospective longitudinal study was designed to show grading leniency patterns of judges within and across clinical examination administrations. Data from 17 different administrations of the histology examination of the American Society of Clinical Pathologists over 10 years were studied. Over the 10 years there were 4,683 candidates and 57…
Descriptors: Higher Education, Interrater Reliability, Item Response Theory, Judges
Towers, David A.; And Others – 1987
Metaphor has recently become a topic of considerable interest and empirical investigation. Cognitive theories view metaphor as an aid to cognitive restructuring. Beyond the heuristic and theoretical concepts, the results of empirical research have suggested that metaphor is a potentially valuable tool for enhancing client involvement in counseling…
Descriptors: Cognitive Restructuring, College Students, Counseling Techniques, Definitions
Steiner, Dirk D.; Rain, Jeffrey S. – 1988
Many empirical studies have examined factors that influence ratings of performance. This study examined the rating variable performance of a single individual. Serial position of a single poor or good performance in a series of otherwise good or poor performances was manipulated to examine its effects on both ratings and recommended actions toward…
Descriptors: Behavior Patterns, College Students, Higher Education, Interrater Reliability
McLeod, P. J. – Journal of Medical Education, 1987 (peer reviewed)
A study of interrater reliability among 17 faculty members assessing medical student case reports revealed marked disparities in the criteria raters felt to be important and an unacceptable spread in the ratings given. A standardized assessment instrument is recommended instead. (MSE)
Descriptors: Higher Education, Interrater Reliability, Medical Case Histories, Medical Education
Borich, Gary; Klinzing, Garhard – Journal of Classroom Interaction, 1984 (peer reviewed)
Problems in studying teacher effectiveness through the use of classroom observation are discussed. Four assumptions in the observation of classroom process are offered and ways in which these assumptions can be dealt with in designing an observation study are suggested. (DF)
Descriptors: Classroom Observation Techniques, Error of Measurement, Experimenter Characteristics, Interrater Reliability
Northam, Elizabeth; And Others – Merrill-Palmer Quarterly, 1987 (peer reviewed)
Two studies concerned with agreement in ratings of temperament are reported. Ratings of the mothers of toddlers versus daycare workers were compared on the Toddler Temperament Scale (Study 1), and on ratings of a videotape of a 2-year-old child for responses relevant to six dimensions of temperament (Study 2). (Author/BN)
Descriptors: Affective Behavior, Behavior Rating Scales, Interrater Reliability, Mothers

