Yuan Tian; Xi Yang; Suhail A. Doi; Luis Furuya-Kanamori; Lifeng Lin; Joey S. W. Kwong; Chang Xu – Research Synthesis Methods, 2024
RobotReviewer is a tool for automatically assessing the risk of bias in randomized controlled trials, but there is limited evidence of its reliability. We evaluated the agreement between RobotReviewer and humans regarding the risk of bias assessment based on 1955 randomized controlled trials. The risk of bias in these trials was assessed via two…
Descriptors: Risk, Randomized Controlled Trials, Classification, Robotics
Albert M. Jimenez; Sally J. Zepeda – Sage Research Methods Cases, 2017
The work presented in this case study results from a study conducted in 2012-2014 examining a newly created teacher evaluation system to determine the inter-rater reliability of the classroom observation instrument. The teacher evaluation system was the result of a partnership between the school district and the university in the same city…
Descriptors: Case Studies, Interrater Reliability, Teacher Evaluation, Observation
Little, Todd D.; Chang, Rong; Gorrall, Britt K.; Waggenspack, Luke; Fukuda, Eriko; Allen, Patricia J.; Noam, Gil G. – International Journal of Behavioral Development, 2020
We revisit the merits of the retrospective pretest-posttest (RPP) design for repeated-measures research. The underutilized RPP method asks respondents to rate survey items twice during the same posttest measurement occasion from two specific frames of reference: "now" and "then." Individuals first report their current attitudes…
Descriptors: Pretesting, Alternative Assessment, Program Evaluation, Evaluation Methods
Martin, Andrew J.; Yu, Kai; Papworth, Brad; Ginns, Paul; Collie, Rebecca J. – Journal of Psychoeducational Assessment, 2015
This study explored motivation and engagement among North American (the United States and Canada; n = 1,540), U.K. (n = 1,558), Australian (n = 2,283), and Chinese (n = 3,753) secondary school students. Motivation and engagement were assessed via students' responses to the Motivation and Engagement Scale-High School (MES-HS). Confirmatory factor…
Descriptors: Foreign Countries, Motivation, Learner Engagement, Secondary School Students
Dockray, Samantha; Grant, Nina; Stone, Arthur A.; Kahneman, Daniel; Wardle, Jane; Steptoe, Andrew – Social Indicators Research, 2010
Measurement of affective states in everyday life is of fundamental importance in many types of quality of life, health, and psychological research. Ecological momentary assessment (EMA) is the recognized method of choice, but the respondent burden can be high. The day reconstruction method (DRM) was developed by Kahneman and colleagues ("Science,"…
Descriptors: Employed Women, Quality of Life, Evaluation Methods, Psychological Patterns
Erceg-Hurn, David M.; Mirosevich, Vikki M. – American Psychologist, 2008
Classic parametric statistical significance tests, such as analysis of variance and least squares regression, are widely used by researchers in many disciplines, including psychology. For classic parametric tests to produce accurate results, the assumptions underlying them (e.g., normality and homoscedasticity) must be satisfied. These assumptions…
Descriptors: Statistical Significance, Least Squares Statistics, Effect Size, Statistical Studies
Hagermoser Sanetti, Lisa M.; Kratochwill, Thomas R. – School Psychology Review, 2009
Treatment integrity (also referred to as "treatment fidelity," "intervention integrity," and "procedural reliability") is an important methodological concern in both research and practice because treatment integrity data are essential to making valid conclusions regarding treatment outcomes. Despite its relationship to validity, treatment…
Descriptors: Intervention, Research Methodology, Models, Validity
Kreiman, Jody; And Others – Journal of Speech and Hearing Research, 1992 (peer reviewed)
Sixteen listeners (10 expert, 6 naive) judged the dissimilarity of pairs of voices drawn from pathological and normal populations. Only parameters that showed substantial variability were perceptually salient across listeners. Results suggest that traditional means of assessing listener reliability in voice perception tasks may not be appropriate.…
Descriptors: Evaluation Methods, Individual Differences, Interrater Reliability, Perception
McLeod, Bryce D.; Southam-Gerow, Michael A.; Weisz, John R. – School Psychology Review, 2009
This special series focused on treatment integrity in the child mental health and education field is timely. The articles do a laudable job of reviewing (a) the current status of treatment integrity research and measurement, (b) existing conceptual models of treatment integrity, and (c) the limitations of prior research. Overall, this thoughtful…
Descriptors: Evaluation Research, Children, Intervention, Research Methodology
Taylor, Erwin K.; Griess, Thomas – Personnel Psychology, 1976 (peer reviewed)
In most selection validation research, only the upper and lower tails of the criterion distribution are used, often yielding misleading or incorrect results. Provides formulas and tables which enable the researcher to account more accurately for the distribution of criterion within the middle range of population. (Author/RW)
Descriptors: Evaluation Methods, Measurement Techniques, Predictive Validity, Reliability
Flack, Virginia F.; And Others – Psychometrika, 1988 (peer reviewed)
A method is presented for determining sample size that will achieve a pre-specified bound on confidence interval width for the interrater agreement measure "kappa." The same results can be used when a pre-specified power is desired for testing hypotheses about the value of kappa. (Author/SLD)
Descriptors: Evaluation Methods, Interrater Reliability, Research Methodology, Research Problems
Hedge, Jerry W.; Laue, Frances J. – 1988
The ability of individuals to make accurate judgments about others is examined and literature on this subject is reviewed. A wide variety of situational factors affects the appraisal of performance. It is generally accepted that the purpose of the appraisal influences the accuracy of the appraiser. The instrumentation, or tools, available to the…
Descriptors: Evaluation Criteria, Evaluation Methods, Evaluation Problems, Performance Factors
Furlong, Michael J.; Wampold, Bruce E. – Psychology in the Schools, 1981 (peer reviewed)
To guide the unbiased process of visual inference, a four-step model is presented for the assessment of reliability, intervention effect, meaningfulness, and generalizability. A Visual Inference Checklist (VIC) systematizes this assessment process. (Author)
Descriptors: Bias, Data Analysis, Evaluation Methods, Identification
Kolevzon, Michael S.; And Others – Journal of Marital and Family Therapy, 1988 (peer reviewed)
Employed triangulation strategy for assessing family interaction, involving family members, therapist, and coders independently viewing videotapes. Found weak agreement between paired assessments within family triad, and within therapist-coder dyad. Findings suggest that methodological and/or scaling strategies designed to maximize agreement may…
Descriptors: Counselor Attitudes, Evaluation Criteria, Evaluation Methods, Evaluation Problems
Carroll, Kathleen M. – Psychological Assessment, 1995 (peer reviewed)
Three types of methodological issues particularly salient in research involving the assessment of substance use or abuse are discussed with strategies for avoiding problems: (1) the reliability and validity of methods; (2) the variability and episodic course of substance use; and (3) the heterogeneity of individuals with substance use disorders.…
Descriptors: Clinical Diagnosis, Evaluation Methods, Psychological Studies, Reliability

