Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 0 |
Since 2006 (last 20 years) | 9 |
Descriptor
Evaluation Problems | 13 |
Evaluation Research | 13 |
Test Reliability | 13 |
Test Validity | 11 |
Evaluation Methods | 9 |
Robustness (Statistics) | 4 |
Item Analysis | 3 |
Outcome Measures | 3 |
Performance Factors | 3 |
Standardized Tests | 3 |
Student Evaluation | 3 |
Author
Anderson, Andrew | 1 |
Athanasou, James A. | 1 |
Booker, Kevin | 1 |
Bowman, Nicholas A. | 1 |
Camilli, Gregory | 1 |
Cheng, Britte H. | 1 |
Colker, Alexis M. | 1 |
DeBarger, Angela | 1 |
Fawkes, Don | 1 |
Flage, Dan | 1 |
Gill, Brian | 1 |
Publication Type
Journal Articles | 12 |
Reports - Evaluative | 9 |
Reports - Descriptive | 2 |
Reports - Research | 2 |
Education Level
Elementary Secondary Education | 5 |
Higher Education | 2 |
Postsecondary Education | 2 |
Adult Education | 1 |
Elementary Education | 1 |
High Schools | 1 |
Junior High Schools | 1 |
Middle Schools | 1 |
Secondary Education | 1 |
Audience
Administrators | 1 |
Practitioners | 1 |
Location
California | 1 |
Florida | 1 |
North Carolina | 1 |
Huang, Xiaoping; Hu, Zhongfeng – Higher Education Studies, 2015
The main problem of the educational evaluation validity is that it just copies the conceptual framework system of validity from educational measurement to its own conceptual system. The validity conceptual system that fits the need of theory and practice of educational evaluation has not been established yet. According to the inherent attributive…
Descriptors: Test Validity, Educational Assessment, Evaluation Problems, Theory Practice Relationship
Warring, Douglas F. – Universal Journal of Educational Research, 2015
This manuscript examines value added measures used in teacher evaluations. The evaluations are often based on limited observations and use student growth as measured by standardized tests. These measures typically do not use multiple measures or consider other factors in the teaching and learning process. This manuscript identifies some of the…
Descriptors: Teacher Evaluation, Use Studies, Relevance (Education), Outcome Measures
Ho, Andrew D. – Teachers College Record, 2014
Background/Context: The target of assessment validation is not an assessment but the use of an assessment for a purpose. Although the validation literature often provides examples of assessment purposes, comprehensive reviews of these purposes are rare. Additionally, assessment purposes posed for validation are generally described as discrete and…
Descriptors: Elementary Secondary Education, Standardized Tests, Measurement Objectives, Educational Change
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables
Bowman, Nicholas A. – Research & Practice in Assessment, 2013
Asking college students how much they have learned or grown is a common assessment practice in student affairs and elsewhere. Unfortunately, recent research suggests that these self-reported gains do a very poor job of measuring actual student learning and growth. This paper provides an overview of the psychological process of how students likely…
Descriptors: College Students, Student Development, Student Improvement, Achievement Gains
Zimmer, Ron; Gill, Brian; Booker, Kevin; Lavertu, Stephane; Witte, John – Economics of Education Review, 2012
Since their inception, charter schools have been a lightning rod for controversy, with much of the debate revolving around their effectiveness in improving student achievement. Previous research has shown mixed results for student achievement; this could be the consequence of different policy environments or varying methodological approaches with…
Descriptors: Charter Schools, Academic Achievement, School Effectiveness, Educational Improvement
Harris, Douglas N.; Anderson, Andrew – Carnegie Foundation for the Advancement of Teaching, 2013
There is a growing body of research on the validity and reliability of value-added measures, but most of this research has focused on elementary grades. Driven by several federal initiatives such as Race to the Top, Teacher Incentive Fund, and ESEA waivers, however, many states have incorporated value-added measures into the evaluations not only…
Descriptors: Teacher Effectiveness, Teacher Evaluation, Evaluation Methods, Evaluation Research
Mislevy, Robert J.; Haertel, Geneva; Cheng, Britte H.; Ructtinger, Liliana; DeBarger, Angela; Murray, Elizabeth; Rose, David; Gravel, Jenna; Colker, Alexis M.; Rutstein, Daisy; Vendlinski, Terry – Educational Research and Evaluation, 2013
Standardizing aspects of assessments has long been recognized as a tactic to help make evaluations of examinees fair. It reduces variation in irrelevant aspects of testing procedures that could advantage some examinees and disadvantage others. However, recent attention to making assessment accessible to a more diverse population of students…
Descriptors: Testing Accommodations, Access to Education, Testing, Psychometrics
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format

Hux, Karen; And Others – Journal of Communication Disorders, 1997
A study evaluated and compared four methods of assessing reliability on one discourse analysis procedure--a modified version of Damico's Clinical Discourse Analysis. The methods were Pearson product-moment correlations; interobserver agreement; Cohen's kappa; and generalizability coefficients. The strengths and weaknesses of the methods are…
Descriptors: Communication Disorders, Discourse Analysis, Evaluation Methods, Evaluation Problems
Athanasou, James A. – Australian Journal of Adult Learning, 2005
This paper focuses on two key aspects of self-evaluation in adult education and training through the perspective of (a) a social-cognitive framework which is used to categorise those factors that enhance self-efficacy and self-evaluation, and (b) the accuracy of self-evaluation. The social-cognitive framework categorises the factors that enhance…
Descriptors: Self Efficacy, Adult Education, Self Evaluation (Individuals), Social Cognition
Shek, Daniel T. L.; Tang, Vera M. Y.; Han, X. Y. – Research on Social Work Practice, 2005
Objective: This study examines the quality of evaluation studies using qualitative research methods in the social work literature in terms of a number of criteria commonly adopted in the field of qualitative research. Method: Using "qualitative" and "evaluation" as search terms, relevant qualitative evaluation studies from 1990 to 2003 indexed by…
Descriptors: Qualitative Research, Research Methodology, Social Work, Evaluation Research
Fawkes, Don; O'Meara, Bill; Weber, Dave; Flage, Dan – Science & Education, 2005
This paper examines the content of The California Critical Thinking Skills Test (1990). This report is not a statistical review. Instead it brings under scrutiny the content of the exam. This content will be of interest to the general reader, because the issues range from logic to ethics to pedagogy, and to questions of evidential and…
Descriptors: Critical Thinking, Thinking Skills, Test Content, Testing Problems