Showing all 11 results
Peer reviewed
Bang Quan Zheng; Peter M. Bentler – Structural Equation Modeling: A Multidisciplinary Journal, 2025
This paper aims to advocate for a balanced approach to model fit evaluation in structural equation modeling (SEM). The ongoing debate surrounding chi-square test statistics and fit indices has been characterized by ambiguity and controversy. Despite the acknowledged limitations of relying solely on the chi-square test, its careful application can…
Descriptors: Monte Carlo Methods, Structural Equation Models, Goodness of Fit, Robustness (Statistics)
Peer reviewed
Ronfeldt, Matthew; Campbell, Shanyce L. – Educational Evaluation and Policy Analysis, 2016
Despite growing calls for more accountability of teacher education programs (TEPs), there is little consensus about how to evaluate them. This study investigates the potential for using observational ratings of program completers to evaluate TEPs. Drawing on statewide data on almost 9,500 program completers, representing 44 providers (183…
Descriptors: Teacher Education Programs, Program Effectiveness, Program Evaluation, Observation
Peer reviewed
Ho, Andrew D. – Teachers College Record, 2014
Background/Context: The target of assessment validation is not an assessment but the use of an assessment for a purpose. Although the validation literature often provides examples of assessment purposes, comprehensive reviews of these purposes are rare. Additionally, assessment purposes posed for validation are generally described as discrete and…
Descriptors: Elementary Secondary Education, Standardized Tests, Measurement Objectives, Educational Change
Peer reviewed
Rantanen, Pekka – Assessment & Evaluation in Higher Education, 2013
A multilevel analysis approach was used to analyse students' evaluation of teaching (SET). The low value of inter-rater reliability stresses that any solid conclusions on teaching cannot be made on the basis of single feedbacks. To assess a teacher's general teaching effectiveness, one needs to evaluate four randomly chosen course implementations.…
Descriptors: Test Reliability, Feedback (Response), Generalizability Theory, Student Evaluation of Teacher Performance
Peer reviewed
Lincove, Jane Arnold; Osborne, Cynthia; Dillon, Amanda; Mills, Nicholas – Journal of Teacher Education, 2014
Despite questions about validity and reliability, the use of value-added estimation methods has moved beyond academic research into state accountability systems for teachers, schools, and teacher preparation programs (TPPs). Prior studies of value-added measurement for TPPs test the validity of researcher-designed models and find that measuring…
Descriptors: Teacher Education Programs, Accountability, Politics of Education, School Statistics
Peer reviewed
Zimmer, Ron; Gill, Brian; Booker, Kevin; Lavertu, Stephane; Witte, John – Economics of Education Review, 2012
Since their inception, charter schools have been a lightning rod for controversy, with much of the debate revolving around their effectiveness in improving student achievement. Previous research has shown mixed results for student achievement; this could be the consequence of different policy environments or varying methodological approaches with…
Descriptors: Charter Schools, Academic Achievement, School Effectiveness, Educational Improvement
Peer reviewed
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format
Peer reviewed
Keeley, Jared W.; English, Taylor; Irons, Jessica; Henslee, Amber M. – Educational and Psychological Measurement, 2013
Many measurement biases affect student evaluations of instruction (SEIs). However, two have been relatively understudied: halo effects and ceiling/floor effects. This study examined these effects in two ways. To examine the halo effect, using a videotaped lecture, we manipulated specific teacher behaviors to be "good" or "bad"…
Descriptors: Robustness (Statistics), Test Bias, Course Evaluation, Student Evaluation of Teacher Performance
Peer reviewed
Wright, Robert E. – College Student Journal, 2010
The use of standardized tests for outcome assessment has grown dramatically in recent years. Two driving factors have been the No Child Left Behind legislation, and the increase in outcome assessment measures by accrediting agencies such as AACSB, the international accrediting body for business schools. Despite the growth in usage, little effort…
Descriptors: College Outcomes Assessment, Educational Testing, Standardized Tests, Accreditation (Institutions)
Peer reviewed
Erceg-Hurn, David M.; Mirosevich, Vikki M. – American Psychologist, 2008
Classic parametric statistical significance tests, such as analysis of variance and least squares regression, are widely used by researchers in many disciplines, including psychology. For classic parametric tests to produce accurate results, the assumptions underlying them (e.g., normality and homoscedasticity) must be satisfied. These assumptions…
Descriptors: Statistical Significance, Least Squares Statistics, Effect Size, Statistical Studies
Peer reviewed
Palmer, Emma J.; Hollin, Clive R. – Journal of Adolescence, 1996
Offers practitioners and researchers an overview of two inventories used in the study of antisocial and delinquent behavior. Evidence suggests that the inventories are related to behavioral indices associated with antisocial and delinquent behavior. Concludes that both instruments are robust and can yield valuable results. (RJM)
Descriptors: Adolescents, Delinquency, Evaluation Methods, Measures (Individuals)