Peer reviewed
Esarey, Justin; Valdes, Natalie – Assessment & Evaluation in Higher Education, 2020
Scholarly debate about student evaluations of teaching (SETs) often focuses on whether SETs are valid, reliable and unbiased. In this article, we assume the most optimistic conditions for SETs that are supported by the empirical literature. Specifically, we assume that SETs are moderately correlated with teaching quality (student learning and…
Descriptors: Student Evaluation of Teacher Performance, Bias, Reliability, Validity
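The misclassification risk at issue here is easy to explore by simulation. Below is a minimal sketch, not the authors' code: it assumes a moderate SET-quality correlation of 0.4, flags the bottom quintile of faculty by SET score as a personnel process might, and asks how many flagged instructors actually have above-median teaching quality.

```python
import numpy as np

rng = np.random.default_rng(0)
n_faculty, r = 10_000, 0.4  # assumed moderate SET-quality correlation

# True teaching quality and SET scores correlated at r
quality = rng.standard_normal(n_faculty)
sets = r * quality + np.sqrt(1 - r**2) * rng.standard_normal(n_faculty)

# Flag the bottom 20% by SET score, as a personnel decision might
flagged = sets <= np.quantile(sets, 0.20)

# How many flagged instructors are actually above the median in quality?
print(f"{np.mean(quality[flagged] > np.median(quality)):.1%}")
```

Even under this optimistic correlation, a large share of flagged instructors turn out to be above the median in true quality, which is the kind of decision-error rate the article examines.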
Amrein-Beardsley, Audrey; Pivovarova, Margarita; Geiger, Tray J. – Phi Delta Kappan, 2016
Being an expert involves explaining how things are supposed to work, and, perhaps more important, why things might not work as supposed. In this study, researchers surveyed scholars with expertise in value-added models (VAMs) to solicit their opinions about the uses and potential of VAMs for teacher-level accountability purposes (for example, in…
Descriptors: Value Added Models, Scholarship, Expertise, Surveys
Peer reviewed
Culpepper, Steven Andrew – Applied Psychological Measurement, 2012
Measurement error significantly biases interaction effects and distorts researchers' inferences regarding interactive hypotheses. This article focuses on the single-indicator case and shows how to accurately estimate group slope differences by disattenuating interaction effects with errors-in-variables (EIV) regression. New analytic findings were…
Descriptors: Evidence, Test Length, Interaction, Regression (Statistics)
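For readers unfamiliar with disattenuation, the single-indicator correction is compact: under classical measurement error, the OLS slope on an observed predictor equals the true slope times the predictor's reliability, so dividing by a known reliability recovers it, within groups as well as overall. A sketch under an assumed reliability of 0.8 (illustrative values, not results from the article):

```python
import numpy as np

rng = np.random.default_rng(1)
n, rel = 5_000, 0.8  # assumed known reliability of the single indicator

x_true = rng.standard_normal(n)
# Classical error scaled so var(true)/var(observed) equals rel
x_obs = x_true + rng.normal(scale=np.sqrt((1 - rel) / rel), size=n)

# Two groups whose true slopes differ (the interaction of interest)
group = rng.integers(0, 2, size=n)
y = np.where(group == 1, 1.0, 0.5) * x_true + rng.standard_normal(n)

for g in (0, 1):
    m = group == g
    b_naive = np.cov(x_obs[m], y[m])[0, 1] / np.var(x_obs[m], ddof=1)
    print(f"group {g}: naive slope {b_naive:.3f}, EIV {b_naive / rel:.3f}")
```

The naive slopes come out attenuated toward zero (roughly 0.4 and 0.8 here), while the corrected estimates recover the true 0.5 and 1.0, so the group slope difference is no longer understated.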
Peer reviewed
Schmitt, T. A.; Sass, D. A.; Sullivan, J. R.; Walker, C. M. – International Journal of Testing, 2010
Imposed time limits on computer adaptive tests (CATs) can result in examinees having difficulty completing all items, thus compromising the validity and reliability of ability estimates. In this study, the effects of speededness were explored in a simulated CAT environment by varying examinee response patterns to end-of-test items. Expectedly,…
Descriptors: Monte Carlo Methods, Simulation, Computer Assisted Testing, Adaptive Testing
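The design can be illustrated with a toy Monte Carlo that swaps the real adaptive item pool for a fixed Rasch form; the response pattern varied here is rapid guessing on end-of-test items. All parameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n_items, n_tail, theta = 40, 10, 0.5  # last 10 items hit by time pressure
b = rng.normal(size=n_items)          # Rasch item difficulties
grid = np.linspace(-4, 4, 801)

def mle(resp):
    # Grid-search maximum likelihood for the Rasch ability estimate
    p = 1 / (1 + np.exp(-(grid[:, None] - b)))
    return grid[(resp * np.log(p) + (1 - resp) * np.log(1 - p)).sum(1).argmax()]

normal, speeded = [], []
for _ in range(1_000):
    resp = (rng.random(n_items) < 1 / (1 + np.exp(-(theta - b)))).astype(float)
    normal.append(mle(resp))
    resp[-n_tail:] = rng.random(n_tail) < 0.5  # rapid guessing at the end
    speeded.append(mle(resp))

print(f"mean theta-hat, normal:  {np.mean(normal):+.3f}")
print(f"mean theta-hat, speeded: {np.mean(speeded):+.3f}")
```

Because a 0.5 guessing rate sits below this examinee's expected success rate, the speeded condition drags the ability estimates downward, the kind of distortion the study documents.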
Peer reviewed
Betebenner, Damian W.; Shang, Yi; Xiang, Yun; Zhao, Yan; Yue, Xiaohui – Journal of Educational Measurement, 2008
No Child Left Behind (NCLB) performance mandates, embedded within state accountability systems, focus school AYP (adequate yearly progress) compliance squarely on the percentage of students at or above proficient. The singular importance of this quantity for decision-making purposes has initiated extensive research into percent proficient as a…
Descriptors: Classification, Error of Measurement, Statistics, Reliability
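Why percent proficient is so sensitive to measurement error is easy to demonstrate: examinees near the cut score flip classification from one administration to the next. A minimal sketch with an illustrative cut score and standard error of measurement:

```python
import numpy as np

rng = np.random.default_rng(3)
n_students, cut, sem = 500, 0.0, 0.4  # illustrative cut score and SEM

true_scores = rng.standard_normal(n_students)
print(f"true percent proficient: {np.mean(true_scores >= cut):.1%}")

# Re-score the same cohort many times; only measurement error varies
obs = [np.mean(true_scores + rng.normal(scale=sem, size=n_students) >= cut)
       for _ in range(5_000)]
print(f"observed: mean {np.mean(obs):.1%}, sd {np.std(obs):.1%}")
```

The cohort's true proficiency never changes across replications, yet the observed statistic fluctuates by several percentage points, enough to move a school across an AYP threshold.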
Peer reviewed
Raudenbush, Stephen W.; Sadoff, Sally – Journal of Research on Educational Effectiveness, 2008
A dramatic shift in research priorities has recently produced a large number of ambitious randomized trials in K-12 education. In most cases, the aim is to improve student academic learning by improving classroom instruction. Embedded in these studies are theories about how the quality of classroom instruction must improve if these interventions are to…
Descriptors: Elementary Secondary Education, Error of Measurement, Statistical Inference, Program Evaluation
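One inference problem in such trials is that classroom instruction is observed through noisy ratings, which leaves treatment-effect estimates on instruction unbiased but costs statistical power. A quick simulation sketch (effect size, reliability, and design are illustrative assumptions, not the authors' model):

```python
import numpy as np

rng = np.random.default_rng(4)
n, effect, rel = 100, 0.4, 0.5  # classrooms, effect (SD units), rating reliability
treat = np.repeat([0, 1], n // 2)

def significant(y):
    diff = y[treat == 1].mean() - y[treat == 0].mean()
    se = np.sqrt(y[treat == 1].var(ddof=1) / (n // 2)
                 + y[treat == 0].var(ddof=1) / (n // 2))
    return abs(diff / se) > 1.96

power_true = power_rated = 0
for _ in range(2_000):
    quality = effect * treat + rng.standard_normal(n)
    rated = quality + rng.normal(scale=np.sqrt((1 - rel) / rel), size=n)
    power_true += significant(quality)
    power_rated += significant(rated)

print(f"power, instruction measured perfectly: {power_true / 2_000:.2f}")
print(f"power, reliability-0.5 ratings:        {power_rated / 2_000:.2f}")
```

Halving the reliability of the instructional measure roughly doubles its outcome variance here, cutting power substantially even though the estimator remains unbiased.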
Peer reviewed
Cornwell, John M.; Ladd, Robert T. – Educational and Psychological Measurement, 1993
Simulated data typical of those from meta-analyses are used to evaluate the reliability, Type I and Type II errors, bias, and standard error of the meta-analytic procedures of Schmidt and Hunter (1977). Concerns about power, reliability, and Type I errors are presented. (SLD)
Descriptors: Bias, Computer Simulation, Correlation, Effect Size
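For context, the core of the Schmidt and Hunter (1977) "bare-bones" procedure being evaluated is a sample-size-weighted mean correlation, with the sampling-error variance expected under the null of no true variation subtracted from the observed variance of correlations. A compact sketch with made-up study inputs:

```python
import numpy as np

# Hypothetical study-level correlations and sample sizes
r = np.array([0.25, 0.10, 0.33, 0.18, 0.41])
n = np.array([80, 200, 60, 150, 45])

r_bar = (n * r).sum() / n.sum()                 # weighted mean correlation
var_r = (n * (r - r_bar) ** 2).sum() / n.sum()  # observed variance of r
var_e = (1 - r_bar**2) ** 2 / (n.mean() - 1)    # expected sampling-error variance
var_rho = max(var_r - var_e, 0.0)               # residual "true" variance

print(f"mean r = {r_bar:.3f}, observed var = {var_r:.4f}")
print(f"sampling error = {var_e:.4f}, residual = {var_rho:.4f}")
```

The simulation concerns summarized above turn on this subtraction step: with few studies or small samples, var_e is estimated noisily, so the residual variance and the inferences built on it can be unreliable.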