Showing all 10 results
Peer reviewed
Johnson, Evelyn S.; Crawford, Angela R.; Zheng, Yuzhu; Moylan, Laura A. – Educational Measurement: Issues and Practice, 2021
In this study, we compared the results of 27 special education teachers' evaluations using two different observation instruments, the Framework for Teaching (FFT), and the Explicit Instruction observation protocol of the Recognizing Effective Special Education Teachers (RESET) observation system. Results indicate differences in the rank-ordering…
Descriptors: Special Education Teachers, Teacher Evaluation, Teacher Effectiveness, Evaluation Methods
Peer reviewed
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
Peer reviewed
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2019
Test score users often demand the reporting of subscores due to their potential diagnostic, remedial, and instructional benefits. Therefore, there is substantial pressure on testing programs to report subscores. However, professional standards require that subscores have to satisfy minimum quality standards before they can be reported. In this…
Descriptors: Testing, Scores, Item Response Theory, Evaluation Methods
Peer reviewed
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Peer reviewed
Penfield, Randall D.; Gattamorta, Karina; Childs, Ruth A. – Educational Measurement: Issues and Practice, 2009
Traditional methods for examining differential item functioning (DIF) in polytomously scored test items yield a single item-level index of DIF and thus provide no information concerning which score levels are implicated in the DIF effect. To address this limitation of DIF methodology, the framework of differential step functioning (DSF) has…
Descriptors: Test Bias, Test Items, Evaluation Methods, Scores
Peer reviewed
Lu, Ying; Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2007
Speededness refers to the situation where the time limits on a standardized test do not allow substantial numbers of examinees to fully consider all test items. When tests are not intended to measure speed of responding, speededness introduces a severe threat to the validity of interpretations based on test scores. In this article, we describe…
Descriptors: Test Items, Timed Tests, Standardized Tests, Test Validity
Peer reviewed
Haladyna, Thomas M.; Downing, Steven M. – Educational Measurement: Issues and Practice, 2004
There are many threats to validity in high-stakes achievement testing. One major threat is construct-irrelevant variance (CIV). This article defines CIV in the context of the contemporary, unitary view of validity and presents logical arguments, hypotheses, and documentation for a variety of CIV sources that commonly threaten interpretations of…
Descriptors: Student Evaluation, Evaluation Methods, High Stakes Tests, Construct Validity
Peer reviewed
Linn, Robert L. – Educational Measurement: Issues and Practice, 1997
It is argued that consequential validity is a concept worth considering. The solution to defining "validity" is not to narrow the concept, but to allow for the differential prediction provided by tests in different circumstances. Consequences of the uses and interpretations of test scores are central to their evaluation. (SLD)
Descriptors: Educational Assessment, Educational Testing, Elementary Secondary Education, Evaluation Methods
Peer reviewed
Kilian, Lawrence J. – Educational Measurement: Issues and Practice, 1992
Guidelines for appropriate test preparation practices are presented to ensure that tests used in high-stakes situations generate scores that represent their domains validly. These guidelines do not rely directly on two evaluative standards proposed by W. J. Popham (1991), although they share the concern for appropriate test preparation. (SLD)
Descriptors: Educational Assessment, Elementary Secondary Education, Ethics, Evaluation Criteria
Peer reviewed
Koretz, Daniel – Educational Measurement: Issues and Practice, 1992
The documented decline in test scores of the 1960s and 1970s and the unclear picture since then result from educational and noneducational factors. Aspects of the misuse of test scores are (1) simplistic interpretation of performance trends; (2) unsupported evaluations of schooling; and (3) a reductionist view of education. (SLD)
Descriptors: Academic Achievement, Educational Assessment, Educational History, Educational Quality