ERIC - Search Results

Showing all 10 results

Does Special Educator Effectiveness Vary Depending on the Observation Instrument Used?

Peer reviewed

Johnson, Evelyn S.; Crawford, Angela R.; Zheng, Yuzhu; Moylan, Laura A. – Educational Measurement: Issues and Practice, 2021

In this study, we compared the results of 27 special education teachers' evaluations using two different observation instruments, the Framework for Teaching (FFT), and the Explicit Instruction observation protocol of the Recognizing Effective Special Education Teachers (RESET) observation system. Results indicate differences in the rank-ordering…

Descriptors: Special Education Teachers, Teacher Evaluation, Teacher Effectiveness, Evaluation Methods

Disrupted Data: Using Longitudinal Assessment Systems to Monitor Test Score Quality

Peer reviewed

An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022

Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…

Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies

Digital Module 07: Subscores--Evaluation and Reporting https://ncme.elevate.commpartners.com

Peer reviewed

Sinharay, Sandip – Educational Measurement: Issues and Practice, 2019

Test score users often demand the reporting of subscores due to their potential diagnostic, remedial, and instructional benefits. Therefore, there is substantial pressure on testing programs to report subscores. However, professional standards require that subscores have to satisfy minimum quality standards before they can be reported. In this…

Descriptors: Testing, Scores, Item Response Theory, Evaluation Methods

A Model-Data-Fit-Informed Approach to Score Resolution in Performance Assessments

Peer reviewed

Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021

Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…

Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making

An NCME Instructional Module on Using Differential Step Functioning to Refine the Analysis of DIF in Polytomous Items

Peer reviewed

Penfield, Randall D.; Gattamorta, Karina; Childs, Ruth A. – Educational Measurement: Issues and Practice, 2009

Traditional methods for examining differential item functioning (DIF) in polytomously scored test items yield a single item-level index of DIF and thus provide no information concerning which score levels are implicated in the DIF effect. To address this limitation of DIF methodology, the framework of differential step functioning (DSF) has…

Descriptors: Test Bias, Test Items, Evaluation Methods, Scores

Validity Issues in Test Speededness

Peer reviewed

Lu, Ying; Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2007

Speededness refers to the situation where the time limits on a standardized test do not allow substantial numbers of examinees to fully consider all test items. When tests are not intended to measure speed of responding, speededness introduces a severe threat to the validity of interpretations based on test scores. In this article, we describe…

Descriptors: Test Items, Timed Tests, Standardized Tests, Test Validity

Construct-Irrelevant Variance in High-Stakes Testing

Peer reviewed

Haladyna, Thomas M.; Downing, Steven M. – Educational Measurement: Issues and Practice, 2004

There are many threats to validity in high-stakes achievement testing. One major threat is construct-irrelevant variance (CIV). This article defines CIV in the context of the contemporary, unitary view of validity and presents logical arguments, hypotheses, and documentation for a variety of CIV sources that commonly threaten interpretations of…

Descriptors: Student Evaluation, Evaluation Methods, High Stakes Tests, Construct Validity

Evaluating the Validity of Assessments: The Consequences of Use.

Peer reviewed

Linn, Robert L. – Educational Measurement: Issues and Practice, 1997

It is argued that consequential validity is a concept worth considering. The solution to defining "validity" is not to narrow the concept, but to allow for the differential prediction provided by tests in different circumstances. Consequences of the uses and interpretations of test scores are central to their evaluation. (SLD)

Descriptors: Educational Assessment, Educational Testing, Elementary Secondary Education, Evaluation Methods

A School District Perspective on Appropriate Test-Preparation Practices: A Reaction to Popham's Proposals.

Peer reviewed

Kilian, Lawrence J. – Educational Measurement: Issues and Practice, 1992

Guidelines for appropriate test preparation practices are presented to ensure that tests used in high-stakes situations generate scores that represent their domains validly. These guidelines do not rely directly on two evaluative standards proposed by W. J. Popham (1991), although they share the concern for appropriate test preparation. (SLD)

Descriptors: Educational Assessment, Elementary Secondary Education, Ethics, Evaluation Criteria

What Happened to Test Scores, and Why?

Peer reviewed

Koretz, Daniel – Educational Measurement: Issues and Practice, 1992

The documented decline in test scores of the 1960s and 1970s and the unclear picture since then result from educational and noneducational factors. Aspects of the misuse of test scores are (1) simplistic interpretation of performance trends; (2) unsupported evaluations of schooling; and (3) a reductionist view of education. (SLD)

Descriptors: Academic Achievement, Educational Assessment, Educational History, Educational Quality
