Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023
The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…
Descriptors: Item Response Theory, Standard Setting, Testing, Sampling
Arslan, Burcu; Jiang, Yang; Keehner, Madeleine; Gong, Tao; Katz, Irvin R.; Yan, Fred – Educational Measurement: Issues and Practice, 2020
Computer-based educational assessments often include items that involve drag-and-drop responses. There are different ways that drag-and-drop items can be laid out and different choices that test developers can make when designing these items. Currently, these decisions are based on experts' professional judgments and design constraints, rather…
Descriptors: Test Items, Computer Assisted Testing, Test Format, Decision Making
Welch, Catherine J.; Dunbar, Stephen B. – Educational Measurement: Issues and Practice, 2020
The use of assessment results to inform school accountability relies on the assumption that the test design appropriately represents the content and cognitive emphasis reflected in the state's standards. Since the passage of the Every Student Succeeds Act and the certification of accountability assessments through federal peer review practices,…
Descriptors: Accountability, Test Construction, State Standards, Content Validity
Abedi, Jamal; Zhang, Yu; Rowe, Susan E.; Lee, Hansol – Educational Measurement: Issues and Practice, 2020
Research indicates that the performance-gap between English Language Learners (ELLs) and their non-ELL peers is partly due to ELLs' difficulty in understanding assessment language. Accommodations have been shown to narrow this performance-gap, but many accommodations studies have not used a randomized design and are based on relatively small…
Descriptors: English Language Learners, Achievement Gap, Mathematics Tests, Standards
Madaus, George F. – Educational Measurement: Issues and Practice, 1986
This reply to William A. Mehrens argues that test validity is the central issue in discussing the appropriate role of tests. It states that the procedures used to establish the validity of tests are inadequate because they depend primarily on content validity and not on construct and criterion validity. (JAZ)
Descriptors: Concurrent Validity, Construct Validity, Cutting Scores, Decision Making
Sinharay, Sandip; Haberman, Shelby; Puhan, Gautam – Educational Measurement: Issues and Practice, 2007
There is an increasing interest in reporting subscores, both at examinee level and at aggregate levels. However, it is important to ensure reasonable subscore performance in terms of high reliability and validity to minimize incorrect instructional and remediation decisions. This article employs a statistical measure based on classical test theory…
Descriptors: Test Reliability, Test Theory, Test Validity, Statistical Analysis