Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022
Setting cut scores on multistage-adaptive tests (MSTs) is difficult, particularly when the test spans several grade levels, and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…
Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis
Lu, Ying; Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2007
Speededness refers to the situation where the time limits on a standardized test do not allow substantial numbers of examinees to fully consider all test items. When tests are not intended to measure speed of responding, speededness introduces a severe threat to the validity of interpretations based on test scores. In this article, we describe…
Descriptors: Test Items, Timed Tests, Standardized Tests, Test Validity
Sireci, Stephen G.; Hauger, Jeffrey B.; Wells, Craig S.; Shea, Christine; Zenisky, April L. – Applied Measurement in Education, 2009
The National Assessment Governing Board used a new method to set achievement level standards on the 2005 Grade 12 NAEP Math test. In this article, we summarize our independent evaluation of the process used to set these standards. The evaluation data included observations of the standard-setting meeting, observations of advisory committee meetings…
Descriptors: Advisory Committees, Mathematics Tests, Standard Setting, National Competency Tests
Martone, Andrea; Sireci, Stephen G. – Review of Educational Research, 2009
The authors (a) discuss the importance of alignment for facilitating proper assessment and instruction, (b) describe the three most common methods for evaluating the alignment between state content standards and assessments, (c) discuss the relative strengths and limitations of these methods, and (d) discuss examples of applications of each…
Descriptors: Teaching Methods, Alignment (Education), Student Evaluation, Curriculum Development
Sireci, Stephen G.; Han, Kyung T.; Wells, Craig S. – Educational Assessment, 2008
In the United States, when English language learners (ELLs) are tested, they are usually tested in English and their limited English proficiency is a potential cause of construct-irrelevant variance. When such irrelevancies affect test scores, inaccurate interpretations of ELLs' knowledge, skills, and abilities may occur. In this article, we…
Descriptors: Test Use, Educational Assessment, Psychological Testing, Validity

Sireci, Stephen G.; Geisinger, Kurt F. – Applied Psychological Measurement, 1995
An expanded version of the method of content evaluation proposed by S. G. Sireci and K. F. Geisinger (1992) was evaluated with respect to a national licensure examination and a nationally standardized social studies achievement test. Two groups of 15 subject-matter experts rated the similarity and content relevance of the items. (SLD)
Descriptors: Achievement Tests, Cluster Analysis, Construct Validity, Content Validity
Sireci, Stephen G. – 1998
Multidimensional scaling (MDS) is a versatile technique for understanding the structure of multivariate data. Recent studies have applied MDS to the problem of evaluating content validity. This paper describes the importance of evaluating test content and the logic of using MDS to analyze data gathered from subject matter experts employed in…
Descriptors: Content Validity, Evaluation Methods, Multidimensional Scaling, Research Methodology

Sireci, Stephen G. – Educational Assessment, 1998
Describes content-validity theory and illustrates new and traditional approaches for conducting content-validity studies. Newer approaches are based on multidimensional scaling analysis of item-similarity ratings, while traditional approaches are based on ratings of item-objective congruence and relevance. (Author/SLD)
Descriptors: Content Validity, Data Analysis, Evaluation Methods, Multidimensional Scaling
Sireci, Stephen G.; Bastari, B. – 1998
In many cross-cultural research studies, assessment instruments are translated or adapted for use in multiple languages. However, it cannot be assumed that different language versions of an assessment are equivalent across languages. A fundamental issue to be addressed is the comparability or equivalence of the construct measured by each language…
Descriptors: Construct Validity, Cross Cultural Studies, Evaluation Methods, Multidimensional Scaling