Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 7 |
Since 2006 (last 20 years) | 18 |
Descriptor
Scores | 23 |
Test Items | 6 |
Educational Assessment | 5 |
Standardized Tests | 5 |
Item Analysis | 4 |
Mathematics Tests | 4 |
Test Construction | 4 |
Test Use | 4 |
Academic Achievement | 3 |
Achievement Tests | 3 |
Comparative Analysis | 3 |
Author
Sireci, Stephen G. | 23 |
Wells, Craig S. | 6 |
Hambleton, Ronald K. | 3 |
Han, Kyung T. | 2 |
Berberoglu, Giray | 1 |
Chulu, Bob Wajizigha | 1 |
Faulkner-Bond, Molly | 1 |
Gökçe, Semirhan | 1 |
Kachchaf, Rachel R. | 1 |
Li, Shuhong | 1 |
Lu, Ying | 1 |
Publication Type
Journal Articles | 21 |
Reports - Research | 10 |
Reports - Evaluative | 7 |
Reports - Descriptive | 3 |
Opinion Papers | 2 |
Information Analyses | 1 |
Education Level
Elementary Secondary Education | 4 |
Elementary Education | 3 |
Middle Schools | 3 |
Grade 4 | 2 |
Grade 5 | 2 |
Higher Education | 2 |
Intermediate Grades | 2 |
Early Childhood Education | 1 |
Grade 3 | 1 |
Grade 6 | 1 |
Grade 8 | 1 |
Location
Massachusetts | 2 |
Malawi | 1 |
New Jersey | 1 |
North Carolina | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
Advanced Placement… | 1 |
Graduate Management Admission… | 1 |
Preliminary Scholastic… | 1 |
Program for International… | 1 |
SAT (College Admission Test) | 1 |
Trends in International… | 1 |
What Works Clearinghouse Rating
Does not meet standards | 1 |
O'Donnell, Francis; Sireci, Stephen G. – Educational Assessment, 2022
Since the standards-based assessment practices required by the No Child Left Behind legislation took effect, almost all students in the United States have been "labeled" according to their performance on educational achievement tests. Despite their widespread use in reporting test results, research on how achievement level labels are perceived by…
Descriptors: Teacher Attitudes, Parent Attitudes, Academic Achievement, Achievement Tests
Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2020
Educational tests are standardized so that all examinees are tested on the same material, under the same testing conditions, and with the same scoring protocols. This uniformity is designed to provide a level "playing field" for all examinees so that the test is "the same" for everyone. Thus, standardization is designed to…
Descriptors: Standards, Educational Assessment, Culture Fair Tests, Scoring
Wells, Craig S.; Sireci, Stephen G. – Applied Measurement in Education, 2020
Student growth percentiles (SGPs) are currently used by several states and school districts to provide information about individual students as well as to evaluate teachers, schools, and school districts. For SGPs to be defensible for these purposes, they should be reliable. In this study, we examine the amount of systematic and random error in…
Descriptors: Growth Models, Reliability, Scores, Error Patterns
Noble, Tracy; Sireci, Stephen G.; Wells, Craig S.; Kachchaf, Rachel R.; Rosebery, Ann S.; Wang, Yang Caroline – American Educational Research Journal, 2020
In this experimental study, 20 multiple-choice test items from the Massachusetts Grade 5 science test were linguistically simplified, and original and simplified test items were administered to 310 English learners (ELs) and 1,580 non-ELs in four Massachusetts school districts. This study tested the hypothesis that specific linguistic features of…
Descriptors: Science Tests, Language Usage, English Language Learners, School Districts
Faulkner-Bond, Molly; Wolf, Mikyung Kim; Wells, Craig S.; Sireci, Stephen G. – Language Assessment Quarterly, 2018
In this study we investigated the internal factor structure of a large-scale K-12 assessment of English language proficiency (ELP) using samples of fourth- and eighth-grade English learners (ELs) in one state. While U.S. schools are mandated to measure students' ELP in four language domains (listening, reading, speaking, and writing), some ELP…
Descriptors: Factor Structure, Language Tests, Language Proficiency, Grade 4
Sireci, Stephen G. – Assessment in Education: Principles, Policy & Practice, 2016
A misconception exists that validity may refer only to the "interpretation" of test scores and not to the "uses" of those scores. The development and evolution of validity theory illustrate test score interpretation was a primary focus in the earliest days of modern testing, and that validating interpretations derived from test…
Descriptors: Test Validity, Misconceptions, Evaluation Utilization, Data Interpretation
Gökçe, Semirhan; Berberoglu, Giray; Wells, Craig S.; Sireci, Stephen G. – Journal of Psychoeducational Assessment, 2021
The 2015 Trends in International Mathematics and Science Study (TIMSS) involved 57 countries and 43 different languages to assess students' achievement in mathematics and science. The purpose of this study is to evaluate whether items and test scores are affected as the differences between language families and cultures increase. Using…
Descriptors: Language Classification, Elementary Secondary Education, Mathematics Achievement, Mathematics Tests
Sireci, Stephen G. – Journal of Educational Measurement, 2013
Kane (this issue) presents a comprehensive review of validity theory and reminds us that the focus of validation is on test score interpretations and use. In reacting to his article, I support the argument-based approach to validity and all of the major points regarding validation made by Dr. Kane. In addition, I call for a simpler, three-step…
Descriptors: Validity, Theories, Test Interpretation, Test Use
Han, Kyung T.; Wells, Craig S.; Sireci, Stephen G. – Applied Measurement in Education, 2012
Item parameter drift (IPD) occurs when item parameter values change from their original value over time. IPD may pose a serious threat to the fairness and validity of test score interpretations, especially when the goal of the assessment is to measure growth or improvement. In this study, we examined the effect of multidirectional IPD (i.e., some…
Descriptors: Item Response Theory, Test Items, Scaling, Methods
Sireci, Stephen G.; Rios, Joseph A. – Educational Research and Evaluation, 2013
There are numerous statistical procedures for detecting items that function differently across subgroups of examinees that take a test or survey. However, in endeavouring to detect items that may function differentially, selection of the statistical method is only one of many important decisions. In this article, we discuss the important decisions…
Descriptors: Effect Size, Test Bias, Item Analysis, Statistical Analysis
Chulu, Bob Wajizigha; Sireci, Stephen G. – International Journal of Testing, 2011
Many examination agencies, policy makers, media houses, and the public at large make high-stakes decisions based on test scores. Unfortunately, in some cases educational tests are not statistically equated to account for test differences over time, which leads to inappropriate interpretations of students' performance. In this study we illustrate…
Descriptors: Classification, Foreign Countries, Item Response Theory, High Stakes Tests
Hambleton, Ronald K.; Sireci, Stephen G.; Smith, Zachary R. – Applied Measurement in Education, 2009
In this study, we mapped achievement levels from the National Assessment of Educational Progress (NAEP) onto the score scales for selected assessments from the Trends in International Mathematics and Science Study (TIMSS) and the Programme for International Student Assessment (PISA). The mapping was conducted on NAEP, TIMSS, and PISA Mathematics…
Descriptors: National Competency Tests, Mathematics Achievement, Mathematics Tests, Comparative Analysis
Luecht, Richard M.; Sireci, Stephen G. – College Board, 2011
Over the past four decades, there has been incremental growth in computer-based testing (CBT) as a viable alternative to paper-and-pencil testing. However, the transition to CBT is neither easy nor inexpensive. As Drasgow, Luecht, and Bennett (2006) noted, many design engineering, test development, operations/logistics, and psychometric changes…
Descriptors: College Entrance Examinations, Computer Assisted Testing, Educational Technology, Evaluation Methods
Zenisky, April L.; Hambleton, Ronald K.; Sireci, Stephen G. – Applied Measurement in Education, 2009
How a testing agency approaches score reporting can have a significant impact on the perception of that assessment and the usefulness of the information among intended users and stakeholders. Too often, important decisions about reporting test data are left to the end of the test development cycle, but by considering the audience(s) and the kinds…
Descriptors: National Competency Tests, Scores, Test Results, Information Dissemination
Lu, Ying; Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2007
Speededness refers to the situation where the time limits on a standardized test do not allow substantial numbers of examinees to fully consider all test items. When tests are not intended to measure speed of responding, speededness introduces a severe threat to the validity of interpretations based on test scores. In this article, we describe…
Descriptors: Test Items, Timed Tests, Standardized Tests, Test Validity