Showing 1 to 15 of 23 results
Peer reviewed
O'Donnell, Francis; Sireci, Stephen G. – Educational Assessment, 2022
Since the advent of the standards-based assessment practices required by the No Child Left Behind legislation, almost all students in the United States are "labeled" according to their performance on educational achievement tests. In spite of their widespread use in reporting test results, research on how achievement level labels are perceived by…
Descriptors: Teacher Attitudes, Parent Attitudes, Academic Achievement, Achievement Tests
Peer reviewed
Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2020
Educational tests are standardized so that all examinees are tested on the same material, under the same testing conditions, and with the same scoring protocols. This uniformity is designed to provide a level "playing field" for all examinees so that the test is "the same" for everyone. Thus, standardization is designed to…
Descriptors: Standards, Educational Assessment, Culture Fair Tests, Scoring
Peer reviewed
Wells, Craig S.; Sireci, Stephen G. – Applied Measurement in Education, 2020
Student growth percentiles (SGPs) are currently used by several states and school districts to provide information about individual students as well as to evaluate teachers, schools, and school districts. For SGPs to be defensible for these purposes, they should be reliable. In this study, we examine the amount of systematic and random error in…
Descriptors: Growth Models, Reliability, Scores, Error Patterns
Noble, Tracy; Sireci, Stephen G.; Wells, Craig S.; Kachchaf, Rachel R.; Rosebery, Ann S.; Wang, Yang Caroline – American Educational Research Journal, 2020
In this experimental study, 20 multiple-choice test items from the Massachusetts Grade 5 science test were linguistically simplified, and original and simplified test items were administered to 310 English learners (ELs) and 1,580 non-ELs in four Massachusetts school districts. This study tested the hypothesis that specific linguistic features of…
Descriptors: Science Tests, Language Usage, English Language Learners, School Districts
Peer reviewed
Faulkner-Bond, Molly; Wolf, Mikyung Kim; Wells, Craig S.; Sireci, Stephen G. – Language Assessment Quarterly, 2018
In this study we investigated the internal factor structure of a large-scale K-12 assessment of English language proficiency (ELP) using samples of fourth- and eighth-grade English learners (ELs) in one state. While U.S. schools are mandated to measure students' ELP in four language domains (listening, reading, speaking, and writing), some ELP…
Descriptors: Factor Structure, Language Tests, Language Proficiency, Grade 4
Peer reviewed
Sireci, Stephen G. – Assessment in Education: Principles, Policy & Practice, 2016
A misconception exists that validity may refer only to the "interpretation" of test scores and not to the "uses" of those scores. The development and evolution of validity theory illustrate that test score interpretation was a primary focus in the earliest days of modern testing, and that validating interpretations derived from test…
Descriptors: Test Validity, Misconceptions, Evaluation Utilization, Data Interpretation
Peer reviewed
Gökçe, Semirhan; Berberoglu, Giray; Wells, Craig S.; Sireci, Stephen G. – Journal of Psychoeducational Assessment, 2021
The 2015 Trends in International Mathematics and Science Study (TIMSS) involved 57 countries and 43 different languages to assess students' achievement in mathematics and science. The purpose of this study is to evaluate whether items and test scores are affected as the differences between language families and cultures increase. Using…
Descriptors: Language Classification, Elementary Secondary Education, Mathematics Achievement, Mathematics Tests
Peer reviewed
Sireci, Stephen G. – Journal of Educational Measurement, 2013
Kane (this issue) presents a comprehensive review of validity theory and reminds us that the focus of validation is on test score interpretations and use. In reacting to his article, I support the argument-based approach to validity and all of the major points regarding validation made by Dr. Kane. In addition, I call for a simpler, three-step…
Descriptors: Validity, Theories, Test Interpretation, Test Use
Peer reviewed
Han, Kyung T.; Wells, Craig S.; Sireci, Stephen G. – Applied Measurement in Education, 2012
Item parameter drift (IPD) occurs when item parameter values change from their original value over time. IPD may pose a serious threat to the fairness and validity of test score interpretations, especially when the goal of the assessment is to measure growth or improvement. In this study, we examined the effect of multidirectional IPD (i.e., some…
Descriptors: Item Response Theory, Test Items, Scaling, Methods
Peer reviewed
Sireci, Stephen G.; Rios, Joseph A. – Educational Research and Evaluation, 2013
There are numerous statistical procedures for detecting items that function differently across subgroups of examinees who take a test or survey. However, in endeavouring to detect items that may function differentially, selection of the statistical method is only one of many important decisions. In this article, we discuss the important decisions…
Descriptors: Effect Size, Test Bias, Item Analysis, Statistical Analysis
Peer reviewed
Chulu, Bob Wajizigha; Sireci, Stephen G. – International Journal of Testing, 2011
Many examination agencies, policy makers, media houses, and the public at large make high-stakes decisions based on test scores. Unfortunately, in some cases educational tests are not statistically equated to account for test differences over time, which leads to inappropriate interpretations of students' performance. In this study we illustrate…
Descriptors: Classification, Foreign Countries, Item Response Theory, High Stakes Tests
Peer reviewed
Hambleton, Ronald K.; Sireci, Stephen G.; Smith, Zachary R. – Applied Measurement in Education, 2009
In this study, we mapped achievement levels from the National Assessment of Educational Progress (NAEP) onto the score scales for selected assessments from the Trends in International Mathematics and Science Study (TIMSS) and the Programme for International Student Assessment (PISA). The mapping was conducted on NAEP, TIMSS, and PISA Mathematics…
Descriptors: National Competency Tests, Mathematics Achievement, Mathematics Tests, Comparative Analysis
Luecht, Richard M.; Sireci, Stephen G. – College Board, 2011
Over the past four decades, there has been incremental growth in computer-based testing (CBT) as a viable alternative to paper-and-pencil testing. However, the transition to CBT is neither easy nor inexpensive. As Drasgow, Luecht, and Bennett (2006) noted, many design engineering, test development, operations/logistics, and psychometric changes…
Descriptors: College Entrance Examinations, Computer Assisted Testing, Educational Technology, Evaluation Methods
Peer reviewed
Zenisky, April L.; Hambleton, Ronald K.; Sireci, Stephen G. – Applied Measurement in Education, 2009
How a testing agency approaches score reporting can have a significant impact on the perception of that assessment and the usefulness of the information among intended users and stakeholders. Too often, important decisions about reporting test data are left to the end of the test development cycle, but by considering the audience(s) and the kinds…
Descriptors: National Competency Tests, Scores, Test Results, Information Dissemination
Peer reviewed
Lu, Ying; Sireci, Stephen G. – Educational Measurement: Issues and Practice, 2007
Speededness refers to the situation where the time limits on a standardized test do not allow substantial numbers of examinees to fully consider all test items. When tests are not intended to measure speed of responding, speededness introduces a severe threat to the validity of interpretations based on test scores. In this article, we describe…
Descriptors: Test Items, Timed Tests, Standardized Tests, Test Validity