Showing all 9 results
Peer reviewed
Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024
We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners
Topczewski, Anna Marie – ProQuest LLC, 2013
Developmental score scales represent the performance of students along a continuum, where, as students learn more, they move higher along that continuum. Unidimensional item response theory (UIRT) vertical scaling has become a commonly used method to create developmental score scales. Research has shown that UIRT vertical scaling methods can be…
Descriptors: Item Response Theory, Scaling, Scores, Student Development
Peer reviewed
Gorard, Stephen; Hordosy, Rita; Siddiqui, Nadia – International Education Studies, 2013
This paper re-considers the widespread use of value-added approaches to estimate school "effects", and shows the results to be very unstable over time. The paper uses as an example the contextualised value-added scores of all secondary schools in England. The study asks how many schools with at least 99% of their pupils included in the…
Descriptors: Foreign Countries, Outcomes of Education, Secondary Education, Educational Testing
Peer reviewed
Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013
Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…
Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement
Peer reviewed
Haberman, Shelby J. – Journal of Educational and Behavioral Statistics, 2008
In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…
Descriptors: Testing Programs, Regression (Statistics), Scores, Student Evaluation
Peer reviewed
Papay, John P. – American Educational Research Journal, 2011
Recently, educational researchers and practitioners have turned to value-added models to evaluate teacher performance. Although value-added estimates depend on the assessment used to measure student achievement, the importance of outcome selection has received scant attention in the literature. Using data from a large, urban school district, I…
Descriptors: Urban Schools, Teacher Effectiveness, Reading Achievement, Achievement Tests
Hanushek, Eric A.; Rivkin, Steven G. – National Center for Analysis of Longitudinal Data in Education Research, 2010
Extensive education research on the contribution of teachers to student achievement produces two generally accepted results. First, teacher quality varies substantially as measured by the value added to student achievement or future academic attainment or earnings. Second, variables often used to determine entry into the profession and…
Descriptors: Credentials, Teacher Effectiveness, Models, Teacher Qualifications
Peer reviewed
Kluge, Annette – Applied Psychological Measurement, 2008
The use of microworlds (MWs), or complex dynamic systems, in educational testing and personnel selection is hampered by systematic measurement errors because these new and innovative item formats are not adequately controlled for their difficulty. This empirical study introduces a way to operationalize an MW's difficulty and demonstrates the…
Descriptors: Personnel Selection, Self Efficacy, Educational Testing, Computer Uses in Education
Boyd, Donald; Grossman, Pamela; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – National Center for Analysis of Longitudinal Data in Education Research, 2008
Value-added models in education research allow researchers to explore how a wide variety of policies and measured school inputs affect the academic performance of students. Researchers typically quantify the impacts of such interventions in terms of "effect sizes", i.e., the estimated effect of a one standard deviation change in the…
Descriptors: Credentials, Teacher Effectiveness, Models, Teacher Qualifications