ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	1
Since 2006 (last 20 years)	9

Descriptor

Correlation	9
Educational Testing	9
Error of Measurement	9
Scores	7
Educational Policy	4
Effect Size	4
Longitudinal Studies	4
Achievement Gains	3
Educational Research	3
Evaluation Problems	3
Measurement	3
Predictor Variables	3
Student Evaluation	3
Teacher Effectiveness	3
Teacher Evaluation	3
Academic Achievement	2
Achievement Tests	2
Computation	2
Credentials	2
Educational Assessment	2
Evaluation Methods	2
Foreign Countries	2
Generalizability Theory	2
High Stakes Tests	2
Item Response Theory	2

Source

Journal of Educational and…	2
National Center for Analysis…	2
American Educational Research…	1
Applied Psychological…	1
Educational Assessment	1
International Education…	1
ProQuest LLC	1

Author

Boyd, Donald	2
Lankford, Hamilton	2
Loeb, Susanna	2
Wyckoff, James	2
Gorard, Stephen	1
Grossman, Pamela	1
Haberman, Shelby J.	1
Hanushek, Eric A.	1
Hordosy, Rita	1
Kluge, Annette	1
Papay, John P.	1
Rivkin, Steven G.	1
Siddiqui, Nadia	1
Stefanie A. Wind	1
Topczewski, Anna Marie	1
Yangmeng Xu	1

Publication Type

Journal Articles	6
Reports - Evaluative	5
Reports - Research	3
Dissertations/Theses -…	1
Speeches/Meeting Papers	1

Education Level

Elementary Secondary Education	3
Grade 3	2
Grade 4	2
Grade 5	2
Elementary Education	1
Grade 6	1
Grade 7	1
Grade 8	1
Secondary Education	1

Location

New York	3
California	1
Germany	1
Illinois	1
New Jersey	1
North Carolina	1
Tennessee	1
Texas	1
United Kingdom (England)	1

Assessments and Surveys

Stanford Achievement Tests

Showing all 9 results

Resolving and Re-Scoring Constructed Response Items in Mixed-Format Assessments: An Exploration of Three Approaches

Peer reviewed

Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024

We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…

Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners

Effect of Violating Unidimensional Item Response Theory Vertical Scaling Assumptions on Developmental Score Scales

Topczewski, Anna Marie – ProQuest LLC, 2013

Developmental score scales represent the performance of students along a continuum, where, as students learn more, they move higher along that continuum. Unidimensional item response theory (UIRT) vertical scaling has become a commonly used method to create developmental score scales. Research has shown that UIRT vertical scaling methods can be…

Descriptors: Item Response Theory, Scaling, Scores, Student Development

How Unstable Are "School Effects" Assessed by a Value-Added Technique?

Peer reviewed

Gorard, Stephen; Hordosy, Rita; Siddiqui, Nadia – International Education Studies, 2013

This paper re-considers the widespread use of value-added approaches to estimate school "effects", and shows the results to be very unstable over time. The paper uses as an example the contextualised value-added scores of all secondary schools in England. The study asks how many schools with at least 99% of their pupils included in the…

Descriptors: Foreign Countries, Outcomes of Education, Secondary Education, Educational Testing

Measuring Test Measurement Error: A General Approach

Peer reviewed

Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013

Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…

Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement

When Can Subscores Have Value?

Peer reviewed

Haberman, Shelby J. – Journal of Educational and Behavioral Statistics, 2008

In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…

Descriptors: Testing Programs, Regression (Statistics), Scores, Student Evaluation

Different Tests, Different Answers: The Stability of Teacher Value-Added Estimates across Outcome Measures

Peer reviewed

Papay, John P. – American Educational Research Journal, 2011

Recently, educational researchers and practitioners have turned to value-added models to evaluate teacher performance. Although value-added estimates depend on the assessment used to measure student achievement, the importance of outcome selection has received scant attention in the literature. Using data from a large, urban school district, I…

Descriptors: Urban Schools, Teacher Effectiveness, Reading Achievement, Achievement Tests

Using Value-Added Measures of Teacher Quality. Brief 9

Hanushek, Eric A.; Rivkin, Steven G. – National Center for Analysis of Longitudinal Data in Education Research, 2010

Extensive education research on the contribution of teachers to student achievement produces two generally accepted results. First, teacher quality varies substantially as measured by the value added to student achievement or future academic attainment or earnings. Second, variables often used to determine entry into the profession and…

Descriptors: Credentials, Teacher Effectiveness, Models, Teacher Qualifications

Performance Assessments with Microworlds and Their Difficulty

Peer reviewed

Kluge, Annette – Applied Psychological Measurement, 2008

The use of microworlds (MWs), or complex dynamic systems, in educational testing and personnel selection is hampered by systematic measurement errors because these new and innovative item formats are not adequately controlled for their difficulty. This empirical study introduces a way to operationalize an MW's difficulty and demonstrates the…

Descriptors: Personnel Selection, Self Efficacy, Educational Testing, Computer Uses in Education

Measuring Effect Sizes: The Effect of Measurement Error. Working Paper 19

Boyd, Donald; Grossman, Pamela; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – National Center for Analysis of Longitudinal Data in Education Research, 2008

Value-added models in education research allow researchers to explore how a wide variety of policies and measured school inputs affect the academic performance of students. Researchers typically quantify the impacts of such interventions in terms of "effect sizes", i.e., the estimated effect of a one standard deviation change in the…

Descriptors: Credentials, Teacher Effectiveness, Models, Teacher Qualifications