Wind, Stefanie A. – Educational Measurement: Issues and Practice, 2020
Researchers have documented the impact of rater effects, or raters' tendencies to give different ratings than would be expected given examinee achievement levels, in performance assessments. However, the degree to which rater effects influence person fit, or the reasonableness of test-takers' achievement estimates given their response patterns,…
Descriptors: Performance Based Assessment, Evaluators, Achievement, Influences
Lewis, Daniel; Cook, Robert – Educational Measurement: Issues and Practice, 2020
In this paper we assert that the practice of principled assessment design renders traditional standard-setting methodology redundant at best and contradictory at worst. We describe the rationale for, and methodological details of, Embedded Standard Setting (ESS; previously Engineered Cut Scores; Lewis, 2016), an approach to establish performance…
Descriptors: Standard Setting, Evaluation, Cutting Scores, Performance Based Assessment
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Rubright, Jonathan D. – Educational Measurement: Issues and Practice, 2018
Performance assessments, scenario-based tasks, and other groups of items carry a risk of violating the local item independence assumption made by unidimensional item response theory (IRT) models. Previous studies have identified negative impacts of ignoring such violations, most notably inflated reliability estimates. Still, the influence of this…
Descriptors: Performance Based Assessment, Item Response Theory, Models, Test Reliability
Wind, Stefanie A.; Schumacker, Randall E. – Educational Measurement: Issues and Practice, 2017
The term measurement disturbance has been used to describe systematic conditions that affect a measurement process, resulting in a compromised interpretation of person or item estimates. Measurement disturbances have been discussed in relation to systematic response patterns associated with items and persons, such as start-up, plodding, boredom,…
Descriptors: Measurement, Testing Problems, Writing Tests, Performance Based Assessment
Evans, Carla M.; Lyons, Susan – Educational Measurement: Issues and Practice, 2017
The purpose of this study was to test methods that strengthen the comparability claims about annual determinations of student proficiency in English language arts, math, and science (Grades 3-12) in the New Hampshire Performance Assessment of Competency Education (NH PACE) pilot project. First, we examined the literature in order to define…
Descriptors: Academic Achievement, Language Arts, Mathematics Achievement, Science Achievement
Wolf, Mikyung Kim; Faulkner-Bond, Molly – Educational Measurement: Issues and Practice, 2016
States use standards-based English language proficiency (ELP) assessments to inform relatively high-stakes decisions for English learner (EL) students. Results from these assessments are one of the primary criteria used to determine EL students' level of ELP and readiness for reclassification. The results are also used to evaluate the…
Descriptors: High Stakes Tests, Language Proficiency, Hierarchical Linear Modeling, Scores
Crisp, Victoria – Educational Measurement: Issues and Practice, 2012
In the United Kingdom, the majority of national assessments involve human raters. The processes by which raters determine the scores to award are central to the assessment process and affect the extent to which valid inferences can be made from assessment outcomes. Thus, understanding rater cognition has become a growing area of research in the…
Descriptors: Foreign Countries, Scores, Protocol Analysis, Social Influences
Llosa, Lorena – Educational Measurement: Issues and Practice, 2008
Using an argument-based approach to validation, this study examines the quality of teacher judgments in the context of a standards-based classroom assessment of English proficiency. Using Bachman's (2005) assessment use argument (AUA) as a framework for the investigation, this paper first articulates the claims, warrants, rebuttals, and backing…
Descriptors: Protocol Analysis, Multitrait Multimethod Techniques, Validity, Scoring

Lane, Suzanne; And Others – Educational Measurement: Issues and Practice, 1996
Gender-related differential item functioning (DIF) was examined in a context in which 3,946 middle school students received mathematics instruction focusing on problem solving. Reasons why four tasks on the performance assessment favored female students and two favored male students are discussed. (SLD)
Descriptors: Item Bias, Mathematics Achievement, Mathematics Tests, Middle School Students

Baxter, Gail P.; Glaser, Robert – Educational Measurement: Issues and Practice, 1998
An analytic framework is presented for examining the properties and objectives of assessments and scoring systems piloted in a number of state and district testing programs. The article explores the application of this framework, which considers the cognitive components of competence described in studies of expertise, to the analysis of science…
Descriptors: Cognitive Style, Competence, Intellectual Disciplines, Models

Moss, Pamela A.; And Others – Educational Measurement: Issues and Practice, 1992
How portfolio-based conclusions about student learning might be used to communicate with audiences outside the classroom is explored. Procedures and criteria for investigating validity in portfolio use are discussed. Ten portfolios from an eighth grade language arts class are used as an example. (SLD)
Descriptors: Accountability, Data Interpretation, Educational Assessment, Evaluation Criteria

Cai, Jinfa – Educational Measurement: Issues and Practice, 1997
The contributions of open-ended tasks in examining students' mathematical performance were studied with 250 U.S. and 425 Chinese sixth graders. Open-ended tasks allow for analysis of student performance that cannot be assessed solely by percent correct or incorrect, but they pose many problems, such as those of translation. (SLD)
Descriptors: Cognitive Processes, Computation, Cross Cultural Studies, Elementary School Students