Showing all 11 results
Peer reviewed
Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009
In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…
Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)
Peer reviewed
Wolfe, Edward W.; Viger, Steven G.; Jarvinen, Denis W.; Linksman, Jay – Educational and Psychological Measurement, 2007
As large-scale accountability testing becomes more refined, statewide standards are being created so that teachers and students can create learning and assessment targets that are aligned with statewide testing systems. An important hurdle in assisting teachers in their efforts to create standards-aligned classroom assessments is creating feelings…
Descriptors: Testing, Professional Development, State Standards, Scores
Peer reviewed
Wolfe, Edward W. – Journal of Applied Measurement, 2003
Developed a procedure for evaluating item-level nonresponse bias in questionnaire items using logistic regression to determine whether nonresponses are random or systematic in nature for one question from the National Education Longitudinal Study of 1994 concerning drug use behaviors. Identified systematic nonresponses and the magnitude of…
Descriptors: Behavior Patterns, Drug Use, Evaluation Methods, Item Bias
Peer reviewed
Frederiksen, John R.; Sipusic, Mike; Sherin, Miriam; Wolfe, Edward W. – Educational Assessment, 1998
Developed a video portfolio technique of teacher assessment and evaluated the technique through studies of six teachers and their raters. Results show that teachers are consistent in observing teaching functions and using their observations to evaluate teaching. (SLD)
Descriptors: Evaluation Methods, Interrater Reliability, Portfolio Assessment, Teacher Evaluation
Peer reviewed
Wolfe, Edward W.; Chiu, Chris W. T. – Journal of Outcome Measurement, 1999
Describes a Rasch rating scale analysis of a multi-occasion evaluation that produces confusing results when subjected to separate calibrations. Applies a correction algorithm developed by B. Wright (1996) to show how the Wright algorithm can reduce misfit to the Rasch rating scale model and change the interpretation of change. (SLD)
Descriptors: Algorithms, Change, Evaluation Methods, Item Response Theory
Wolfe, Edward W.; Kao, Chi-Wen – 1996
This paper reports the results of an analysis of the relationship between scorer behaviors and score variability. Thirty-six essay scorers were interviewed and asked to perform a think-aloud task as they scored 24 essays. Each comment made by a scorer was coded according to its content focus (i.e., appearance, assignment, mechanics, communication,…
Descriptors: Content Analysis, Educational Assessment, Essays, Evaluation Methods
Wolfe, Edward W.; Feltovich, Brian – 1994
This paper presents a model of scorer cognition that incorporates two types of mental models: models of performance (i.e., the criteria for judging performance) and models of scoring (i.e., the procedural scripts for scoring an essay). In Study 1, six novice and five experienced scorers wrote definitions of three levels of a 6-point holistic…
Descriptors: Cognitive Processes, Criteria, Essays, Evaluation Methods
Peer reviewed
Hickey, Daniel T.; Wolfe, Edward W.; Kindfield, Ann C. H. – Educational Assessment, 2000
Developed a system for assessing students' reasoning proficiency in introductory genetics in the computer-supported environment GenScope (tm). Studied whether the assessment system helped students develop the understanding it was designed to assess. Results from 11 high school classes show strong evidential validity and limited consequential…
Descriptors: Computer Assisted Instruction, Evaluation Methods, Formative Evaluation, Genetics
Wolfe, Edward W.; Myford, Carol M.; Engelhard, George, Jr.; Manalo, Jonathan R. – College Board, 2007
In this study, we investigated a variety of Reader effects that may influence the validity of ratings assigned to AP® English Literature and Composition essays. Specifically, we investigated whether Readers exhibit changes in their levels of severity and accuracy, and their use of individual scale categories over time. We refer to changes in these…
Descriptors: Advanced Placement Programs, Essays, English Literature, Writing (Composition)
Hickey, Daniel T.; Wolfe, Edward W.; Kindfield, Ann C. H. – 1998
To evaluate student learning in a computer-supported environment known as "GenScope," a system was developed for assessing students' understanding and learning of introductory genetics material presented in two developed GenScope instruments. Both quantitative and qualitative methods were used to address traditional evidential validity…
Descriptors: Computer Assisted Testing, Curriculum Development, Educational Technology, Evaluation Methods
Wolfe, Edward W. – 1996
This paper reports the results of a large-scale portfolio pilot in which over 2,000 secondary students submitted portfolios in language arts, mathematics, and science classes. Students were asked to select work for their portfolios based on the criteria that would be used to evaluate the work. Students also reflected on how their work satisfied…
Descriptors: Criteria, Educational Assessment, Evaluation Methods, Language Arts