Showing all 11 results
Peer reviewed
Rupp, André A. – Applied Measurement in Education, 2018
This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…
Descriptors: Design, Automation, Scoring, Test Scoring Machines
Peer reviewed
Aydin, Utkun; Ubuz, Behiye – International Journal of Science and Mathematics Education, 2015
Two studies were conducted for the development and validation of a multidimensional test to assess undergraduate students' mathematical thinking about derivative. The first study involved two phases: question generation and refinement of the Thinking-about-Derivative Test (TDT). The second study included four phases as follows: test…
Descriptors: Undergraduate Students, Mathematics Education, Mathematical Concepts, Knowledge Level
Phillips, Gary W., Ed. – 1996
Recently, there has been a significant expansion in the use of performance assessment in large scale testing programs. Although there has been significant support from curriculum and policy stakeholders, the technical feasibility of large scale performance assessments has remained a question. This report is intended to contribute to the debate by…
Descriptors: Comparative Analysis, Generalizability Theory, Performance Based Assessment, Psychometrics
Reckase, Mark D. – 1997
This paper argues that special procedures for constructing assessment tools containing performance assessment tasks are unnecessary and that current test methodology can easily be generalized to complex performance assessment tasks without destroying the desirable characteristics of those tasks. Reasonable statistical requirements for sound…
Descriptors: Educational Assessment, Generalizability Theory, High Stakes Tests, Interrater Reliability
Espelage, Dorothy L.; Quittner, Alexandra L.; Kamps, Jodi – 1998
Generalizability theory (g-theory) was used, as an alternative to classical test theory, to evaluate measurement error in a behaviorally anchored role-play measure, highlighting the usefulness of this theory in instrument development. G-theory partitions an observed score into the universe score and error scores associated with separate sources of…
Descriptors: Behavior Patterns, Eating Disorders, Error of Measurement, Females
Warm, Ronnie; And Others – 1986
This document describes the development and assessment of a methodology for generating on-the-job-training (OJT) task proficiency assessment instruments. The Task Evaluation Form (TEF) development procedures were derived to address previously identified deficiencies in the evaluation of OJT task proficiency. The TEF development procedures allow…
Descriptors: Adults, Correlation, Data Collection, Evaluation Methods
Peer reviewed
Schilling, Stephen – Measurement: Interdisciplinary Research and Perspectives, 2007
In this article, the author echoes his co-author's and colleague's pleasure (Hill, this issue) at the thoughtfulness and far-ranging nature of the comments to their initial attempts at test validation for the mathematical knowledge for teaching (MKT) measures using the validity argument approach. Because of the large number of commentaries they…
Descriptors: Generalizability Theory, Persuasive Discourse, Educational Testing, Measurement
Secolsky, Charles, Ed.; Denison, D. Brian, Ed. – Routledge, Taylor & Francis Group, 2011
Increased demands for colleges and universities to engage in outcomes assessment for accountability purposes have accelerated the need to bridge the gap between higher education practice and the fields of measurement, assessment, and evaluation. The "Handbook on Measurement, Assessment, and Evaluation in Higher Education" provides higher…
Descriptors: Generalizability Theory, Higher Education, Institutional Advancement, Teacher Effectiveness
van Weeren, J.; Theunissen, T. J. J. M. – 1986
Pronunciation is regarded as a valuable subskill in foreign language teaching and testing. Its quality is commonly assessed in a global way by having examinees read aloud. An atomistic test is a more systematic and explicit approach. Such a test would consist of about 40 items, use recorded performances, and draw on an inventory of pronunciation…
Descriptors: Audiotape Recordings, Error Patterns, French, Generalizability Theory
Micceri, Theodore – 1984
This paper investigates the reliability of the Florida Performance Measurement Systems' Summative Observation instrument. Developed for the Florida Beginning Teacher Evaluation Program, it provides behavioral ratings for teachers in a classroom setting. Data came from ratings of videotapes of nine teachers conducting actual lessons by nine teams…
Descriptors: Analysis of Variance, Classroom Observation Techniques, Elementary Secondary Education, Evaluation Methods
Gipps, Caroline V. – 1994
The teacher assessment that is the subject of this paper is an essentially informal activity. The teacher assesses the student by posing questions, observing activities, and evaluating work in a planned or ad hoc way. The information obtained may be partial or fragmented, but repeating such assessments over time will allow the buildup of a solid…
Descriptors: Academic Achievement, Educational Assessment, Elementary Secondary Education, Evaluation Methods