ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	1
Since 2006 (last 20 years)	4

Descriptor

Generalizability Theory	11
Test Construction	11
Test Validity	11
Test Reliability	7
Evaluation Methods	5
Data Collection	3
Educational Assessment	3
Psychometrics	3
Statistical Analysis	3
Teacher Evaluation	3
Test Use	3
Computer Assisted Testing	2
Decision Making	2
Design	2
Elementary Secondary Education	2
Interrater Reliability	2
Mathematics Education	2
Measurement Techniques	2
Models	2
Performance Based Assessment	2
Reliability	2
Student Evaluation	2
Testing Problems	2
Academic Accommodations…	1
Academic Achievement	1

Source

Applied Measurement in…	1
International Journal of…	1
Measurement:…	1
Routledge, Taylor & Francis…	1

Author

Aydin, Utkun	1
Denison, D. Brian, Ed.	1
Espelage, Dorothy L.	1
Gipps, Caroline V.	1
Kamps, Jodi	1
Micceri, Theodore	1
Phillips, Gary W., Ed.	1
Quittner, Alexandra L.	1
Reckase, Mark D.	1
Rupp, André A.	1
Schilling, Stephen	1
Secolsky, Charles, Ed.	1
Theunissen, T. J. J. M.	1
Ubuz, Behiye	1
Warm, Ronnie	1
van Weeren, J.	1

Publication Type

Reports - Research	4
Speeches/Meeting Papers	4
Journal Articles	3
Reports - Evaluative	3
Reports - Descriptive	2
Books	1
Collected Works - General	1
Opinion Papers	1

Education Level

Higher Education	2
Elementary Secondary Education	1
Postsecondary Education	1
Secondary Education	1
Two Year Colleges	1

Audience

Researchers

Location

California	1
United Kingdom	1

Showing all 11 results

Designing, Evaluating, and Deploying Automated Scoring Systems with Validity in Mind: Methodological Design Decisions

Peer reviewed

Rupp, André A. – Applied Measurement in Education, 2018

This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…

Descriptors: Design, Automation, Scoring, Test Scoring Machines

The Thinking-about-Derivative Test for Undergraduate Students: Development and Validation

Peer reviewed

Aydin, Utkun; Ubuz, Behiye – International Journal of Science and Mathematics Education, 2015

Two studies were conducted for the development and validation of a multidimensional test to assess undergraduate students' mathematical thinking about derivative. The first study involved two phases: question generation and refinement of the Thinking-about-Derivative Test (TDT). The second study included four phases as follows: test…

Descriptors: Undergraduate Students, Mathematics Education, Mathematical Concepts, Knowledge Level

Technical Issues in Large-Scale Performance Assessment.

Phillips, Gary W., Ed. – 1996

Recently, there has been a significant expansion in the use of performance assessment in large scale testing programs. Although there has been significant support from curriculum and policy stakeholders, the technical feasibility of large scale performance assessments has remained a question. This report is intended to contribute to the debate by…

Descriptors: Comparative Analysis, Generalizability Theory, Performance Based Assessment, Psychometrics

Statistical Test Specifications for Performance Assessments: Is This an Oxymoron?

Reckase, Mark D. – 1997

This paper argues that special procedures for constructing assessment tools containing performance assessment tasks are unnecessary and that current test methodology can easily be generalized to complex performance assessment tasks without destroying the desirable characteristics of those tasks. Reasonable statistical requirements for sound…

Descriptors: Educational Assessment, Generalizability Theory, High Stakes Tests, Interrater Reliability

An Application of Generalizability Theory to the Validation of a Behaviorally Anchored Role-Play Measure.

Espelage, Dorothy L.; Quittner, Alexandra L.; Kamps, Jodi – 1998

Generalizability theory (g-theory) was used, as an alternative to classical test theory, to evaluate measurement error in a behaviorally anchored role-play measure, highlighting the usefulness of this theory in instrument development. G-theory partitions an observed score into the universe score and error scores associated with separate sources of…

Descriptors: Behavior Patterns, Eating Disorders, Error of Measurement, Females

On-the-Job Training: Development and Assessment of a Methodology for Generating Task Proficiency Evaluation Instruments.

Warm, Ronnie; And Others – 1986

This document describes the development and assessment of a methodology for generating on-the-job-training (OJT) task proficiency assessment instruments. The Task Evaluation Form (TEF) development procedures were derived to address previously identified deficiencies in the evaluation of OJT task proficiency. The TEF development procedures allow…

Descriptors: Adults, Correlation, Data Collection, Evaluation Methods

Generalizability and Specificity of Interpretive Arguments: Observations Inspired by the Commentaries

Peer reviewed

Schilling, Stephen – Measurement: Interdisciplinary Research and Perspectives, 2007

In this article, the author echoes his co-author's and colleague's pleasure (Hill, this issue) at the thoughtfulness and far-ranging nature of the comments to their initial attempts at test validation for the mathematical knowledge for teaching (MKT) measures using the validity argument approach. Because of the large number of commentaries they…

Descriptors: Generalizability Theory, Persuasive Discourse, Educational Testing, Measurement

Handbook on Measurement, Assessment, and Evaluation in Higher Education

Secolsky, Charles, Ed.; Denison, D. Brian, Ed. – Routledge, Taylor & Francis Group, 2011

Increased demands for colleges and universities to engage in outcomes assessment for accountability purposes have accelerated the need to bridge the gap between higher education practice and the fields of measurement, assessment, and evaluation. The "Handbook on Measurement, Assessment, and Evaluation in Higher Education" provides higher…

Descriptors: Generalizability Theory, Higher Education, Institutional Advancement, Teacher Effectiveness

Testing Pronunciation: An Application of Generalizability Theory.

van Weeren, J.; Theunissen, T. J. J. M. – 1986

Pronunciation is regarded as a valuable subskill in foreign language teaching and testing. Its quality is commonly assessed in a global way by having examinees read aloud. An atomistic test is a more systematic and explicit approach. Such a test would consist of about 40 items, use recorded performances, and draw on an inventory of pronunciation…

Descriptors: Audiotape Recordings, Error Patterns, French, Generalizability Theory

Establishing the Reliability of the Florida Performance Measurement System's Research Based Observation Instrument.

Micceri, Theodore – 1984

This paper investigates the reliability of the Florida Performance Measurement System's Summative Observation instrument. Developed for the Florida Beginning Teacher Evaluation Program, it provides behavioral ratings for teachers in a classroom setting. Data came from ratings of videotapes of nine teachers conducting actual lessons by nine teams…

Descriptors: Analysis of Variance, Classroom Observation Techniques, Elementary Secondary Education, Evaluation Methods

Quality Assurance in Teachers' Assessment.

Gipps, Caroline V. – 1994

The teacher assessment that is the subject of this paper is an essentially informal activity. The teacher assesses the student by posing questions, observing activities, and evaluating work in a planned or ad hoc way. The information obtained may be partial or fragmented, but repeating such assessments over time will allow the buildup of a solid…

Descriptors: Academic Achievement, Educational Assessment, Elementary Secondary Education, Evaluation Methods