Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
Shin, Yongyun; Raudenbush, Stephen W. – Psychometrika, 2012
Social scientists are frequently interested in assessing the qualities of social settings such as classrooms, schools, neighborhoods, or day care centers. The most common procedure requires observers to rate social interactions within these settings on multiple items and then to combine the item responses to obtain a summary measure of setting…
Descriptors: Generalizability Theory, Neighborhoods, Intervals, Child Care Centers
Hathcoat, John D.; Penn, Jeremy D. – Research & Practice in Assessment, 2012
Critics of standardized testing have recommended replacing standardized tests with more authentic assessment measures, such as classroom assignments, projects, or portfolios rated by a panel of raters using common rubrics. Little research has examined the consistency of scores across multiple authentic assignments or the implications of this…
Descriptors: Generalizability Theory, Performance Based Assessment, Writing Across the Curriculum, Standardized Tests
Tindal, Gerald; Yovanoff, Paul; Geller, Josh P. – Journal of Special Education, 2010
Students with significant disabilities must participate in large-scale assessments, often using an alternate assessment judged against alternate achievement standards. The development and administration of this type of assessment must necessarily balance meaningful participation with accurate measurement. In this study, generalizability theory is…
Descriptors: Generalizability Theory, Alternative Assessment, Disabilities, Severe Mental Retardation
Way, Walter D.; Murphy, Daniel; Powers, Sonya; Keng, Leslie – Pearson, 2012
Significant momentum exists for next-generation assessments to increasingly utilize technology to develop and deliver performance-based assessments. Many traditional challenges with this assessment approach still apply, including psychometric concerns related to performance-based tasks (PBTs), which include low reliability, efficiency of…
Descriptors: Task Analysis, Performance Based Assessment, Technology Uses in Education, Models

Brennan, Robert L. – Applied Psychological Measurement, 2000
Reviews relevant aspects of generalizability theory related to performance assessments and discusses the role of various facets in assessing the generalizability of performance assessments. Also considers some popular estimates of reliability for performance assessments from the perspective of generalizability theory. (SLD)
Descriptors: Estimation (Mathematics), Evaluation Methods, Generalizability Theory, Performance Based Assessment

Hambleton, Ronald K. – Applied Psychological Measurement, 2000
Introduces the articles of this theme issue focusing on performance assessment methodology. Papers address: (1) merging item formats; (2) scoring models; (3) equating and linking; (4) generalizability theory; (5) standard setting methods; and (6) validity issues and methods. (SLD)
Descriptors: Equated Scores, Evaluation Methods, Generalizability Theory, Performance Based Assessment
Crehan, Kevin D. – 1997
Writing fits well within the realm of outcomes suitable for observation by performance assessments. Studies of the reliability of performance assessments have suggested that interrater reliability can be consistently high. Scoring consistency, however, is only one aspect of quality in decisions based on assessment results. Another is…
Descriptors: Evaluation Methods, Feedback, Generalizability Theory, Interrater Reliability
Oh, Deborah M.; Kim, Joshua M.; Garcia, Raymond E.; Krilowicz, Beverly L. – Advances in Physiology Education, 2005
There is increasing pressure, both from institutions central to the national scientific mission and from regional and national accrediting agencies, on natural sciences faculty to move beyond course examinations as measures of student performance and to instead develop and use reliable and valid authentic assessment measures for both individual…
Descriptors: Evaluation Methods, Biochemistry, Natural Sciences, Generalizability Theory

Quellmalz, Edys S. – Applied Measurement in Education, 1991
It is proposed that criteria for evaluating the quality of performance should be defined, at least tentatively, during the initial design of a performance assessment. Six characteristics of sound criteria are (1) significance; (2) fidelity; (3) generalizability; (4) developmental appropriateness; (5) accessibility; and (6) utility. (SLD)
Descriptors: Child Development, Cognitive Tests, Educational Assessment, Evaluation Criteria
Northwest Regional Educational Lab., Portland, OR. Center for Performance Assessment. – 1983
This annotated bibliography contains nine items addressing assessment methodology. The titles are: "Performance and Product Evaluation"; "The Critical Incident Technique"; "Constructing Achievement Tests"; "Applying the Assessment Center Method"; "Performance Assessment in Education and Training:…
Descriptors: Achievement Tests, Administrator Evaluation, Annotated Bibliographies, Assessment Centers (Personnel)

Linn, Robert L.; And Others – Educational Researcher, 1991
Increasing emphasis on assessment and concern about assessment techniques have stirred interest in alternative assessment forms, for which evidence is needed about consequences, transfer of performance on specific assessment tests, and assessment fairness. Criteria concerning consequences, fairness, transfer-generalizability, cognitive complexity,…
Descriptors: Achievement Tests, Cost Effectiveness, Educational Assessment, Educational Policy
Wise, Lauress – 1993
Industrial and organizational psychologists for the Department of Defense have been working for the past 10 years to develop high fidelity measures of job performance for use in validating job selection procedures and standards. Information on developing and scoring performance exercises in the Job Performance Measurement (JPM) Project is…
Descriptors: Educational Assessment, Educational Research, Evaluation Methods, Generalizability Theory