Sample Size and Item Parameter Estimation Precision When Utilizing the Masters' Partial Credit Model
Custer, Michael; Kim, Jongpil – Online Submission, 2023
This study uses an analysis of diminishing returns to examine the relationship between sample size and item parameter estimation precision when using Masters' Partial Credit Model for polytomous items. Item data from the standardization of the Battelle Developmental Inventory, 3rd Edition were used. Each item was scored with a…
Descriptors: Sample Size, Item Response Theory, Test Items, Computation
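The abstract above concerns estimation precision under Masters' Partial Credit Model. As a minimal sketch of the model itself (not the study's estimation procedure), the following simulates polytomous responses from PCM category probabilities; the item step difficulties and sample size here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def pcm_probs(theta, steps):
    """Category probabilities under Masters' Partial Credit Model.

    theta : examinee ability (scalar)
    steps : step difficulties delta_1..delta_m for one item
    Returns probabilities for the m + 1 score categories 0..m.
    """
    # Category k's logit is the cumulative sum of (theta - delta_j), j <= k;
    # category 0 contributes a sum of 0.
    cum = np.concatenate(([0.0], np.cumsum(theta - steps)))
    expc = np.exp(cum - cum.max())  # numerically stabilised softmax
    return expc / expc.sum()

# Hypothetical 3-step (4-category) item and a small examinee sample.
steps = np.array([-0.5, 0.2, 1.0])
thetas = rng.normal(size=500)

# Simulate one response per examinee.
responses = np.array([rng.choice(4, p=pcm_probs(t, steps)) for t in thetas])
print(responses[:10])
```

Repeating such simulations at increasing sample sizes, then calibrating and comparing recovered step parameters, is one way to trace the diminishing-returns curve the study describes.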
Zaidi, Nikki L.; Swoboda, Christopher M.; Kelcey, Benjamin M.; Manuel, R. Stephen – Advances in Health Sciences Education, 2017
The extant literature has largely ignored a potentially significant source of variance in multiple mini-interview (MMI) scores by "hiding" the variance attributable to the sample of attributes used on an evaluation form. This potential source of hidden variance can be defined as rating items, which typically comprise an MMI evaluation…
Descriptors: Interviews, Scores, Generalizability Theory, Monte Carlo Methods
Arendasy, Martin E.; Sommer, Markus – Intelligence, 2013
Allowing respondents to retake a cognitive ability test has been shown to increase their test scores. Several theoretical models have been proposed to explain this effect, which make distinct assumptions regarding the measurement invariance of psychometric tests across test administration sessions with regard to narrower cognitive abilities and general…
Descriptors: Cognitive Tests, Testing, Repetition, Scores
Raymond, Mark R.; Clauser, Brian E.; Furman, Gail E. – Advances in Health Sciences Education, 2010
The use of standardized patients to assess communication skills is now an essential part of assessing a physician's readiness for practice. To improve the reliability of communication scores, it has become increasingly common in recent years to use statistical models to adjust ratings provided by standardized patients. This study employed ordinary…
Descriptors: Generalizability Theory, Physicians, Patients, Least Squares Statistics
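The study above adjusts standardized-patient ratings for rater stringency with an ordinary least squares model. A generic sketch of that idea (hypothetical data, not the study's actual specification) fits examinee and rater effects by OLS and subtracts each rater's estimated effect from the observed rating:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 200 examinees, each rated twice by raters drawn from 10.
n_exam, n_raters = 200, 10
ability = rng.normal(size=n_exam)
severity = rng.normal(scale=0.5, size=n_raters)  # positive = stringent

exam = np.repeat(np.arange(n_exam), 2)
rater = rng.integers(n_raters, size=exam.size)
obs = ability[exam] - severity[rater] + rng.normal(scale=0.3, size=exam.size)

# OLS with examinee and rater dummies: y = alpha_examinee + beta_rater + e.
# Rater 0 is the reference category, so rater effects are identified.
X = np.zeros((obs.size, n_exam + n_raters - 1))
X[np.arange(obs.size), exam] = 1.0
mask = rater > 0
X[np.where(mask)[0], n_exam + rater[mask] - 1] = 1.0
coef, *_ = np.linalg.lstsq(X, obs, rcond=None)

# Adjusted rating: remove each rater's estimated effect (0 for the reference).
rater_eff = np.concatenate(([0.0], coef[n_exam:]))
adjusted = obs - rater_eff[rater]
```

In practice such adjustments only help to the extent the rater-effect estimates stay stable over time, which is exactly the long-term usefulness question the study examines.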
Jeon, Min-Jeong; Lee, Guemin; Hwang, Jeong-Won; Kang, Sang-Jin – Asia Pacific Education Review, 2009
The purpose of this study was to investigate the methods of estimating the reliability of school-level scores using generalizability theory and multilevel models. Two approaches, "student within schools" and "students within schools and subject areas," were conceptualized and implemented in this study. Four methods resulting from the combination…
Descriptors: Generalizability Theory, Scores, Reliability, Statistical Analysis
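For the "students within schools" approach described above, a minimal sketch (balanced hypothetical design, not the study's multilevel models) estimates between- and within-school variance components from a one-way random-effects ANOVA and forms the generalizability coefficient for a school mean:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical balanced design: 30 schools, 25 students per school.
n_schools, n_students = 30, 25
school_eff = rng.normal(scale=0.6, size=n_schools)  # between-school effects
scores = school_eff[:, None] + rng.normal(scale=1.0,
                                          size=(n_schools, n_students))

# ANOVA estimators of the variance components.
school_means = scores.mean(axis=1)
msb = n_students * school_means.var(ddof=1)   # between-school mean square
msw = scores.var(axis=1, ddof=1).mean()       # within-school mean square
var_between = max((msb - msw) / n_students, 0.0)
var_within = msw

# Reliability (generalizability) of a school mean over n_students students.
rel = var_between / (var_between + var_within / n_students)
print(round(rel, 3))
```

The coefficient rises with students per school because the within-school error term is averaged over the school's sample, which is why school-level scores can be reliable even when student-level scores are noisy.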
Harik, Polina; Clauser, Brian E.; Grabovsky, Irina; Nungester, Ronald J.; Swanson, Dave; Nandakumar, Ratna – Journal of Educational Measurement, 2009
The present study examined the long-term usefulness of estimated parameters used to adjust the scores from a performance assessment to account for differences in rater stringency. Ratings from four components of the USMLE® Step 2 Clinical Skills Examination data were analyzed. A generalizability-theory framework was used to examine the extent to…
Descriptors: Generalizability Theory, Performance Based Assessment, Performance Tests, Clinical Experience

Lee, Guemin; Frisbie, David A. – Applied Measurement in Education, 1999
Studied the appropriateness and implications of using a generalizability theory approach to estimating the reliability of scores from tests composed of testlets. Analyses of data from two national standardization samples suggest that manipulating the number of passages is a more productive way to obtain efficient measurement than manipulating the…
Descriptors: Generalizability Theory, Models, National Surveys, Reliability

Lee, Guemin – Journal of Educational Measurement, 2002
Studied the effects of items, passages, contents, themes, and types of passages on the reliability and standard errors of measurement for complex reading comprehension tests using seven different generalizability theory models. Results suggest that passages and themes should be taken into account when evaluating the reliability of test scores for…
Descriptors: Error of Measurement, Generalizability Theory, Models, Reading Comprehension
Lee, Guemin – 2000
The purpose of this study was to investigate the relative appropriateness of several procedures for estimating reliability and standard errors of measurement of complex reading comprehension tests. Seven generalizability theory models were conceptualized by incorporating one or several factors of items, passages, themes, contents, and types of…
Descriptors: Error of Measurement, Estimation (Mathematics), Generalizability Theory, Models

Longford, N. T. – Journal of Educational and Behavioral Statistics, 1994
Presents a model-based approach to rater reliability for essays read by multiple raters. The approach is motivated by generalizability theory, and variation of rater severity and rater inconsistency is considered in the presence of between-examinee variations. Illustrates methods with data from standardized educational tests. (Author/SLD)
Descriptors: Educational Testing, Essay Tests, Generalizability Theory, Interrater Reliability
Johnson, Robert L.; Penny, James; Gordon, Belita; Shumate, Steven R.; Fisher, Steven P. – Language Assessment Quarterly, 2005
Many studies have indicated that at least 2 raters should score writing assessments to improve interrater reliability. However, even for assessments that characteristically demonstrate high levels of rater agreement, 2 raters of the same essay can occasionally report different, or discrepant, scores. If a single score, typically referred to as an…
Descriptors: Interrater Reliability, Scores, Evaluation, Reliability
Boyd, Donald; Grossman, Pamela; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – National Center for Analysis of Longitudinal Data in Education Research, 2008
Value-added models in education research allow researchers to explore how a wide variety of policies and measured school inputs affect the academic performance of students. Researchers typically quantify the impacts of such interventions in terms of "effect sizes", i.e., the estimated effect of a one standard deviation change in the…
Descriptors: Credentials, Teacher Effectiveness, Models, Teacher Qualifications