Vispoel, Walter P.; Tao, Shuqin – Psychological Assessment, 2013
Our goal in this investigation was to evaluate the reliability of scores from the Balanced Inventory of Desirable Responding (BIDR) more comprehensively than in prior research using a generalizability-theory framework based on both dichotomous and polytomous scoring of items. Generalizability coefficients accounting for specific-factor, transient,…
Descriptors: Reliability, Scores, Measures (Individuals), Generalizability Theory
Lee, Guemin; Park, In-Yong – Asia Pacific Education Review, 2012
Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…
Descriptors: Generalizability Theory, Simulation, Computation, Item Response Theory
Haertel, Edward H. – Educational Testing Service, 2013
Policymakers and school administrators have embraced value-added models of teacher effectiveness as tools for educational improvement. Teacher value-added estimates may be viewed as complicated scores of a certain kind. This suggests using a test validation model to examine their reliability and validity. Validation begins with an interpretive…
Descriptors: Reliability, Validity, Inferences, Teacher Effectiveness
Jeon, Min-Jeong; Lee, Guemin; Hwang, Jeong-Won; Kang, Sang-Jin – Asia Pacific Education Review, 2009
The purpose of this study was to investigate the methods of estimating the reliability of school-level scores using generalizability theory and multilevel models. Two approaches, "student within schools" and "students within schools and subject areas," were conceptualized and implemented in this study. Four methods resulting from the combination…
Descriptors: Generalizability Theory, Scores, Reliability, Statistical Analysis
Kim, Youn-Hee – Applied Linguistics, 2009
The current status of English as an international language has come with challenges to the native speaker norms and raised the relevance of localized varieties in language assessment. This preliminary study investigates whether native English-speaking (NS) and non-native English-speaking (NNS) raters differ in their effect on score reliability in…
Descriptors: Generalizability Theory, Speech Communication, Native Speakers, English (Second Language)
Sijtsma, Klaas – International Journal of Testing, 2009
This article reviews three topics from test theory that continue to raise discussion and controversy and capture test theorists' and constructors' interest. The first topic concerns the discussion of the methodology of investigating and establishing construct validity; the second topic concerns reliability and its misuse, alternative definitions…
Descriptors: Construct Validity, Reliability, Classification, Test Theory
Kaufman, James C.; Lee, Joohyun; Baer, John; Lee, Soonmook – Thinking Skills and Creativity, 2007
The consensual assessment technique (CAT) is a measurement tool for creativity research in which appropriate experts evaluate creative products [Amabile, T. M. (1996). "Creativity in context: Update to the social psychology of creativity." Boulder, CO: Westview]. However, the CAT is hampered by the time-consuming nature of the products (asking…
Descriptors: Creativity, Reliability, Generalizability Theory, Measurement Techniques
Sung, Yao-Ting; Chang, Kuo-En; Chang, Tzyy-Hua; Yu, Wen-Cheng – Journal of Adolescence, 2010
Self- and peer assessments are becoming more popular in classrooms, but there are few data on the reliability and validity of such assessments performed by school children. Because these factors are greatly affected by the number of raters, we conducted two studies to determine the rating behaviours of teenagers in self- and peer assessments, and…
Descriptors: Generalizability Theory, Peer Evaluation, Validity, Reliability
Hagemann, Dirk; Meyerhoff, David – Structural Equation Modeling: A Multidisciplinary Journal, 2008
The latent state-trait (LST) theory is an extension of the classical test theory that allows one to decompose a test score into a true trait, a true state residual, and an error component. For practical applications, the variances of these latent variables may be estimated with standard methods of structural equation modeling (SEM). These…
Descriptors: Structural Equation Models, Test Theory, Reliability, Sample Size
Burch, V. C.; Norman, G. R.; Schmidt, H. G.; van der Vleuten, C. P. M. – Advances in Health Sciences Education, 2008
High stakes postgraduate specialist certification examinations have considerable implications for the future careers of examinees. Medical colleges and professional boards have a social and professional responsibility to ensure their fitness for purpose. To date there is a paucity of published data about the reliability of specialist certification…
Descriptors: Generalizability Theory, Physicians, Foreign Countries, Specialists
Solano-Flores, Guillermo – Educational Researcher, 2008
The testing of English language learners (ELLs) is, to a large extent, a random process because of poor implementation and factors that are uncertain or beyond control. Yet current testing practices and policies appear to be based on deterministic views of language and linguistic groups and erroneous assumptions about the capacity of assessment…
Descriptors: Generalizability Theory, Testing, Second Language Learning, Error of Measurement
Martinez, Jose Felipe; Goldschmidt, Pete; Niemi, David; Baker, Eva L.; Sylvester, Roxanne M. – Educational Assessment, 2007
We conducted generalizability studies to examine the extent to which ratings of language arts performance assignments, administered in a large, diverse, urban district to students in second through ninth grades, result in reliable and precise estimates of true student performance. The results highlight three important points when considering the…
Descriptors: Assignments, Language Arts, Academic Achievement, Urban Areas
Arnold, Margery E. – Research in the Schools, 1996
This paper explains how different factors affect classical reliability estimates, such as test-retest, interrater, internal consistency, and equivalent forms coefficients. The limitations of classical test theory are explored, and the advantages of generalizability theory are discussed. Concrete examples are used. (SLD)
Descriptors: Estimation (Mathematics), Generalizability Theory, Reliability, Test Theory
Loftin, Lynn B. – 1991
Cross-validation, an economical method for assessing whether sample results will generalize, is discussed in this paper. Cross-validation is an invariance technique that uses two subsets of the data sample to derive discriminant function coefficients. The two sets of coefficients are then used with each data subset to derive discriminant function…
Descriptors: Computer Simulation, Discriminant Analysis, Generalizability Theory, Mathematical Models

Lee, Guemin; Frisbie, David A. – Applied Measurement in Education, 1999
Studied the appropriateness and implications of using a generalizability theory approach to estimating the reliability of scores from tests composed of testlets. Analyses of data from two national standardization samples suggest that manipulating the number of passages is a more productive way to obtain efficient measurement than manipulating the…
Descriptors: Generalizability Theory, Models, National Surveys, Reliability