Brennan, Robert L.; Kim, Stella Y.; Lee, Won-Chan – Educational and Psychological Measurement, 2022
This article extends multivariate generalizability theory (MGT) to tests with different random-effects designs for each level of a fixed facet. There are numerous situations in which the design of a test and the resulting data structure are not definable by a single design. One example is mixed-format tests that are composed of multiple-choice and…
Descriptors: Multivariate Analysis, Generalizability Theory, Multiple Choice Tests, Test Construction
Deniz, Kaan Zulfikar; Ilican, Emel – International Journal of Assessment Tools in Education, 2021
This study compares the G and Phi coefficients estimated by D studies for a measurement tool with the G and Phi coefficients obtained from real cases in which items of differing difficulty levels were added. It also aims to determine the conditions under which the D studies estimated reliability coefficients closer to reality. The study group…
Descriptors: Generalizability Theory, Test Items, Difficulty Level, Test Reliability
Atilgan, Hakan; Demir, Elif Kübra; Ogretmen, Tuncay; Basokcu, Tahsin Oguz – International Journal of Progressive Education, 2020
What the reliability level would be when open-ended questions are used in large-scale selection tests has become a critical question. One aim of the present study is to determine the reliability obtained when experts score the answers given by test-takers to open-ended short-answer questions used in…
Descriptors: Foreign Countries, Secondary School Students, Test Items, Test Reliability
Zaidi, Nikki L.; Swoboda, Christopher M.; Kelcey, Benjamin M.; Manuel, R. Stephen – Advances in Health Sciences Education, 2017
The extant literature has largely ignored a potentially significant source of variance in multiple mini-interview (MMI) scores by "hiding" the variance attributable to the sample of attributes used on an evaluation form. This potential source of hidden variance can be defined as rating items, which typically comprise an MMI evaluation…
Descriptors: Interviews, Scores, Generalizability Theory, Monte Carlo Methods
Byram, Jessica N.; Seifert, Mark F.; Brooks, William S.; Fraser-Cotlin, Laura; Thorp, Laura E.; Williams, James M.; Wilson, Adam B. – Anatomical Sciences Education, 2017
With integrated curricula and multidisciplinary assessments becoming more prevalent in medical education, there is a continued need for educational research to explore the advantages, consequences, and challenges of integration practices. This retrospective analysis investigated the number of items needed to reliably assess anatomical knowledge in…
Descriptors: Anatomy, Science Tests, Test Items, Test Reliability
Teker, Gulsen Tasdelen; Dogan, Nuri – Educational Sciences: Theory and Practice, 2015
In this study, reliability and differential item functioning (DIF) analyses were conducted on testlets displaying local item dependence. The data set was obtained from the answers given by 1,500 students to the 20 items included in six testlets given in the English Proficiency Exam by the School of Foreign Languages of a state…
Descriptors: Foreign Countries, Test Items, Test Bias, Item Response Theory
Daniels, Vijay J.; Bordage, Georges; Gierl, Mark J.; Yudkowsky, Rachel – Advances in Health Sciences Education, 2014
Objective structured clinical examinations (OSCEs) are used worldwide for summative examinations but often lack acceptable reliability. Research has shown that reliability of scores increases if OSCE checklists for medical students include only clinically relevant items. Also, checklists are often missing evidence-based items that high-achieving…
Descriptors: Graduate Medical Education, Check Lists, Scores, Internal Medicine
Wren, David; Barbera, Jack – Chemistry Education Research and Practice, 2014
Assessing conceptual understanding of foundational topics before instruction on higher-order concepts can provide chemical educators with information to aid instructional design. This study provides an instrument that can be used to identify students' alternative conceptions regarding thermochemistry concepts. The Thermochemistry Concept Inventory…
Descriptors: Psychometrics, Thermodynamics, Chemistry, Item Response Theory
Anderson, Daniel; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2012
In this technical report, we describe the results of a study of mathematics items written to align with the Common Core State Standards (CCSS) in grades 6-8. In each grade, CCSS items were organized into forms, and the reliability of these forms was evaluated along with an experimental form including items aligned with the National Council of…
Descriptors: Curriculum Based Assessment, Mathematics Tests, Academic Standards, State Standards
Guler, Nese; Gelbal, Selahattin – Educational Sciences: Theory and Practice, 2010
In this study, classical test theory and generalizability theory were used to determine the reliability of scores obtained from a measurement tool of mathematics success. Twenty-four open-ended mathematics questions from TIMSS-1999 were administered to 203 students in the spring semester of 2007. The internal consistency of the scores was found to be 0.92. For…
Descriptors: Generalizability Theory, Test Theory, Test Reliability, Interrater Reliability

Conger, Anthony J. – Educational and Psychological Measurement, 1983
A paradoxical phenomenon of decreases in reliability as the number of elements averaged over increases is shown to be possible in multifacet reliability procedures (intraclass correlations or generalizability coefficients). Conditions governing this phenomenon are presented along with implications and cautions. (Author)
Descriptors: Generalizability Theory, Test Construction, Test Items, Test Length
van Weeren, J.; Theunissen, T. J. J. M. – 1986
Pronunciation is regarded as a valuable subskill in foreign language teaching and testing. Its quality is commonly assessed in a global way by having examinees read aloud. An atomistic test is a more systematic and explicit approach. Such a test would consist of about 40 items, use recorded performances, and draw on an inventory of pronunciation…
Descriptors: Audiotape Recordings, Error Patterns, French, Generalizability Theory
Gonzalez-Tamayo, Eulogio – 1987
The concepts of universe of admissible observation and universe of generalization from the generalizability theory were applied to calculate the intraclass correlation coefficient of a licensure test. The internal consistency coefficient of a dichotomously scored test is identical to the intraclass correlation coefficient of a two-facet design.…
Descriptors: Adults, Analysis of Variance, Content Validity, Criterion Referenced Tests