Brennan, Robert L.; Kim, Stella Y.; Lee, Won-Chan – Educational and Psychological Measurement, 2022
This article extends multivariate generalizability theory (MGT) to tests with different random-effects designs for each level of a fixed facet. There are numerous situations in which the design of a test and the resulting data structure are not definable by a single design. One example is mixed-format tests that are composed of multiple-choice and…
Descriptors: Multivariate Analysis, Generalizability Theory, Multiple Choice Tests, Test Construction
Schmidgall, Jonathan E. – ETS Research Report Series, 2017
This report briefly reviews the design and scoring procedure for the "TOEIC"® Speaking test and summarizes existing evidence about the consistency of TOEIC Speaking test scores. It then describes several analyses conducted using generalizability theory to provide additional information about the consistency of scores across different…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Speech Tests
Jensen, Bryant; Grajeda, Sara; Haertel, Edward – Educational Assessment, 2018
We trace the development and analyze the generalizability of the Classroom Assessment of Sociocultural Interactions (CASI), an observation system designed to measure cultural dimensions of classroom interactions. We establish CASI measurement properties by analyzing panoramic videos of 4th and 5th grade classrooms from the Measures of Effective…
Descriptors: Classroom Observation Techniques, Grade 4, Grade 5, Error of Measurement
Li, Feifei – ETS Research Report Series, 2017
An information-correction method for testlet-based tests is introduced. This method takes advantage of both generalizability theory (GT) and item response theory (IRT). The measurement error for the examinee proficiency parameter is often underestimated when a unidimensional conditional-independence IRT model is specified for a testlet dataset. By…
Descriptors: Item Response Theory, Generalizability Theory, Tests, Error of Measurement
Rupp, André A. – Applied Measurement in Education, 2018
This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…
Descriptors: Design, Automation, Scoring, Test Scoring Machines
Peoples, Shelagh M. – AERA Online Paper Repository, 2016
High school graduation is "not yet a reliable indicator of college readiness" (Gaertner & McClarty, 2015, p. 2). As such, researchers are investigating the use of non-cognitive factors as predictors of College and Career Readiness (CCR). The College and Career Readiness Mathematical Practice Scale (CCRMS) was designed to measure…
Descriptors: College Readiness, Career Readiness, Mathematics Instruction, Common Core State Standards
Aydin, Utkun; Ubuz, Behiye – International Journal of Science and Mathematics Education, 2015
Two studies were conducted for the development and validation of a multidimensional test to assess undergraduate students' mathematical thinking about derivative. The first study involved two phases: question generation and refinement of the Thinking-about-Derivative Test (TDT). The second study included four phases as follows: test…
Descriptors: Undergraduate Students, Mathematics Education, Mathematical Concepts, Knowledge Level
van Steensel, Roel; Oostdam, Ron; van Gelderen, Amos – Language Testing, 2013
On the basis of a validation study of a new test for assessing low-achieving adolescents' reading comprehension skills--the SALT-reading--we analyzed two issues relevant to the field of reading test development. Using the test results of 200 seventh graders, we examined the possibility of identifying reading comprehension subskills and the effects…
Descriptors: Adolescents, Low Achievement, Reading Comprehension, Reading Tests
Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013
Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…
Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores
Solano-Flores, Guillermo; Li, Min – Educational Research and Evaluation, 2013
We discuss generalizability (G) theory and the fair and valid assessment of linguistic minorities, especially emergent bilinguals. G theory allows examination of the relationship between score variation and language variation (e.g., variation of proficiency across languages, language modes, and social contexts). Studies examining score variation…
Descriptors: Measurement, Testing, Language Proficiency, Test Construction
Tate, Kevin A.; Rivera, Edil Torres; Conwill, William L.; Miller, M. David; Puig, Ana – Journal for Specialists in Group Work, 2013
There is a clear call in group counseling practice and training for evidence-based practice (ACA, 2005; ASGW, 2008; CACREP, 2009). At the same time, group counselors also are asked to keep clients' experience at the center of their work (ASGW, 2012). This article outlines the authors' effort to develop and study an instrument designed to measure…
Descriptors: Evidence, Group Dynamics, Construct Validity, Group Counseling
Harsch, Claudia; Rupp, Andre Alexander – Language Assessment Quarterly, 2011
The "Common European Framework of Reference" (CEFR; Council of Europe, 2001) provides a competency model that is increasingly used as a point of reference to compare language examinations. Nevertheless, aligning examinations to the CEFR proficiency levels remains a challenge. In this article, we propose a new, level-centered approach to…
Descriptors: Language Tests, Writing Tests, Test Construction, Test Items
Furr, Mike; Bacharach, Verne R. – SAGE Publications (CA), 2007
The authors center their presentation of material around a conceptual understanding of psychometric issues, such as validity and reliability, and on purpose rather than procedure, the "why" rather than the "how to." Their goal is to introduce psychometric principles at a level that is deeper and more focused than found in introductory…
Descriptors: Generalizability Theory, Test Bias, Research Methodology, Testing
Heritage, Margaret; Kim, Jinok; Vendlinski, Terry P.; Herman, Joan L. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2008
Based on the results of a generalizability study (G study) of measures of teacher knowledge for teaching mathematics developed at the National Center for Research on Evaluation, Standards, and Student Testing (CRESST) at the University of California, Los Angeles, this report provides evidence that teachers are better at drawing reasonable…
Descriptors: Generalization, Formative Evaluation, Inferences, Mathematics Instruction

King, Daniel W.; King, Lynda A. – Educational and Psychological Measurement, 1983
A three-facet (items, forms, and testing occasions) random effects generalizability analysis was used to evaluate the precision of each of the five domain measures of the Sex-Role Egalitarianism Scale. The recently developed scale measures attitudes toward the equality of males and females. (Author/PN)
Descriptors: Adults, Attitude Measures, Generalizability Theory, Rating Scales