Vispoel, Walter P.; Lee, Hyeryung; Xu, Guanlan; Hong, Hyeri – Journal of Experimental Education, 2023
Although generalizability theory (GT) designs have traditionally been analyzed within an ANOVA framework, identical results can be obtained with structural equation models (SEMs) but extended to represent multiple sources of both systematic and measurement error variance, include estimation methods less likely to produce negative variance…
Descriptors: Generalizability Theory, Structural Equation Models, Programming Languages, Scores
Raymond, Mark R.; Jiang, Zhehan – Educational and Psychological Measurement, 2020
Conventional methods for evaluating the utility of subscores rely on traditional indices of reliability and on correlations among subscores. One limitation of correlational methods is that they do not explicitly consider variation in subtest means. An exception is an index of score profile reliability designated as [G], which quantifies the ratio…
Descriptors: Generalizability Theory, Multivariate Analysis, Scores, Reliability
Carbonneau, Kira J.; Van Orman, Dustin S. J.; Lemberger-Truelove, Matthew E.; Atencio, David J. – Early Education and Development, 2020
Research Findings: Given the variable nature of early childhood settings, practitioners and researchers need better guidance on what conditions influence observations conducted within early childhood settings (National Research Council, 2008). Using 230 observations from 23 three- and four-year-old children, we conducted a Generalizability study…
Descriptors: Classroom Environment, Observation, Preschool Children, Influences
Teker, Gülsen Tasdelen; Güler, Nese – International Journal of Assessment Tools in Education, 2019
One of the important theories in education and psychology is Generalizability (G) Theory and various properties distinguish it from the other measurement theories. To better understand methodological trends of G theory, a thematic content analysis was conducted. This study analyzes the studies using generalizability theory in the field of…
Descriptors: Generalizability Theory, Content Analysis, Foreign Countries, Education
Briggs, Derek C.; Alzen, Jessica L. – Educational and Psychological Measurement, 2019
Observation protocol scores are commonly used as status measures to support inferences about teacher practices. When multiple observations are collected for the same teacher over the course of a year, some portion of a teacher's score on each occasion may be attributable to the rater, lesson, and the time of year of the observation. All three of…
Descriptors: Observation, Inferences, Generalizability Theory, Scores
Nalbantoglu Yilmaz, Funda – Educational Sciences: Theory and Practice, 2017
This study aims to determine the reliability of scores obtained from self-, peer-, and teacher-assessments of teaching materials prepared by teacher candidates. The study group consists of 56 teacher candidates. Within the scope of the research, teacher candidates were asked to develop teaching material related to their study.…
Descriptors: Scores, Reliability, Self Evaluation (Individuals), Peer Evaluation
Schmidgall, Jonathan E. – ETS Research Report Series, 2017
This report briefly reviews the design and scoring procedure for the "TOEIC"® Speaking test and summarizes existing evidence about the consistency of TOEIC Speaking test scores. It then describes several analyses conducted using generalizability theory to provide additional information about the consistency of scores across different…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Speech Tests
Martínez, José Felipe; Kloser, Matt; Srinivasan, Jayashri; Stecher, Brian; Edelman, Amanda – Educational and Psychological Measurement, 2022
Adoption of new instructional standards in science demands high-quality information about classroom practice. Teacher portfolios can be used to assess instructional practice and support teacher self-reflection anchored in authentic evidence from classrooms. This study investigated a new type of electronic portfolio tool that allows efficient…
Descriptors: Science Instruction, Academic Standards, Instructional Innovation, Electronic Publishing
McLaughlin, Tara W.; Snyder, Patricia A.; Algina, James – Grantee Submission, 2017
The Learning Target Rating Scale (LTRS) is a measure designed to evaluate the quality of teacher-developed learning targets for embedded instruction for early learning. In the present study, we examined the measurement dependability of LTRS scores by conducting a generalizability study (G-study). We used a partially nested, three-facet model to…
Descriptors: Generalizability Theory, Scores, Rating Scales, Evaluation Methods
DeMars, Christine – Applied Measurement in Education, 2015
In generalizability theory studies in large-scale testing contexts, sometimes a facet is very sparsely crossed with the object of measurement. For example, when assessments are scored by human raters, it may not be practical to have every rater score all students. Sometimes the scoring is systematically designed such that the raters are…
Descriptors: Educational Assessment, Measurement, Data, Generalizability Theory
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
Schmidgall, Jonathan – Applied Measurement in Education, 2017
This study utilizes an argument-based approach to validation to examine the implications of reliability in order to further differentiate the concepts of score and decision consistency. In a methodological example, the framework of generalizability theory was used to estimate appropriate indices of score consistency and evaluations of the…
Descriptors: Scores, Reliability, Validity, Generalizability Theory
Zhang, Bo; Xiao, Yunnan; Luo, Juan – Language Testing in Asia, 2015
Previous studies comparing holistic scoring to analytic scoring of second language writing have given mixed results. Some of them suffer from methodological drawbacks, such as limited writing sample size, limited number of raters, and lack of direct comparison of the two methods. Based on 300 writing samples graded by 14 raters, this research…
Descriptors: Evaluators, Reliability, Scores, Holistic Approach
Han, Turgay – International Journal of Progressive Education, 2017
The aim of this study is to examine the variability in and reliability of scores assigned to different quality EFL compositions by EFL instructors and their rating behaviors. Using a mixed research design, quantitative data were collected from EFL instructors' ratings of 30 compositions of three different qualities using a holistic scoring rubric.…
Descriptors: English (Second Language), Writing Evaluation, Scores, Expertise
Attali, Yigal – Educational and Psychological Measurement, 2014
This article presents a comparative judgment approach for holistically scored constructed response tasks. In this approach, the grader rank-orders (rather than rates) the quality of a small set of responses. A prior automated evaluation of responses guides both set formation and scaling of rankings. Sets are formed to have similar prior scores and…
Descriptors: Responses, Item Response Theory, Scores, Rating Scales