Publication Date
In 2025 | 5 |
Since 2024 | 19 |
Since 2021 (last 5 years) | 73 |
Since 2016 (last 10 years) | 176 |
Since 2006 (last 20 years) | 445 |
Descriptor
Generalizability Theory | 734 |
Reliability | 168 |
Scores | 146 |
Error of Measurement | 134 |
Test Reliability | 126 |
Interrater Reliability | 120 |
Foreign Countries | 103 |
Statistical Analysis | 85 |
Evaluation Methods | 83 |
Psychometrics | 75 |
Research Methodology | 68 |
Audience
Researchers | 28 |
Practitioners | 2 |
Policymakers | 1 |
Students | 1 |
Location
Turkey | 14 |
Canada | 10 |
United States | 10 |
California | 9 |
Netherlands | 9 |
Australia | 6 |
Germany | 6 |
South Korea | 6 |
Iowa | 5 |
Norway | 5 |
Turkey (Ankara) | 5 |
Laws, Policies, & Programs
Individuals with Disabilities… | 2 |
No Child Left Behind Act 2001 | 1 |
Oh, Yoonkyung; Osgood, D. Wayne; Smith, Emilie P. – Journal of Early Adolescence, 2015
The importance of afterschool hours for youth development is widely acknowledged, and afterschool settings have recently received increasing attention as an important venue for youth interventions, bringing a growing need for reliable and valid measures of afterschool quality. This study examined the extent to which the two observational tools,…
Descriptors: After School Programs, Program Effectiveness, Observation, Rating Scales
Han, Turgay – International Journal of Progressive Education, 2017
The aim of this study is to examine the variability in and reliability of scores assigned to different quality EFL compositions by EFL instructors and their rating behaviors. Using a mixed research design, quantitative data were collected from EFL instructors' ratings of 30 compositions of three different qualities using a holistic scoring rubric.…
Descriptors: English (Second Language), Writing Evaluation, Scores, Expertise
Swirski, Hani; Baram-Tsabari, Ayelet; Yarden, Anat – International Journal of Science Education, 2018
Context-based approaches can bridge the gap between abstract, difficult science concepts and the world students live in. However, the relevance of specific contexts to different groups of learners, and its stability over time, have not been extensively explored. This study used four datasets, collected in different formal and informal settings, to…
Descriptors: Elementary School Students, Secondary School Students, Student Interests, Learner Engagement
Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015
The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…
Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items
Teker, Gulsen Tasdelen; Guler, Nese; Uyanik, Gulden Kaya – Educational Sciences: Theory and Practice, 2015
Generalizability theory (G theory) provides a broad conceptual framework for social sciences such as psychology and education, and a comprehensive construct for numerous measurement events by using analysis of variance, a strong statistical method. G theory, as an extension of both classical test theory and analysis of variance, is a model which…
Descriptors: Guidelines, Generalizability Theory, Computer Software, Statistical Analysis
Chen, Dezhi; Hu, Bi Ying; Fan, Xitao; Li, Kejian – Journal of Psychoeducational Assessment, 2014
Adapted from the Early Childhood Environment Rating Scale-Revised, the Chinese Early Childhood Program Rating Scale (CECPRS) is a culturally comparable measure for assessing the quality of early childhood education and care programs in the Chinese cultural/social contexts. In this study, 176 kindergarten classrooms were rated with CECPRS on eight…
Descriptors: Foreign Countries, Rating Scales, Early Childhood Education, Educational Environment
Schweig, Jonathan David – Applied Measurement in Education, 2014
Developing indicators that reflect important aspects of school and classroom environments has become central in a nationwide effort to develop comprehensive programs that measure teacher quality and effectiveness. Formulating teacher evaluation policy necessitates accurate and reliable methods for measuring these environmental variables. This…
Descriptors: Error of Measurement, Educational Environment, Classroom Environment, Surveys
Albury, Steven – Sage Research Methods Cases, 2014
This case study takes the perception of ethnography as not being generalisable and questions whether such an understanding is correct and also whether a different perspective on generalisability might be helpful in developing formative approaches to educational technology research. The aim is to encourage discussion about ethnography as a tool for…
Descriptors: Technology Uses in Education, Ethnography, Educational Research, Case Studies
Crawford, Angela R.; Johnson, Evelyn S.; Moylan, Laura A.; Zheng, Yuzhu – Grantee Submission, 2018
This study describes the development and initial psychometric evaluation of a Recognizing Effective Special Education Teachers (RESET) teacher observation instrument. Specifically, the study uses generalizability theory to compare two versions of a rubric, one with general descriptors of performance levels and one with item-specific descriptors of…
Descriptors: Special Education Teachers, Direct Instruction, Observation, Teaching Methods
Uto, Masaki; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2016
As an assessment method based on a constructivist approach, peer assessment has become popular in recent years. However, in peer assessment, a problem remains that reliability depends on the rater characteristics. For this reason, some item response models that incorporate rater parameters have been proposed. Those models are expected to improve…
Descriptors: Item Response Theory, Peer Evaluation, Bayesian Statistics, Simulation
Nielsen, Sara E.; Yezierski, Ellen – Journal of Chemical Education, 2015
Though the Chemistry Self-Concept Inventory (CSCI) was developed to study one aspect of the affective domain in college chemistry students, the instrument on which it was based, the Self-Description Questionnaire III, was developed for use with late adolescents. As such, we explored data generated from administering the CSCI to high school…
Descriptors: High School Students, Secondary School Science, Chemistry, Self Concept Measures
Nolen, Susan Bobbitt; Horn, Ilana Seidel; Ward, Christopher J. – Educational Psychologist, 2015
This article describes a situative approach to studying motivation to learn in social contexts. We begin by contrasting this perspective to more prevalent psychological approaches to the study of motivation, describing epistemological and methodological differences that have constrained conversation between theoretical groups. We elaborate on…
Descriptors: Learning Motivation, Learning Theories, Epistemology, Educational Psychology
Briesch, Amy M.; Volpe, Robert J.; Ferguson, Tyler David – School Psychology Quarterly, 2014
Although generalizability theory has been used increasingly in recent years to investigate the dependability of behavioral estimates, many of these studies have relied on use of general education populations as opposed to those students who are most likely to be referred for assessment due to problematic classroom behavior (e.g., inattention,…
Descriptors: Student Characteristics, Reliability, Data, Student Behavior
Attali, Yigal – Educational and Psychological Measurement, 2014
This article presents a comparative judgment approach for holistically scored constructed response tasks. In this approach, the grader rank-orders (rather than rates) the quality of a small set of responses. A prior automated evaluation of responses guides both set formation and scaling of rankings. Sets are formed to have similar prior scores and…
Descriptors: Responses, Item Response Theory, Scores, Rating Scales
Keuning, Jos; Hemker, Bas – Educational Research and Evaluation, 2014
The data collection of a cohort study requires making many decisions. Each decision may introduce error in the statistical analyses conducted later on. In the present study, a procedure was developed for estimation of the error made due to the composition of the sample, the item selection procedure, and the test equating process. The math results…
Descriptors: Foreign Countries, Cohort Analysis, Statistical Analysis, Error of Measurement