Huebner, Alan; Skar, Gustaf B. – Practical Assessment, Research & Evaluation, 2021
Writing assessments often consist of students responding to multiple prompts, which are judged by more than one rater. To establish the reliability of these assessments, there exist different methods to disentangle variation due to prompts and raters, including classical test theory, Many Facet Rasch Measurement (MFRM), and Generalizability Theory…
Descriptors: Error of Measurement, Test Theory, Generalizability Theory, Item Response Theory
Atilgan, Hakan; Demir, Elif Kübra; Ogretmen, Tuncay; Basokcu, Tahsin Oguz – International Journal of Progressive Education, 2020
A critical question is what level of reliability can be achieved when open-ended questions are used in large-scale selection tests. One of the aims of the present study is to determine what the reliability would be in the event that the answers given by test-takers are scored by experts when open-ended short answer questions are used in…
Descriptors: Foreign Countries, Secondary School Students, Test Items, Test Reliability
Keller, Lena; Preckel, Franzis; Brunner, Martin – Journal of Educational Psychology, 2021
It is well-documented that academic achievement is associated with students' self-perceptions of their academic abilities, that is, their academic self-concepts. However, low-achieving students may apply self-protective strategies to maintain a favorable academic self-concept when evaluating their academic abilities. Consequently, the relation…
Descriptors: Correlation, Academic Achievement, High Achievement, Low Achievement
Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013
Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…
Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores
Bloom, Howard S.; Porter, Kristin E. – Society for Research on Educational Effectiveness, 2012
In recent years, the regression discontinuity design (RDD) has gained widespread recognition as a quasi-experimental method that when used correctly, can produce internally valid estimates of causal effects of a treatment, a program or an intervention (hereafter referred to as treatment effects). In an RDD study, subjects or groups of subjects…
Descriptors: Regression (Statistics), Research Design, Computation, Generalizability Theory
Sao Pedro, Michael A.; Baker, Ryan S. J. d.; Gobert, Janice D. – Grantee Submission, 2013
When validating assessment models built with data mining, generalization is typically tested at the student-level, where models are tested on new students. This approach, though, may fail to find cases where model performance suffers if other aspects of those cases relevant to prediction are not well represented. We explore this here by testing if…
Descriptors: Educational Research, Data Collection, Data Analysis, Generalizability Theory
Kang, Yanrong; Moore, Joyce – Online Submission, 2011
Parenting style, as a widely studied topic, has been used by researchers and educators in the US to predict students' academic achievements. Despite its theoretical and practical significance, not much work has been conducted to test the generalizability of parenting research framed in the Western culture to the Chinese population. Parenting styles…
Descriptors: Parenting Styles, Child Rearing, Adolescents, Foreign Countries
Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013
Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…
Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement
Mashburn, Andrew J.; Meyer, J. Patrick; Allen, Joseph P.; Pianta, Robert C. – Educational and Psychological Measurement, 2014
Observational methods are increasingly being used in classrooms to evaluate the quality of teaching. Operational procedures for observing teachers are somewhat arbitrary in existing measures and vary across different instruments. To study the effect of different observation procedures on score reliability and validity, we conducted an experimental…
Descriptors: Observation, Teacher Evaluation, Reliability, Validity
Anderson, Daniel; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2012
In this technical report, we describe the results of a study of mathematics items written to align with the Common Core State Standards (CCSS) in grades 6-8. In each grade, CCSS items were organized into forms, and the reliability of these forms was evaluated along with an experimental form including items aligned with the National Council of…
Descriptors: Curriculum Based Assessment, Mathematics Tests, Academic Standards, State Standards
Guler, Nese; Gelbal, Selahattin – Educational Sciences: Theory and Practice, 2010
In this study, classical test theory and generalizability theory were used to determine the reliability of scores obtained from a measurement tool of mathematics success. Twenty-four open-ended mathematics questions from TIMSS-1999 were administered to 203 students in the spring semester of 2007. The internal consistency of the scores was found to be 0.92. For…
Descriptors: Generalizability Theory, Test Theory, Test Reliability, Interrater Reliability
Harsch, Claudia; Rupp, Andre Alexander – Language Assessment Quarterly, 2011
The "Common European Framework of Reference" (CEFR; Council of Europe, 2001) provides a competency model that is increasingly used as a point of reference to compare language examinations. Nevertheless, aligning examinations to the CEFR proficiency levels remains a challenge. In this article, we propose a new, level-centered approach to…
Descriptors: Language Tests, Writing Tests, Test Construction, Test Items
Newton, Xiaoxia A. – Studies in Educational Evaluation, 2010
This paper reported results from a generalizability study that examined the process of developing classroom practice indicators used to evaluate the impact of a school district's mathematics reform initiative. The study utilized classroom observational data from 32 second, fourth, eighth, and tenth grade teachers. The study addresses important…
Descriptors: Generalizability Theory, Theory Practice Relationship, Program Effectiveness, Grade 10
Yin, Yue; Shavelson, Richard J. – Applied Measurement in Education, 2008
In the first part of this article, the use of Generalizability (G) theory in examining the dependability of concept map assessment scores and designing a concept map assessment for a particular practical application is discussed. In the second part, the application of G theory is demonstrated by comparing the technical qualities of two frequently…
Descriptors: Generalizability Theory, Concept Mapping, Validity, Reliability
Sung, Yao-Ting; Chang, Kuo-En; Chang, Tzyy-Hua; Yu, Wen-Cheng – Journal of Adolescence, 2010
Self- and peer assessments are becoming more popular in classrooms, but there are few data on the reliability and validity of such assessments performed by school children. Because these factors are greatly affected by the number of raters, we conducted two studies to determine the rating behaviours of teenagers in self- and peer assessments, and…
Descriptors: Generalizability Theory, Peer Evaluation, Validity, Reliability