Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 1 |
Since 2006 (last 20 years) | 13 |
Descriptor
Comparative Analysis | 17 |
Generalizability Theory | 17 |
Reliability | 17 |
Scores | 5 |
Error of Measurement | 4 |
Statistical Analysis | 4 |
Multivariate Analysis | 3 |
Scoring | 3 |
Scoring Rubrics | 3 |
Classroom Environment | 2 |
Correlation | 2 |
Author
Lee, Guemin | 2 |
Brennan, Robert L. | 1 |
Chang, Kuo-En | 1 |
Chang, Tzyy-Hua | 1 |
Chon, Kyong Hee | 1 |
Daniel, Cathy | 1 |
DeBrock, Lindsay | 1 |
Dellinger, Amy | 1 |
Denny, R. Kenton | 1 |
Hakstian, A. Ralph | 1 |
Heilmann, John | 1 |
Publication Type
Journal Articles | 13 |
Reports - Research | 10 |
Reports - Evaluative | 5 |
Dissertations/Theses -… | 2 |
Numerical/Quantitative Data | 1 |
Education Level
Elementary Secondary Education | 3 |
Middle Schools | 2 |
Elementary Education | 1 |
Grade 7 | 1 |
Grade 8 | 1 |
Higher Education | 1 |
Junior High Schools | 1 |
Kindergarten | 1 |
Postsecondary Education | 1 |
Secondary Education | 1 |
Location
North Carolina | 3 |
California | 2 |
Canada | 1 |
Florida | 1 |
Sung, Kyung Hee; Noh, Eun Hee; Chon, Kyong Hee – Asia Pacific Education Review, 2017
With increased use of constructed response items in large scale assessments, the cost of scoring has been a major consideration (Noh et al. in KICE Report RRE 2012-6, 2012; Wainer and Thissen in "Applied Measurement in Education" 6:103-118, 1993). In response to the scoring cost issues, various forms of automated system for scoring…
Descriptors: Automation, Scoring, Social Studies, Test Items
Schweig, Jonathan David – Applied Measurement in Education, 2014
Developing indicators that reflect important aspects of school and classroom environments has become central in a nationwide effort to develop comprehensive programs that measure teacher quality and effectiveness. Formulating teacher evaluation policy necessitates accurate and reliable methods for measuring these environmental variables. This…
Descriptors: Error of Measurement, Educational Environment, Classroom Environment, Surveys
Yang, Yanyun; Oosterhof, Albert; Xia, Yan – Journal of Educational Research, 2015
The authors address the reliability of scores obtained on the summative performance assessments during the pilot year of our research. Contrary to classical test theory, we discussed the advantages of using generalizability theory for estimating reliability of scores for summative performance assessments. Generalizability theory was used as the…
Descriptors: Summative Evaluation, Comparative Analysis, Reliability, Scores
Lee, Guemin; Park, In-Yong – Asia Pacific Education Review, 2012
Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…
Descriptors: Generalizability Theory, Simulation, Computation, Item Response Theory
Shih, Jeffrey C.; Ing, Marsha; Tarr, James E. – Middle Grades Research Journal, 2013
One method to investigate classroom quality is for a person to observe what is happening in the classroom. However, this method raises practical and technical concerns such as how many observations to collect, when to collect these observations and who should collect these observations. The purpose of this study is to provide empirical evidence to…
Descriptors: Observation, Longitudinal Studies, Mathematics, Middle School Teachers
Schweig, Jonathan – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2013
Measuring school and classroom environments has become central in a nation-wide effort to develop comprehensive programs that measure teacher quality and teacher effectiveness. Formulating successful programs necessitates accurate and reliable methods for measuring these environmental variables. This paper uses a generalizability theory framework…
Descriptors: Error of Measurement, Hierarchical Linear Modeling, Educational Environment, Classroom Environment
Heilmann, John; DeBrock, Lindsay; Riley-Tillman, T. Chris – American Journal of Speech-Language Pathology, 2013
Purpose: The purpose of this study was to examine the reliability of, and sources of variability in, language measures from interviews collected from young school-age children. Method: Two 10-min interviews were collected from 20 at-risk kindergarten children by an examiner using a standardized set of questions. Test-retest reliability…
Descriptors: Measures (Individuals), Structured Interviews, Reliability, Kindergarten
Orem, Chris D. – ProQuest LLC, 2012
Meta-assessment, or the assessment of assessment, can provide meaningful information about the trustworthiness of an academic program's assessment results (Bresciani, Gardner, & Hickmott, 2009; Palomba & Banta, 1999; Suskie, 2009). Many institutions conduct meta-assessments for their academic programs (Fulcher, Swain, & Orem, 2012),…
Descriptors: Validity, Evidence, Evaluation Methods, Meta Analysis
Lengh, Carolyn J. – ProQuest LLC, 2010
This study compares the dependability of four classroom assessment scoring methods. Generalizability theory (G) and alternative decision (D) are used to measure the results of students' classroom assessment scores and compare the results of the four scoring methods on variability of rater by person variance and the level of G and D coefficients…
Descriptors: Generalizability Theory, Scoring, Social Studies, Tests
Jeon, Min-Jeong; Lee, Guemin; Hwang, Jeong-Won; Kang, Sang-Jin – Asia Pacific Education Review, 2009
The purpose of this study was to investigate the methods of estimating the reliability of school-level scores using generalizability theory and multilevel models. Two approaches, "student within schools" and "students within schools and subject areas," were conceptualized and implemented in this study. Four methods resulting from the combination…
Descriptors: Generalizability Theory, Scores, Reliability, Statistical Analysis
Sung, Yao-Ting; Chang, Kuo-En; Chang, Tzyy-Hua; Yu, Wen-Cheng – Journal of Adolescence, 2010
Self- and peer assessments are becoming more popular in classrooms, but there are few data on the reliability and validity of such assessments performed by school children. Because these factors are greatly affected by the number of raters, we conducted two studies to determine the rating behaviours of teenagers in self- and peer assessments, and…
Descriptors: Generalizability Theory, Peer Evaluation, Validity, Reliability
Moses, Tim; Kim, Sooyeon – ETS Research Report Series, 2007
This study evaluated the impact of unequal reliability on test equating methods in the nonequivalent groups with anchor test (NEAT) design. Classical true score-based models were compared in terms of their assumptions about how reliability impacts test scores. These models were related to treatment of population ability differences by different…
Descriptors: Reliability, Equated Scores, Test Items, Statistical Analysis
Huang, Jinyan – Assessing Writing, 2008
Using generalizability theory, this study examined both the rating variability and reliability of ESL students' writing in the provincial English examinations in Canada. Three years' data were used in order to complete the analyses and examine the stability of the results. The major research question that guided this study was: Are there any…
Descriptors: Generalizability Theory, Foreign Countries, English (Second Language), Writing Tests

Schroeder, Marsha L.; Hakstian, A. Ralph – Psychometrika, 1990
A 2-facet measurement model is identified, and its coefficient of generalizability (CG) is examined. Three other multifaceted measurement models and their CGs are identified. An empirical investigation of all four procedures is conducted using data from a study of the psychopathology of 71 prison inmates. (SLD)
Descriptors: Comparative Analysis, Equations (Mathematics), Generalizability Theory, Mathematical Models

Marcoulides, George A. – Educational and Psychological Measurement, 1994
Effects of different weighting schemes on selecting the optimal number of observations in multivariate-multifacet generalizability designs are studied when cost constraints are imposed. Comparison of four schemes through simulation indicates that all four produce similar optimal values and that reliability should be similar. (SLD)
Descriptors: Budgeting, Comparative Analysis, Costs, Factor Analysis