Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 6 |
Since 2006 (last 20 years) | 12 |
Descriptor
Source
Author
Algina, James | 1 |
Bell, Courtney A. | 1 |
Brown, Timothy T. | 1 |
Casabianca, Jodi M. | 1 |
Cetin, Bayram | 1 |
Chon, Kyong Hee | 1 |
Daniel, Cathy | 1 |
De Kruif, Renee E. L. | 1 |
Dellinger, Amy | 1 |
Denny, R. Kenton | 1 |
Gitomer, Drew H. | 1 |
More ▼ |
Publication Type
Reports - Research | 14 |
Journal Articles | 13 |
Speeches/Meeting Papers | 3 |
Reports - Evaluative | 2 |
Dissertations/Theses -… | 1 |
Tests/Questionnaires | 1 |
Education Level
Secondary Education | 3 |
Early Childhood Education | 1 |
Elementary Education | 1 |
Grade 4 | 1 |
Grade 5 | 1 |
Grade 7 | 1 |
Higher Education | 1 |
Junior High Schools | 1 |
Middle Schools | 1 |
Postsecondary Education | 1 |
Preschool Education | 1 |
More ▼ |
Audience
Laws, Policies, & Programs
Assessments and Surveys
National Assessment of… | 1 |
Test of English as a Foreign… | 1 |
Test of English for… | 1 |
What Works Clearinghouse Rating
Sung, Kyung Hee; Noh, Eun Hee; Chon, Kyong Hee – Asia Pacific Education Review, 2017
With increased use of constructed response items in large scale assessments, the cost of scoring has been a major consideration (Noh et al. in KICE Report RRE 2012-6, 2012; Wainer and Thissen in "Applied Measurement in Education" 6:103-118, 1993). In response to the scoring cost issues, various forms of automated system for scoring…
Descriptors: Automation, Scoring, Social Studies, Test Items
Schmidgall, Jonathan E. – ETS Research Report Series, 2017
This report briefly reviews the design and scoring procedure for the "TOEIC"® Speaking test and summarizes existing evidence about the consistency of TOEIC Speaking test scores. It then describes several analyses conducted using generalizability theory to provide additional information about the consistency of scores across different…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Speech Tests
McLaughlin, Tara W.; Snyder, Patricia A.; Algina, James – Grantee Submission, 2017
The Learning Target Rating Scale (LTRS) is a measure designed to evaluate the quality of teacher-developed learning targets for embedded instruction for early learning. In the present study, we examined the measurement dependability of LTRS scores by conducting a generalizability study (G-study). We used a partially nested, three-facet model to…
Descriptors: Generalizability Theory, Scores, Rating Scales, Evaluation Methods
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
Huijgen, Tim; van de Grift, Wim; van Boxtel, Carla; Holthuis, Paul – European Journal of Psychology of Education, 2017
Since the 1970s, many observation instruments have been constructed to map teachers' general pedagogic competencies. However, few of these instruments focus on teachers' subject-specific competencies. This study presents the development of the "Framework for Analyzing the Teaching of Historical Contextualization" (FAT-HC). This…
Descriptors: Measures (Individuals), History Instruction, Teacher Competencies, Content Validity
Zhang, Bo; Xiao, Yunnan; Luo, Juan – Language Testing in Asia, 2015
Previous studies comparing holistic scoring to analytic scoring of second language writing have given mixed results. Some of them suffer from methodological drawbacks, such as limited writing sample size, limited number of raters, and lack of direct comparison of the two methods. Based on 300 writing samples graded by 14 raters, this research…
Descriptors: Evaluators, Reliability, Scores, Holistic Approach
Cetin, Bayram; Guler, Nese; Sarica, Rabia – Eurasian Journal of Educational Research, 2016
Problem Statement: In addition to being teaching tools, concept maps can be used as effective assessment tools. The use of concept maps for assessment has raised the issue of scoring them. Concept maps generated and used in different ways can be scored via various methods. Holistic and relational scoring methods are two of them. Purpose of the…
Descriptors: Generalizability Theory, Concept Mapping, Scoring, Scoring Formulas
Vispoel, Walter P.; Tao, Shuqin – Psychological Assessment, 2013
Our goal in this investigation was to evaluate the reliability of scores from the Balanced Inventory of Desirable Responding (BIDR) more comprehensively than in prior research using a generalizability-theory framework based on both dichotomous and polytomous scoring of items. Generalizability coefficients accounting for specific-factor, transient,…
Descriptors: Reliability, Scores, Measures (Individuals), Generalizability Theory
Haertel, Edward H. – Educational Testing Service, 2013
Policymakers and school administrators have embraced value-added models of teacher effectiveness as tools for educational improvement. Teacher value-added estimates may be viewed as complicated scores of a certain kind. This suggests using a test validation model to examine their reliability and validity. Validation begins with an interpretive…
Descriptors: Reliability, Validity, Inferences, Teacher Effectiveness
Casabianca, Jodi M.; McCaffrey, Daniel F.; Gitomer, Drew H.; Bell, Courtney A.; Hamre, Bridget K.; Pianta, Robert C. – Educational and Psychological Measurement, 2013
Classroom observation of teachers is a significant part of educational measurement; measurements of teacher practice are being used in teacher evaluation systems across the country. This research investigated whether observations made live in the classroom and from video recording of the same lessons yielded similar inferences about teaching.…
Descriptors: Secondary School Mathematics, Mathematics Instruction, Classroom Observation Techniques, Algebra
Kachchaf, Rachel; Solano-Flores, Guillermo – Applied Measurement in Education, 2012
We examined how rater language background affects the scoring of short-answer, open-ended test items in the assessment of English language learners (ELLs). Four native English and four native Spanish-speaking certified bilingual teachers scored 107 responses of fourth- and fifth-grade Spanish-speaking ELLs to mathematics items administered in…
Descriptors: Error of Measurement, English Language Learners, Scoring, Bilingual Teachers
Lengh, Carolyn J. – ProQuest LLC, 2010
This study compares the dependability of four classroom assessment scoring methods. Generalizability theory (G) and alternative decision (D) are used to measure the results of students' classroom assessment scores and compare the results of the four scoring methods on variability of rater by person variance and the level of G and D coefficients…
Descriptors: Generalizability Theory, Scoring, Social Studies, Tests

Swartz, Carl W.; Hooper, Stephen R.; Mongomery, James W.; Wakely, Melissa B.; De Kruif, Renee E. L.; Reed, Martha; Brown, Timothy T.; Levine, Melvin D.; White, Kinnard P. – Educational and Psychological Measurement, 1999
Used generalizability theory to investigate the impact of the number of raters and the type of decision (relative versus absolute) on the reliability of writing scores. Results from 251 middle school students and 20 intermediate grade students show that reliability coefficients decline as the number of raters declines and when absolute decisions…
Descriptors: Estimation (Mathematics), Generalizability Theory, Holistic Evaluation, Intermediate Grades
Jiang, Ying Hong; Smith, Philip L. – 2000
With a construct-centered reliability analytical approach the reliability analysis should crystallize the multi-traits or constructs that the test specialists developed to measure from student performance and then estimate the degree of fit between the theoretical expectations of test developers and the performance exhibited by students. This…
Descriptors: Cognitive Tests, Construct Validity, Elementary Education, Elementary School Students
Lee, Yong-Won; Kantor, Robert – ETS Research Report Series, 2005
Possible integrated and independent tasks were pilot tested for the writing section of a new generation of TOEFL® (Test of English as a Foreign Language™) examination. This study examines the impact of various rating designs as well as the impact of the number of tasks and raters on the reliability of writing scores based on integrated and…
Descriptors: Language Tests, English (Second Language), Second Language Learning, Writing Tests
Previous Page | Next Page »
Pages: 1 | 2