ERIC - Search Results

Showing 1 to 15 of 59 results

The Role of Distributional Overlap on the Precision Gain of Bounds for Generalization

Peer reviewed

Chan, Wendy – American Journal of Evaluation, 2022

Over the past ten years, propensity score methods have made an important contribution to improving generalizations from studies that do not select samples randomly from a population of inference. However, these methods require assumptions and recent work has considered the role of bounding approaches that provide a range of treatment impact…

Descriptors: Probability, Scores, Scoring, Generalization

Comparison of the Results of the Generalizability Theory with the Inter-Rater Agreement Coefficients

Peer reviewed
PDF on ERIC

Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022

The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…

Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory

The Affectability of Writing Assessment Scores: A G-Theory Analysis of Rater, Task, and Scoring Method Contribution

Peer reviewed

Khodi, Ali – Language Testing in Asia, 2021

The present study attempted to investigate factors which affect EFL writing scores through using generalizability theory (G-theory). To this purpose, one hundred and twenty students participated in one independent and one integrated writing task. Subsequently, their performances were scored by six raters: one self-rating, three peer-rating and…

Descriptors: Writing Tests, Scores, Generalizability Theory, English (Second Language)

Development of the Quantitative Modelling Observation Protocol (QMOP) for Undergraduate Biology Courses: Validity Evidence for Score Interpretation and Uses

Peer reviewed

Lyrica Lucas; Anum Khushal; Robert Mayes; Brian A. Couch; Joseph Dauer – International Journal of Science Education, 2025

Educational reform priorities such as emphasis on quantitative modelling (QM) have positioned undergraduate biology instructors as designers of QM experiences to engage students in authentic science practices that support the development of data-driven and evidence-based reasoning. Yet, little is known about how biology instructors adapt to the…

Descriptors: Undergraduate Students, College Science, Biology, Classroom Observation Techniques

Evaluating an Explicit Instruction Teacher Observation Protocol through a Validity Argument Approach

Peer reviewed

Johnson, Evelyn S.; Zheng, Yuzhu; Crawford, Angela R.; Moylan, Laura A. – Journal of Experimental Education, 2022

In this study, we examined the scoring and generalizability assumptions of an explicit instruction (EI) special education teacher observation protocol using many-faceted Rasch measurement (MFRM). Video observations of classroom instruction from 48 special education teachers across four states were collected. External raters (n = 20) were trained…

Descriptors: Direct Instruction, Teacher Education, Classroom Observation Techniques, Validity

Evaluating Human Scoring Using Generalizability Theory

Peer reviewed

Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020

Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…

Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries

Evaluating an Explicit Instruction Teacher Observation Protocol through a Validity Argument Approach

Peer reviewed
PDF on ERIC

Johnson, Evelyn S.; Zheng, Yuzhu; Crawford, Angela R.; Moylan, Laura A. – Grantee Submission, 2020

In this study, we examined the scoring and generalizability assumptions of an Explicit Instruction (EI) special education teacher observation protocol using many-faceted Rasch measurement (MFRM). Video observations of classroom instruction from 48 special education teachers across four states were collected. External raters (n = 20) were trained…

Descriptors: Direct Instruction, Teacher Evaluation, Classroom Observation Techniques, Validity

Preservice Observation in Special Education: A Validation Study

Peer reviewed

Pua, Daisy J.; Peyton, David J.; Brownell, Mary T.; Contesse, Valentina A.; Jones, Nathan D. – Journal of Learning Disabilities, 2021

Advancing teacher candidates' overall competence through use of valid teacher observation systems should be an essential element of teacher preparation. Yet, the field of special education has not provided observation protocols designed specifically for preservice teachers that are founded in theoretical perspectives and research on effective…

Descriptors: Preservice Teachers, Preservice Teacher Education, Observation, Special Education

Reliability of the Analytic Rubric and Checklist for the Assessment of Story Writing Skills: G and Decision Study in Generalizability Theory

Peer reviewed
PDF on ERIC

Uzun, N. Bilge; Alici, Devrim; Aktas, Mehtap – European Journal of Educational Research, 2019

The purpose of this study is to examine the reliability of analytical rubrics and checklists developed for the assessment of story writing skills by means of generalizability theory. The study group consisted of 52 students attending the 5th grade at primary school and 20 raters at Mersin University. The G study was carried out with the fully crossed…

Descriptors: Foreign Countries, Scoring Rubrics, Check Lists, Writing Tests

Using Rater Cognition to Improve Generalizability of an Assessment of Scientific Argumentation

Peer reviewed
PDF on ERIC

Borowiec, Katrina; Castle, Courtney – Practical Assessment, Research & Evaluation, 2019

Rater cognition or "think-aloud" studies have historically been used to enhance rater accuracy and consistency in writing and language assessments. As assessments are developed for new, complex constructs from the "Next Generation Science Standards (NGSS)," the present study illustrates the utility of extending…

Descriptors: Evaluators, Scoring, Scoring Rubrics, Protocol Analysis

Multivariate Generalizability Analysis of Automated Scoring for Short Answer Items of Social Studies in Large-Scale Assessment

Peer reviewed

Sung, Kyung Hee; Noh, Eun Hee; Chon, Kyong Hee – Asia Pacific Education Review, 2017

With increased use of constructed response items in large scale assessments, the cost of scoring has been a major consideration (Noh et al. in KICE Report RRE 2012-6, 2012; Wainer and Thissen in "Applied Measurement in Education" 6:103-118, 1993). In response to the scoring cost issues, various forms of automated system for scoring…

Descriptors: Automation, Scoring, Social Studies, Test Items

The Consistency of "TOEIC"® Speaking Scores across Ratings and Tasks. Research Report. ETS RR-17-46

Peer reviewed
PDF on ERIC

Schmidgall, Jonathan E. – ETS Research Report Series, 2017

This report briefly reviews the design and scoring procedure for the "TOEIC"® Speaking test and summarizes existing evidence about the consistency of TOEIC Speaking test scores. It then describes several analyses conducted using generalizability theory to provide additional information about the consistency of scores across different…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Speech Tests

Dependability of Data Derived from Time Sampling Methods with Multiple Observation Targets

Peer reviewed

Johnson, Austin H.; Chafouleas, Sandra M.; Briesch, Amy M. – School Psychology Quarterly, 2017

In this study, generalizability theory was used to examine the extent to which (a) time-sampling methodology, (b) number of simultaneous behavior targets, and (c) individual raters influenced variance in ratings of academic engagement for an elementary-aged student. Ten graduate-student raters, with an average of 7.20 hr of previous training in…

Descriptors: Generalizability Theory, Sampling, Elementary School Students, Learner Engagement

Using Generalizability Theory to Examine the Dependability of Scores from the Learning Target Rating Scale

Peer reviewed
PDF on ERIC

McLaughlin, Tara W.; Snyder, Patricia A.; Algina, James – Grantee Submission, 2017

The Learning Target Rating Scale (LTRS) is a measure designed to evaluate the quality of teacher-developed learning targets for embedded instruction for early learning. In the present study, we examined the measurement dependability of LTRS scores by conducting a generalizability study (G-study). We used a partially nested, three-facet model to…

Descriptors: Generalizability Theory, Scores, Rating Scales, Evaluation Methods

Working with Sparse Data in Rated Language Tests: Generalizability Theory Applications

Peer reviewed

Lin, Chih-Kai – Language Testing, 2017

Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…

Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy

