Showing 1 to 15 of 17 results
Peer reviewed
Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020
Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…
Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries
Peer reviewed
Strietholt, Rolf; Scherer, Ronny – Scandinavian Journal of Educational Research, 2018
The present paper aims to discuss how data from international large-scale assessments (ILSAs) can be utilized and combined, even with other existing data sources, in order to monitor educational outcomes and study the effectiveness of educational systems. We consider different purposes of linking data, namely, extending outcomes measures,…
Descriptors: International Assessment, Group Testing, Outcomes of Education, Outcome Measures
Peer reviewed
Sohn, Kitae – School Effectiveness and School Improvement, 2016
Understanding the effects of class size reduction (CSR) has been an enduring issue in education. For the past 3 decades, Project STAR has stimulated research and policy discussions regarding the effects of CSR on a variety of outcomes. Schanzenbach (2007) reviewed STAR studies and concluded that small classes improved student academic outcomes.…
Descriptors: Class Size, Small Classes, Educational Policy, Outcomes of Education
Peer reviewed
Rantanen, Pekka – Assessment & Evaluation in Higher Education, 2013
A multilevel analysis approach was used to analyse students' evaluation of teaching (SET). The low value of inter-rater reliability stresses that any solid conclusions on teaching cannot be made on the basis of single feedbacks. To assess a teacher's general teaching effectiveness, one needs to evaluate four randomly chosen course implementations.…
Descriptors: Test Reliability, Feedback (Response), Generalizability Theory, Student Evaluation of Teacher Performance
Peer reviewed
Leclerc, Bernard-Simon; Dassa, Clement – Canadian Journal of Program Evaluation, 2009
This study examines the usefulness of the Montreal Service Concept framework of service quality measurement, when it was used as a predefined set of codes in content analysis of patients' responses. As well, the study quantifies the interrater agreement of coded data. Two raters independently reviewed each of the responses from a mail survey of…
Descriptors: Interrater Reliability, Content Analysis, Health Services, Mail Surveys
Chua, Boon Liang – Australian Mathematics Teacher, 2009
Pattern generalising problems offer a very rich context for exploring relationships among quantities, expressing generality and representing the same relationship in different ways. Selecting appropriate tasks for students to work on in class is by no means a straightforward process, but there are ways to handle it. This article aims to explore…
Descriptors: Difficulty Level, Generalizability Theory, Instructional Design, Mathematics Instruction
Peer reviewed
Burch, V. C.; Norman, G. R.; Schmidt, H. G.; van der Vleuten, C. P. M. – Advances in Health Sciences Education, 2008
High stakes postgraduate specialist certification examinations have considerable implications for the future careers of examinees. Medical colleges and professional boards have a social and professional responsibility to ensure their fitness for purpose. To date there is a paucity of published data about the reliability of specialist certification…
Descriptors: Generalizability Theory, Physicians, Foreign Countries, Specialists
Peer reviewed
Gebril, Atta – Language Testing, 2009
Generalizability of writing scores has always been a longstanding concern in L2 writing assessment. A number of studies have been conducted to investigate this topic during the last two decades. However, with the introduction of new test methods, such as reading-to-write tasks, generalizability studies need to focus on the score accuracy of…
Descriptors: Generalizability Theory, Writing Evaluation, Writing Tests, Scores
Peer reviewed
Clauser, Brian E.; Harik, Polina; Margolis, Melissa J.; McManus, I. C.; Mollon, Jennifer; Chis, Liliana; Williams, Simon – Applied Measurement in Education, 2009
Numerous studies have compared the Angoff standard-setting procedure to other standard-setting methods, but relatively few studies have evaluated the procedure based on internal criteria. This study uses a generalizability theory framework to evaluate the stability of the estimated cut score. To provide a measure of internal consistency, this…
Descriptors: Generalizability Theory, Group Discussion, Standard Setting (Scoring), Scoring
Atilgan, Hakan – International Journal of Research & Method in Education, 2008
The "Special Ability Selection Examination" (SASE), which is used to select appropriate students for the music education departments of educational faculties in Turkey, has many subsections and must evaluate highly competitive cohorts of students according to a broad range of criteria. The test consists of three subsections, with a large…
Descriptors: Generalizability Theory, Schools of Education, Music Education, Music
Peer reviewed
Hagtvet, Knut A. – Scandinavian Journal of Educational Research, 1998
Demonstrates how perspectives from covariance structural modeling and generalizability theory can be combined for a comprehensive assessment of latent constructs. This approach to examining variance components is illustrated by one- and two- facet designs, and can be extended to more complex designs. (MAK)
Descriptors: Analysis of Covariance, Factor Analysis, Foreign Countries, Generalizability Theory
Sanders, Piet F. – 1993
A study on sampling errors of variance components was conducted within the framework of generalizability theory by P. L. Smith (1978). The study used an intuitive approach for solving the problem of how to allocate the number of conditions to different facets in order to produce the most stable estimate of the universe score variance. Optimization…
Descriptors: Decision Making, Equations (Mathematics), Estimation (Mathematics), Foreign Countries
Peer reviewed
Bell, John F. – Journal of Educational Statistics, 1986
Khuri's and Satterthwaite's methods of obtaining confidence intervals of variance components are compared. The article discusses that Khuri's method may be applied to obtain confidence intervals for the variance components and other linear functions of the expected mean squares used in generalizability theory. (Author/JAZ)
Descriptors: Analysis of Variance, Elementary Education, Equations (Mathematics), Error of Measurement
Peer reviewed
Van Moere, Alistair – Language Testing, 2006
This article investigates a group oral test as administered at a university in Japan to find if it is appropriate to use scores for higher stakes decision making. It is one component of an in-house English proficiency test used for placing students, evaluating their progress, and making informed decisions for the development of the English…
Descriptors: Foreign Countries, Generalizability Theory, Achievement Tests, English (Second Language)
Peer reviewed
Solano-Flores, Guillermo; Li, Min – Educational Measurement: Issues and Practice, 2006
We contend that generalizability (G) theory allows the design of psychometric approaches to testing English-language learners (ELLs) that are consistent with current thinking in linguistics. We used G theory to estimate the amount of measurement error due to code (language or dialect). Fourth- and fifth-grade ELLs, native speakers of…
Descriptors: Foreign Countries, Grade 4, Grade 5, English (Second Language)