ERIC - Search Results

Showing 1 to 15 of 17 results

Evaluating Human Scoring Using Generalizability Theory

Peer reviewed

Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020

Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…

Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries

The Contribution of International Large-Scale Assessments to Educational Research: Combining Individual and Institutional Data Sources

Peer reviewed

Strietholt, Rolf; Scherer, Ronny – Scandinavian Journal of Educational Research, 2018

The present paper aims to discuss how data from international large-scale assessments (ILSAs) can be utilized and combined, even with other existing data sources, in order to monitor educational outcomes and study the effectiveness of educational systems. We consider different purposes of linking data, namely, extending outcomes measures,…

Descriptors: International Assessment, Group Testing, Outcomes of Education, Outcome Measures

A Review of Research on Project STAR and Path Ahead

Peer reviewed

Sohn, Kitae – School Effectiveness and School Improvement, 2016

Understanding the effects of class size reduction (CSR) has been an enduring issue in education. For the past 3 decades, Project STAR has stimulated research and policy discussions regarding the effects of CSR on a variety of outcomes. Schanzenbach (2007) reviewed STAR studies and concluded that small classes improved student academic outcomes.…

Descriptors: Class Size, Small Classes, Educational Policy, Outcomes of Education

The Number of Feedbacks Needed for Reliable Evaluation. A Multilevel Analysis of the Reliability, Stability and Generalisability of Students' Evaluation of Teaching

Peer reviewed

Rantanen, Pekka – Assessment & Evaluation in Higher Education, 2013

A multilevel analysis approach was used to analyse students' evaluation of teaching (SET). The low value of inter-rater reliability stresses that any solid conclusions on teaching cannot be made on the basis of single feedbacks. To assess a teacher's general teaching effectiveness, one needs to evaluate four randomly chosen course implementations.…

Descriptors: Test Reliability, Feedback (Response), Generalizability Theory, Student Evaluation of Teacher Performance

Interrater Reliability in Content Analysis of Healthcare Service Quality Using Montreal's Conceptual Framework

Peer reviewed

Leclerc, Bernard-Simon; Dassa, Clement – Canadian Journal of Program Evaluation, 2009

This study examines the usefulness of the Montreal Service Concept framework of service quality measurement, when it was used as a predefined set of codes in content analysis of patients' responses. As well, the study quantifies the interrater agreement of coded data. Two raters independently reviewed each of the responses from a mail survey of…

Descriptors: Interrater Reliability, Content Analysis, Health Services, Mail Surveys

Features of Generalising Tasks: Help or Hurdle to Expressing Generality?

Peer reviewed

Chua, Boon Liang – Australian Mathematics Teacher, 2009

Pattern generalising problems offer a very rich context for exploring relationships among quantities, expressing generality and representing the same relationship in different ways. Selecting appropriate tasks for students to work on in class is by no means a straightforward process, but there are ways to handle it. This article aims to explore…

Descriptors: Difficulty Level, Generalizability Theory, Instructional Design, Mathematics Instruction

Are Specialist Certification Examinations a Reliable Measure of Physician Competence?

Peer reviewed

Burch, V. C.; Norman, G. R.; Schmidt, H. G.; van der Vleuten, C. P. M. – Advances in Health Sciences Education, 2008

High stakes postgraduate specialist certification examinations have considerable implications for the future careers of examinees. Medical colleges and professional boards have a social and professional responsibility to ensure their fitness for purpose. To date there is a paucity of published data about the reliability of specialist certification…

Descriptors: Generalizability Theory, Physicians, Foreign Countries, Specialists

Score Generalizability of Academic Writing Tasks: Does One Test Method Fit It All?

Peer reviewed

Gebril, Atta – Language Testing, 2009

Generalizability of writing scores has always been a longstanding concern in L2 writing assessment. A number of studies have been conducted to investigate this topic during the last two decades. However, with the introduction of new test methods, such as reading-to-write tasks, generalizability studies need to focus on the score accuracy of…

Descriptors: Generalizability Theory, Writing Evaluation, Writing Tests, Scores

An Empirical Examination of the Impact of Group Discussion and Examinee Performance Information on Judgments Made in the Angoff Standard-Setting Procedure

Peer reviewed

Clauser, Brian E.; Harik, Polina; Margolis, Melissa J.; McManus, I. C.; Mollon, Jennifer; Chis, Liliana; Williams, Simon – Applied Measurement in Education, 2009

Numerous studies have compared the Angoff standard-setting procedure to other standard-setting methods, but relatively few studies have evaluated the procedure based on internal criteria. This study uses a generalizability theory framework to evaluate the stability of the estimated cut score. To provide a measure of internal consistency, this…

Descriptors: Generalizability Theory, Group Discussion, Standard Setting (Scoring), Scoring

Using Generalizability Theory to Assess the Score Reliability of the Special Ability Selection Examinations for Music Education Programmes in Higher Education

Atilgan, Hakan – International Journal of Research & Method in Education, 2008

The "Special Ability Selection Examination" (SASE), which is used to select appropriate students for the music education departments of educational faculties in Turkey, has many subsections and must evaluate highly competitive cohorts of students according to a broad range of criteria. The test consists of three subsections, with a large…

Descriptors: Generalizability Theory, Schools of Education, Music Education, Music

Assessment of Latent Constructs: A Joint Application of Generalizability Theory and Covariance Modelling with an Emphasis on Inference and Structure.

Peer reviewed

Hagtvet, Knut A. – Scandinavian Journal of Educational Research, 1998

Demonstrates how perspectives from covariance structural modeling and generalizability theory can be combined for a comprehensive assessment of latent constructs. This approach to examining variance components is illustrated by one- and two-facet designs, and can be extended to more complex designs. (MAK)

Descriptors: Analysis of Covariance, Factor Analysis, Foreign Countries, Generalizability Theory

Sampling Errors of Variance Components.

Sanders, Piet F. – 1993

A study on sampling errors of variance components was conducted within the framework of generalizability theory by P. L. Smith (1978). The study used an intuitive approach for solving the problem of how to allocate the number of conditions to different facets in order to produce the most stable estimate of the universe score variance. Optimization…

Descriptors: Decision Making, Equations (Mathematics), Estimation (Mathematics), Foreign Countries

Simultaneous Confidence Intervals for the Linear Functions of Expected Mean Squares used in Generalizability Theory.

Peer reviewed

Bell, John F. – Journal of Educational Statistics, 1986

Khuri's and Satterthwaite's methods of obtaining confidence intervals of variance components are compared. The article discusses that Khuri's method may be applied to obtain confidence intervals for the variance components and other linear functions of the expected mean squares used in generalizability theory. (Author/JAZ)

Descriptors: Analysis of Variance, Elementary Education, Equations (Mathematics), Error of Measurement

Validity Evidence in a University Group Oral Test

Peer reviewed

Van Moere, Alistair – Language Testing, 2006

This article investigates a group oral test as administered at a university in Japan to find if it is appropriate to use scores for higher stakes decision making. It is one component of an in-house English proficiency test used for placing students, evaluating their progress, and making informed decisions for the development of the English…

Descriptors: Foreign Countries, Generalizability Theory, Achievement Tests, English (Second Language)

The Use of Generalizability (G) Theory in the Testing of Linguistic Minorities

Peer reviewed

Solano-Flores, Guillermo; Li, Min – Educational Measurement: Issues and Practice, 2006

We contend that generalizability (G) theory allows the design of psychometric approaches to testing English-language learners (ELLs) that are consistent with current thinking in linguistics. We used G theory to estimate the amount of measurement error due to code (language or dialect). Fourth- and fifth-grade ELLs, native speakers of…

Descriptors: Foreign Countries, Grade 4, Grade 5, English (Second Language)
