ERIC - Search Results

Showing 1 to 15 of 26 results

The Generalizability of Running Record Accuracy and Self-Correction Scores

Peer reviewed

D'Agostino, Jerome V.; Rodgers, Emily; Winkler, Christa; Johnson, Tracy; Berenbon, Rebecca – Reading Psychology, 2021

Running Records provide a standardized method for recording and assessing students' oral reading behaviors and are excellent formative assessment tools to guide instructional decision-making. This study expands on prior Running Record reliability work by evaluating the extent to which external raters and teachers consistently assessed students'…

Descriptors: Accuracy, Oral Reading, Generalizability Theory, Error Correction

Examining the Reliability of Scores from a Performance Assessment of Practice-Based Competencies

Peer reviewed

Roduta Roberts, Mary; Alves, Cecilia Brito; Werther, Karin; Bahry, Louise M. – Journal of Psychoeducational Assessment, 2019

The purpose of this study was to examine the reliability and sources of score variation from a performance assessment of practice competencies within an occupational therapy program. Data from 99 students who participated in a practical exam were examined. A generalizability analysis of analytic, total, and overall holistic scores was completed…

Descriptors: Performance Based Assessment, Test Reliability, Scores, Occupational Therapy

A Generalizability Theory Study to Examine Sources of Score Variance in Third-Party Evaluations Used in Decision-Making for Graduate School Admissions. ETS GRE® Board Research Report. ETS GRE®-18-03. ETS RR-18-37

Peer reviewed

McCaffrey, Daniel F.; Oliveri, Maria Elena; Holtzman, Steven – ETS Research Report Series, 2018

Scores from noncognitive measures are increasingly valued for their utility in helping to inform postsecondary admissions decisions. However, their use has presented challenges because of faking, response biases, or subjectivity, which standardized third-party evaluations (TPEs) can help minimize. Analysts and researchers using TPEs, however, need…

Descriptors: Generalizability Theory, Scores, College Admission, Admission Criteria

The Exchangeability of Brief Intelligence Tests for Children with Intellectual Giftedness: Illuminating Error Variance Components' Influence on IQs

Peer reviewed

Irby, Sarah M.; Floyd, Randy G. – Psychology in the Schools, 2017

This study examined the exchangeability of total scores (i.e., intelligence quotients [IQs]) from three brief intelligence tests. Tests were administered to 36 children with intellectual giftedness, scored live by one set of primary examiners and later scored by a secondary examiner. For each student, six IQs were calculated, and all 216 values…

Descriptors: Intelligence Tests, Gifted, Error of Measurement, Scores

Using Generalizability Theory to Assess the Score Reliability of Communication Skills of Dentistry Students

Peer reviewed

Uzun, N. Bilge; Aktas, Mehtap; Asiret, Semih; Yormaz, Seha – Asian Journal of Education and Training, 2018

The goal of this study is to determine the reliability of dentistry students' performance scores on communication skills and to examine scoring reliability using generalizability theory with balanced data in a mixed design (random and fixed facets), also considering the interactions among student, rater, and task. The study group of the research…

Descriptors: Foreign Countries, Generalizability Theory, Scores, Test Reliability

Working with Sparse Data in Rated Language Tests: Generalizability Theory Applications

Peer reviewed

Lin, Chih-Kai – Language Testing, 2017

Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…

Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy

Inter-Rater Reliability and Generalizability of Patient Note Scores Using a Scoring Rubric Based on the USMLE Step-2 CS Format

Peer reviewed

Park, Yoon Soo; Hyderi, Abbas; Bordage, Georges; Xing, Kuan; Yudkowsky, Rachel – Advances in Health Sciences Education, 2016

Recent changes to the patient note (PN) format of the United States Medical Licensing Examination have challenged medical schools to improve the instruction and assessment of students taking the Step-2 clinical skills examination. The purpose of this study was to gather validity evidence regarding response process and internal structure, focusing…

Descriptors: Interrater Reliability, Generalizability Theory, Licensing Examinations (Professions), Physicians

A Comparison of Newly-Trained and Experienced Raters on a Standardized Writing Assessment

Peer reviewed

Attali, Yigal – Language Testing, 2016

A short training program for evaluating responses to an essay writing task consisted of scoring 20 training essays with immediate feedback about the correct score. The same scoring session also served as a certification test for trainees. Participants with little or no previous rating experience completed this session and 14 trainees who passed an…

Descriptors: Writing Evaluation, Writing Tests, Standardized Tests, Evaluators

Measuring Rater Reliability on a Special Education Observation Tool

Peer reviewed

Semmelroth, Carrie Lisa; Johnson, Evelyn – Assessment for Effective Intervention, 2014

This study used generalizability theory to measure reliability on the Recognizing Effective Special Education Teachers (RESET) observation tool designed to evaluate special education teacher effectiveness. At the time of this study, the RESET tool included three evidence-based instructional practices (direct, explicit instruction; whole-group…

Descriptors: Observation, Special Education Teachers, Teacher Effectiveness, Teacher Evaluation

Investigating Score Dependability in English/Chinese Interpreter Certification Performance Testing: A Generalizability Theory Approach

Peer reviewed

Han, Chao – Language Assessment Quarterly, 2016

As a property of test scores, reliability/dependability constitutes an important psychometric consideration, and it underpins the validity of measurement results. A review of interpreter certification performance tests (ICPTs) reveals that (a) although reliability/dependability checking has been recognized as an important concern, its theoretical…

Descriptors: Foreign Countries, Scores, English, Chinese

Utilizing Generalizability Theory to Investigate the Reliability of the Grades Assigned to Undergraduate Research Papers

Peer reviewed

Gugiu, Mihaiela R.; Gugiu, Paul C.; Baldus, Robert – Journal of MultiDisciplinary Evaluation, 2012

Background: Educational researchers have long espoused the virtues of writing with regard to student cognitive skills. However, research on the reliability of the grades assigned to written papers reveals a high degree of contradiction, with some researchers concluding that the grades assigned are very reliable whereas others suggest that they…

Descriptors: Grades (Scholastic), Grading, Scoring Rubrics, Research Design

When Rater Reliability Is Not Enough: Teacher Observation Systems and a Case for the Generalizability Study

Peer reviewed

Hill, Heather C.; Charalambous, Charalambos Y.; Kraft, Matthew A. – Educational Researcher, 2012

In recent years, interest has grown in using classroom observation as a means to several ends, including teacher development, teacher evaluation, and impact evaluation of classroom-based interventions. Although education practitioners and researchers have developed numerous observational instruments for these purposes, many developers fail to…

Descriptors: Generalizability Theory, Observation, Classroom Observation Techniques, Evaluation

Studying Reliability of Open Ended Mathematics Items According to the Classical Test Theory and Generalizability Theory

Peer reviewed

Guler, Nese; Gelbal, Selahattin – Educational Sciences: Theory and Practice, 2010

In this study, classical test theory and generalizability theory were used to determine the reliability of scores obtained from a measurement tool of mathematics success. Twenty-four open-ended mathematics questions from TIMSS-1999 were administered to 203 students in the spring semester of 2007. The internal consistency of the scores was found to be 0.92. For…

Descriptors: Generalizability Theory, Test Theory, Test Reliability, Interrater Reliability

Generalizability of Student Writing across Multiple Tasks: A Challenge for Authentic Assessment

Peer reviewed

Hathcoat, John D.; Penn, Jeremy D. – Research & Practice in Assessment, 2012

Critics of standardized testing have recommended replacing standardized tests with more authentic assessment measures, such as classroom assignments, projects, or portfolios rated by a panel of raters using common rubrics. Little research has examined the consistency of scores across multiple authentic assignments or the implications of this…

Descriptors: Generalizability Theory, Performance Based Assessment, Writing Across the Curriculum, Standardized Tests

Bringing Reading-to-Write and Writing-Only Assessment Tasks Together: A Generalizability Analysis

Peer reviewed

Gebril, Atta – Assessing Writing, 2010

Integrated tasks are currently employed in a number of L2 exams since they are perceived as an addition to the writing-only task type. Given this trend, the current study investigates composite score generalizability of both reading-to-write and writing-only tasks. For this purpose, a multivariate generalizability analysis is used to investigate…

Descriptors: Scoring, Scores, Second Language Instruction, Writing Evaluation
