ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	2
Since 2006 (last 20 years)	9

Descriptor

Evaluation Methods	11
Generalizability Theory	11
Scores	11
Reliability	6
Rating Scales	4
Error of Measurement	3
Evaluators	3
Interrater Reliability	3
Scoring	3
Behavior Problems	2
Elementary School Students	2
Language Tests	2
Measures (Individuals)	2
Performance Based Assessment	2
Second Language Learning	2
Student Behavior	2
Test Reliability	2
Writing Skills	2
Accuracy	1
Classroom Observation…	1
Cognitive Style	1
College Admission	1
College Applicants	1
Computer Software	1
Computer Software Evaluation	1

Source

School Psychology Review	2
Educational and Psychological…	1
Grantee Submission	1
International Journal of…	1
Journal of College Admission	1
Language Testing	1
Language Testing in Asia	1
Measurement and Evaluation in…	1
Multivariate Behavioral…	1
Research & Practice in…	1

Publication Type

Journal Articles	11
Reports - Research	10
Reports - Descriptive	1

Education Level

Elementary Education	2
Higher Education	2
Early Childhood Education	1
Postsecondary Education	1
Preschool Education	1

Location

Oklahoma

Assessments and Surveys

Behavior Assessment System…	1
Myers Briggs Type Indicator	1
Teacher Rating Scale	1

Showing all 11 results

Using Generalizability Theory to Examine the Dependability of Scores from the Learning Target Rating Scale

Peer reviewed

McLaughlin, Tara W.; Snyder, Patricia A.; Algina, James – Grantee Submission, 2017

The Learning Target Rating Scale (LTRS) is a measure designed to evaluate the quality of teacher-developed learning targets for embedded instruction for early learning. In the present study, we examined the measurement dependability of LTRS scores by conducting a generalizability study (G-study). We used a partially nested, three-facet model to…

Descriptors: Generalizability Theory, Scores, Rating Scales, Evaluation Methods

Working with Sparse Data in Rated Language Tests: Generalizability Theory Applications

Peer reviewed

Lin, Chih-Kai – Language Testing, 2017

Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…

Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy

Rater Reliability and Score Discrepancy under Holistic and Analytic Scoring of Second Language Writing

Peer reviewed

Zhang, Bo; Xiao, Yunnan; Luo, Juan – Language Testing in Asia, 2015

Previous studies comparing holistic scoring to analytic scoring of second language writing have given mixed results. Some of them suffer from methodological drawbacks, such as limited writing sample size, limited number of raters, and lack of direct comparison of the two methods. Based on 300 writing samples graded by 14 raters, this research…

Descriptors: Evaluators, Reliability, Scores, Holistic Approach

A Ranking Method for Evaluating Constructed Responses

Peer reviewed

Attali, Yigal – Educational and Psychological Measurement, 2014

This article presents a comparative judgment approach for holistically scored constructed response tasks. In this approach, the grader rank orders (rather than rates) the quality of a small set of responses. A prior automated evaluation of responses guides both set formation and scaling of rankings. Sets are formed to have similar prior scores and…

Descriptors: Responses, Item Response Theory, Scores, Rating Scales

Generalizability of Student Writing across Multiple Tasks: A Challenge for Authentic Assessment

Peer reviewed

Hathcoat, John D.; Penn, Jeremy D. – Research & Practice in Assessment, 2012

Critics of standardized testing have recommended replacing standardized tests with more authentic assessment measures, such as classroom assignments, projects, or portfolios rated by a panel of raters using common rubrics. Little research has examined the consistency of scores across multiple authentic assignments or the implications of this…

Descriptors: Generalizability Theory, Performance Based Assessment, Writing Across the Curriculum, Standardized Tests

Generalizability of Classroom Behavior Problem and On-Task Scores from the Direct Observation Form

Peer reviewed

Volpe, Robert J.; McConaughy, Stephanie H.; Hintze, John M. – School Psychology Review, 2009

The present study used generalizability theory to investigate the dependability of systematic observations of students' problem behavior and on-task behavior in classrooms. The Direct Observation Form (McConaughy & Achenbach, 2009) was used with a sample of 24 children, 6 to 11 years old, attending 18 different elementary schools. The participants…

Descriptors: Generalizability Theory, Behavior Problems, Student Behavior, Evaluation Methods

The Generalizability of Externalizing Behavior Composites and Subscale Scores across Time, Rater, and Instrument

Peer reviewed

Bergeron, Renee; Floyd, Randy G.; McCormack, Allison C.; Farmer, William L. – School Psychology Review, 2008

The dependability of externalizing behavior composites and subscale scores from the Behavior Assessment System for Children, Second Edition, Teacher Rating Scale-Child (Reynolds & Kamphaus, 2004) and the Achenbach System of Empirically Based Assessment, Teacher's Report Form for Ages 6-18 (Achenbach & Rescorla, 2001) was investigated.…

Descriptors: Generalizability Theory, Scores, Rating Scales, Error of Measurement

Evaluating Prototype Tasks and Alternative Rating Schemes for a New ESL Writing Test through G-Theory

Peer reviewed

Lee, Yong-Won; Kantor, Robert – International Journal of Testing, 2007

Possible integrated and independent tasks were pilot tested for the writing section of a new generation of the TOEFL® (Test of English as a Foreign Language™). This study examines the impact of various rating designs and of the number of tasks and raters on the reliability of writing scores based on integrated and independent tasks from the…

Descriptors: Generalizability Theory, Writing Tests, English (Second Language), Second Language Learning

Assessing the Reliability of Ratings Used in Undergraduate Admission Decisions

Peer reviewed

Kretchmar, Jennifer – Journal of College Admission, 2006

Many colleges and universities receive thousands of applications for freshman admission every year. To facilitate the process of evaluating each and every applicant in a relatively short amount of time, schools often devise quantitative ratings scales to summarize student characteristics. The ratings give readers a shorthand way to communicate the…

Descriptors: Generalizability Theory, Reliability, College Admission, College Applicants

Two Approaches to Examining the Stability of Myers-Briggs Type Indicator Scores

Peer reviewed

Salter, Daniel W.; Forney, Deanna S.; Evans, Nancy J. – Measurement and Evaluation in Counseling and Development, 2005

In this study, two approaches are used to assess the stability of Myers-Briggs Type Indicator scores across 3 administrations (N = 231): longitudinal configural frequency analysis with categorical scores and generalizability theory with the Preference Clarity Indices and continuous scores. The results are generally positive. Evaluation of…

Descriptors: Psychology, Cognitive Style, Generalizability Theory, Personality Traits

Interrater/Test Reliability System (ITRS).

Peer reviewed

Abedi, Jamal – Multivariate Behavioral Research, 1996

The Interrater/Test Reliability System (ITRS) is described. The ITRS is a comprehensive computer tool used to address questions of interrater reliability that computes several different indices of interrater reliability and the generalizability coefficient over raters and topics. The system is available in IBM compatible or Macintosh format. (SLD)

Descriptors: Computer Software, Computer Software Evaluation, Evaluation Methods, Evaluators

Author

Abedi, Jamal	1
Algina, James	1
Attali, Yigal	1
Bergeron, Renee	1
Evans, Nancy J.	1
Farmer, William L.	1
Floyd, Randy G.	1
Forney, Deanna S.	1
Hathcoat, John D.	1
Hintze, John M.	1
Kantor, Robert	1
Kretchmar, Jennifer	1
Lee, Yong-Won	1
Lin, Chih-Kai	1
Luo, Juan	1
McConaughy, Stephanie H.	1
McCormack, Allison C.	1
McLaughlin, Tara W.	1
Penn, Jeremy D.	1
Salter, Daniel W.	1
Snyder, Patricia A.	1
Volpe, Robert J.	1
Xiao, Yunnan	1
Zhang, Bo	1