Showing all 15 results
Peer reviewed
PDF on ERIC Download full text
Simsek, Ahmet Salih – International Journal of Assessment Tools in Education, 2023
The Likert-type item is the most popular response format for collecting data in social, educational, and psychological studies through scales or questionnaires. However, there is no consensus on whether parametric or non-parametric tests should be preferred when analyzing Likert-type data. This study examined the statistical power of parametric and…
Descriptors: Error of Measurement, Likert Scales, Nonparametric Statistics, Statistical Analysis
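The parametric-versus-nonparametric question this abstract raises can be explored by simulation. The sketch below is not the study's own procedure; it is a minimal stdlib-Python illustration, with an assumed 5-point response distribution and a normal-approximation critical value, of how Type I error and power of a t-type test on Likert data can be estimated by Monte Carlo:

```python
import random
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    va, vb = variance(a), variance(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se

def simulate_power(shift, n=30, reps=2000, crit=1.96, seed=1):
    """Fraction of simulated studies where |t| exceeds the critical value.
    Likert responses come from an assumed 5-point distribution; 'shift' is
    the probability that a group-b response is bumped one category up."""
    rng = random.Random(seed)
    base = [1, 2, 3, 3, 4, 5]  # assumed, not from the study
    hits = 0
    for _ in range(reps):
        a = [rng.choice(base) for _ in range(n)]
        b = [min(5, rng.choice(base) + (1 if rng.random() < shift else 0))
             for _ in range(n)]
        if abs(welch_t(a, b)) > crit:
            hits += 1
    return hits / reps

alpha = simulate_power(0.0)  # no true difference: empirical Type I error
power = simulate_power(0.5)  # true difference present: empirical power
```

The same loop, run with a rank-based statistic in place of `welch_t`, would give the nonparametric side of the comparison the abstract describes.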
Peer reviewed
PDF on ERIC Download full text
Soysal, Sumeyra; Karaman, Haydar; Dogan, Nuri – Eurasian Journal of Educational Research, 2018
Purpose of the Study: Missing data are a common problem encountered while implementing measurement instruments. Yet the extent to which reliability, validity, average discrimination and difficulty of the test results are affected by the missing data has not been studied much. Since it is inevitable that missing data have an impact on the…
Descriptors: Sample Size, Data Analysis, Research Problems, Error of Measurement
Peer reviewed
Direct link
Menéndez-Varela, José-Luis; Gregori-Giralt, Eva – Assessment & Evaluation in Higher Education, 2018
Rubrics are widely used in higher education to assess performance in project-based learning environments. To date, the sources of error that may affect their reliability have not been studied in depth. Using generalisability theory as its starting-point, this article analyses the influence of the assessors and the criteria of the rubrics on the…
Descriptors: Scoring Rubrics, Student Projects, Active Learning, Reliability
Peer reviewed
PDF on ERIC Download full text
Li, Feifei – ETS Research Report Series, 2017
An information-correction method for testlet-based tests is introduced. This method takes advantage of both generalizability theory (GT) and item response theory (IRT). The measurement error for the examinee proficiency parameter is often underestimated when a unidimensional conditional-independence IRT model is specified for a testlet dataset. By…
Descriptors: Item Response Theory, Generalizability Theory, Tests, Error of Measurement
Peer reviewed
Direct link
Keuning, Jos; Hemker, Bas – Educational Research and Evaluation, 2014
The data collection of a cohort study requires making many decisions. Each decision may introduce error in the statistical analyses conducted later on. In the present study, a procedure was developed for estimation of the error made due to the composition of the sample, the item selection procedure, and the test equating process. The math results…
Descriptors: Foreign Countries, Cohort Analysis, Statistical Analysis, Error of Measurement
Peer reviewed
Direct link
Fan, Xitao; Sun, Shaojing – Journal of Early Adolescence, 2014
In adolescence research, the treatment of measurement reliability is often fragmented, and it is not always clear how different reliability coefficients are related. We show that generalizability theory (G-theory) is a comprehensive framework of measurement reliability, encompassing all other reliability methods (e.g., Pearson "r,"…
Descriptors: Generalizability Theory, Measurement, Reliability, Correlation
Peer reviewed
Direct link
Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013
Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…
Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores
Alkahtani, Saif F. – ProQuest LLC, 2012
The principal aim of the present study was to better guide the Quranic recitation appraisal practice by presenting an application of Generalizability theory and Many-facet Rasch Measurement Model for assessing the dependability and fit of two suggested rubrics. Recitations of 93 students were rated holistically and analytically by 3 independent…
Descriptors: Generalizability Theory, Item Response Theory, Verbal Tests, Islam
Peer reviewed
Direct link
Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013
Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…
Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement
Peer reviewed
Direct link
Brennan, Robert L. – Educational and Psychological Measurement, 2007
This article provides general procedures for obtaining unbiased estimates of variance components for any random-model balanced design under any bootstrap sampling plan, with the focus on designs of the type typically used in generalizability theory. The results reported here are particularly helpful when the bootstrap is used to estimate standard…
Descriptors: Generalizability Theory, Error of Measurement, Statistical Analysis
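The setting Brennan describes — bootstrapping variance components in a random-model design — can be sketched concretely. The stdlib-Python example below is a minimal illustration, not the article's procedure: it estimates the person and residual variance components of a crossed persons × items (p × i) design by the usual ANOVA expected-mean-square equations, then bootstraps persons to attach a standard error to the person component. All variance values are simulated assumptions.

```python
import random
from statistics import mean

def variance_components(scores):
    """ANOVA estimates for a crossed p x i random design:
    sigma2_p from the person mean square, sigma2_res from the residual."""
    n_p, n_i = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    person_means = [mean(row) for row in scores]
    item_means = [mean(col) for col in zip(*scores)]
    ms_p = n_i * sum((m - grand) ** 2 for m in person_means) / (n_p - 1)
    ss_res = sum((scores[p][i] - person_means[p] - item_means[i] + grand) ** 2
                 for p in range(n_p) for i in range(n_i))
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))
    return (ms_p - ms_res) / n_i, ms_res  # sigma2_p, sigma2_residual

def bootstrap_se_person_var(scores, reps=500, seed=7):
    """Resample persons (rows) to get a standard error for sigma2_p."""
    rng = random.Random(seed)
    n_p = len(scores)
    ests = [variance_components(
                [scores[rng.randrange(n_p)] for _ in range(n_p)])[0]
            for _ in range(reps)]
    m = mean(ests)
    return (sum((e - m) ** 2 for e in ests) / (reps - 1)) ** 0.5

# simulated data: person effects sd 1.0, residual sd 0.5 (assumed values)
rng = random.Random(3)
persons = [rng.gauss(0, 1.0) for _ in range(50)]
data = [[p + rng.gauss(0, 0.5) for _ in range(8)] for p in persons]
sigma2_p, sigma2_res = variance_components(data)
se = bootstrap_se_person_var(data)
```

Brennan's point is that a naive bootstrap plan of this kind can bias variance-component estimates; his article derives corrections for each sampling plan, which this sketch does not implement.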
Rosenthal, James A. – Springer, 2011
Written by a social worker for social work students, this is a nuts and bolts guide to statistics that presents complex calculations and concepts in clear, easy-to-understand language. It includes numerous examples, data sets, and issues that students will encounter in social work practice. The first section introduces basic concepts and terms to…
Descriptors: Statistics, Data Interpretation, Social Work, Social Science Research
Peer reviewed
PDF on ERIC Download full text
Moses, Tim; Kim, Sooyeon – ETS Research Report Series, 2007
This study evaluated the impact of unequal reliability on test equating methods in the nonequivalent groups with anchor test (NEAT) design. Classical true score-based models were compared in terms of their assumptions about how reliability impacts test scores. These models were related to treatment of population ability differences by different…
Descriptors: Reliability, Equated Scores, Test Items, Statistical Analysis
Peer reviewed
Direct link
Yin, Ping – Educational and Psychological Measurement, 2005
The main purpose of this study is to examine the content structure of the Multistate Bar Examination (MBE) using the "table of specifications" model from the perspective of multivariate generalizability theory. Specifically, using MBE data collected over different years (six administrations: three from the February test and three from the July test),…
Descriptors: Correlation, Generalizability Theory, Statistical Analysis, Multivariate Analysis
Cope, Ronald T. – 1987
This study used generalizability theory and other statistical concepts to assess the application of the Angoff method to setting cutoff scores on two professional certification tests. A panel of ten judges gave pre- and post-feedback Angoff probability ratings of items of two forms of a professional certification test, and another panel of nine…
Descriptors: Certification, Correlation, Cutting Scores, Error of Measurement
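The Angoff procedure this abstract applies has a simple arithmetic core: each judge rates the probability that a minimally competent candidate answers each item correctly, and the cutoff is the sum of the item-level mean ratings. The sketch below uses hypothetical ratings, not Cope's data:

```python
from statistics import mean

def angoff_cutoff(ratings):
    """Angoff cutoff score: sum over items of the judges' mean
    probability ratings, i.e. the expected raw score of a minimally
    competent (borderline) candidate."""
    n_items = len(ratings[0])
    item_means = [mean(judge[i] for judge in ratings)
                  for i in range(n_items)]
    return sum(item_means)

# rows = judges, columns = items (hypothetical probability ratings)
ratings = [
    [0.6, 0.8, 0.5, 0.9],
    [0.7, 0.7, 0.4, 0.8],
    [0.5, 0.9, 0.6, 0.9],
]
cutoff = angoff_cutoff(ratings)  # expected borderline score on 4 items
```

A generalizability analysis like Cope's then treats judges (and rating occasions, pre- versus post-feedback) as facets contributing error variance to this cutoff.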
Peer reviewed
Direct link
Fawson, Parker C.; Ludlow, Brian C.; Reutzel, D. Ray; Sudweeks, Richard; Smith, John A. – Journal of Educational Research, 2006
The authors present results of a generalizability study of running record assessment. They conducted 2 decision studies to ascertain the number of raters and passages necessary to obtain a reliable estimate of a student's reading ability on the basis of a running record assessment. Ten teachers completed running record assessments of 10…
Descriptors: Reading Ability, Generalizability Theory, Reading Instruction, Error of Measurement
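The decision (D) studies this entry mentions project how reliability changes as raters or passages are added: with relative error variance averaged over n' conditions, the generalizability coefficient is Eρ² = σ²_p / (σ²_p + σ²_δ / n'). A minimal sketch with assumed variance components (not values from the study):

```python
def g_coefficient(var_person, var_rel_error, n_conditions):
    """Generalizability coefficient E(rho^2) when scores are averaged
    over n_conditions raters/passages: relative error shrinks by 1/n'."""
    return var_person / (var_person + var_rel_error / n_conditions)

# hypothetical variance components from a G study
var_p, var_err = 0.60, 0.90
projections = {n: round(g_coefficient(var_p, var_err, n), 3)
               for n in (1, 2, 4, 8)}
# projections maps number of raters/passages to projected reliability
```

Tabulating `projections` is exactly the kind of evidence a D study uses to decide how many raters and passages a running-record assessment needs to reach a target reliability.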