ERIC - Search Results

Showing 1 to 15 of 64 results

Improving Peer Assessment Accuracy by Incorporating Relative Peer Grades

Peer reviewed

Wang, Tianqi; Jing, Xia; Li, Qi; Gao, Jing; Tang, Jie – International Educational Data Mining Society, 2019

Massive Open Online Courses (MOOCs) have become more and more popular recently. These courses have attracted a large number of students world-wide. In a popular course, there may be thousands of students. Such a large number of students in one course makes it infeasible for the instructors to grade all the submissions. Peer assessment is thus an…

Descriptors: Peer Evaluation, Accuracy, Grades (Scholastic), Grading

The Importance of the Assumption of Uncorrelated Errors in Psychometric Theory

Peer reviewed

Raykov, Tenko; Marcoulides, George A.; Patelis, Thanos – Educational and Psychological Measurement, 2015

A critical discussion of the assumption of uncorrelated errors in classical psychometric theory and its applications is provided. It is pointed out that this assumption is essential for a number of fundamental results and underlies the concept of parallel tests, the Spearman-Brown prophecy and correction for attenuation formulas as well as…

Descriptors: Psychometrics, Correlation, Validity, Reliability
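
For readers unfamiliar with the two formulas named in the abstract above, the following is a minimal sketch of their standard textbook forms. The function names and example values are illustrative assumptions and do not come from the article; both formulas rely on the uncorrelated-errors assumption the article discusses.

```python
import math


def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability of a test lengthened by `length_factor`.

    Textbook Spearman-Brown prophecy formula; assumes the added parts
    are parallel to the original test.
    """
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


def correct_for_attenuation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Estimated correlation between true scores on X and Y.

    Textbook correction for attenuation: the observed correlation divided
    by the square root of the product of the two reliabilities.
    """
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical values: doubling a test whose reliability is 0.5.
doubled = spearman_brown(0.5, 2.0)
# Hypothetical values: observed r = 0.30 with reliabilities 0.81 and 0.64.
disattenuated = correct_for_attenuation(0.30, 0.81, 0.64)
```

If errors are in fact correlated, neither result can be taken at face value, which is the concern the article raises.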

The Accuracy and Consistency of a Series of IRT True Score Equatings

Peer reviewed

Li, Deping; Jiang, Yanlin; von Davier, Alina A. – Journal of Educational Measurement, 2012

This study investigates a sequence of item response theory (IRT) true score equatings based on various scale transformation approaches and evaluates equating accuracy and consistency over time. The results show that the biases and sample variances for the IRT true score equating (both direct and indirect) are quite small (except for the mean/sigma…

Descriptors: True Scores, Equated Scores, Item Response Theory, Accuracy

Relationships of Measurement Error and Prediction Error in Observed-Score Regression

Peer reviewed

Moses, Tim – Journal of Educational Measurement, 2012

The focus of this paper is assessing the impact of measurement errors on the prediction error of an observed-score regression. Measures are presented and described for decomposing the linear regression's prediction error variance into parts attributable to the true score variance and the error variances of the dependent variable and the predictor…

Descriptors: Error of Measurement, Prediction, Regression (Statistics), True Scores

How Does the Knowledge of Subgroup Membership of Examinees Affect the Prediction of True Subscores? Research Report. ETS RR-11-43

Haberman, Shelby J.; Sinharay, Sandip – Educational Testing Service, 2011

Subscores are reported for several operational assessments. Haberman (2008) suggested a method based on classical test theory to determine if the true subscore is predicted better by the corresponding subscore or the total score. Researchers are often interested in learning how different subgroups perform on subtests. Stricker (1993) and…

Descriptors: True Scores, Test Theory, Prediction, Group Membership

Reliability Generalization: An Examination of the Positive Affect and Negative Affect Schedule

Peer reviewed

Leue, Anja; Lange, Sebastian – Assessment, 2011

The assessment of positive affect (PA) and negative affect (NA) by means of the Positive Affect and Negative Affect Schedule has gained remarkable popularity in the social sciences. Using a meta-analytic tool--namely, reliability generalization (RG)--population reliability scores of both scales have been investigated on the basis of a random…

Descriptors: Social Sciences, True Scores, Generalization, Affective Behavior

Coping with Memory Effect and Serial Correlation when Estimating Reliability in a Longitudinal Framework

Peer reviewed

Laenen, Annouschka; Alonso, Ariel; Molenberghs, Geert; Vangeneugden, Tony; Mallinckrodt, Craig H. – Applied Psychological Measurement, 2010

Longitudinal studies are permeating clinical trials in psychiatry. Therefore, it is of utmost importance to study the psychometric properties of rating scales, frequently used in these trials, within a longitudinal framework. However, intrasubject serial correlation and memory effects are problematic issues often encountered in longitudinal data.…

Descriptors: Psychiatry, Rating Scales, Memory, Psychometrics

A Response to an Article Published in "Educational Research"'s Special Issue on Assessment (June 2009). What Can Be Inferred about Classification Accuracy from Classification Consistency?

Peer reviewed

Bramley, Tom – Educational Research, 2010

Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…

Descriptors: National Curriculum, Educational Research, Testing, Measurement

Reliability and Attribute-Based Scoring in Cognitive Diagnostic Assessment

Peer reviewed

Gierl, Mark J.; Cui, Ying; Zhou, Jiawen – Journal of Educational Measurement, 2009

The attribute hierarchy method (AHM) is a psychometric procedure for classifying examinees' test item responses into a set of structured attribute patterns associated with different components from a cognitive model of task performance. Results from an AHM analysis yield information on examinees' cognitive strengths and weaknesses. Hence, the AHM…

Descriptors: Test Items, True Scores, Psychometrics, Algebra

A Modification to Angoff and Bookmarking Cut Scores to Account for the Imperfect Reliability of Test Scores

Peer reviewed

MacCann, Robert G. – Educational and Psychological Measurement, 2008

It is shown that the Angoff and bookmarking cut scores are examples of true score equating that in the real world must be applied to observed scores. In the context of defining minimal competency, the percentage "failed" by such methods is a function of the length of the measuring instrument. It is argued that this length is largely…

Descriptors: True Scores, Cutting Scores, Minimum Competencies, Scores

A Measure for the Reliability of a Rating Scale Based on Longitudinal Clinical Trial Data

Peer reviewed

Laenen, Annouschka; Alonso, Ariel; Molenberghs, Geert – Psychometrika, 2007

A new measure for reliability of a rating scale is introduced, based on the classical definition of reliability, as the ratio of the true score variance and the total variance. Clinical trial data can be employed to estimate the reliability of the scale in use, whenever repeated measurements are taken. The reliability is estimated from the…

Descriptors: Schizophrenia, Rating Scales, Likert Scales, True Scores
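
The classical definition cited in the abstract above, reliability as the ratio of true score variance to total variance, can be illustrated with simulated data in which the true scores are known. This is only a sketch of the definition, not the estimator proposed in the article; the function names, sample size, and standard deviations are assumptions chosen for illustration.

```python
import random
import statistics


def reliability_ratio(true_scores, observed_scores):
    """Classical reliability: var(true scores) / var(observed scores)."""
    return statistics.pvariance(true_scores) / statistics.pvariance(observed_scores)


def simulate_reliability(n=20000, true_sd=10.0, error_sd=5.0, seed=0):
    """Simulate observed = true + independent error and return the ratio.

    With these defaults the population value is
    true_sd**2 / (true_sd**2 + error_sd**2) = 100 / 125 = 0.8.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n)]
    observed = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return reliability_ratio(true_scores, observed)
```

Calling `simulate_reliability()` should return a value near the population figure of 0.8, with some sampling noise. In real data the true scores are unobserved, which is why estimators such as the one in the article are needed.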

Undesired Variance Due to Examiner Stringency/Leniency Effect in Communication Skill Scores Assessed in OSCEs

Peer reviewed

Harasym, Peter H.; Woloschuk, Wayne; Cunning, Leslie – Advances in Health Sciences Education, 2008

Physician-patient communication is a clinical skill that can be learned and has a positive impact on patient satisfaction and health outcomes. A concerted effort at all medical schools is now directed at teaching and evaluating this core skill. Student communication skills are often assessed by an Objective Structured Clinical Examination (OSCE).…

Descriptors: Medical Schools, Family Practice (Medicine), Examiners, Error of Measurement

Measuring Marbles: Demonstrating the Basic Tenets of Measurement Theory

Peer reviewed

Wininger, Steven R. – Teaching Statistics: An International Journal for Teachers, 2007

A hands-on activity is described in which students attempt to measure something that they cannot see. In small groups, students estimate the number of marbles in sealed boxes. Next, students' estimates are compared with the actual numbers. Last, values from both the students' estimates and actual numbers are used to explain measurement theory and…

Descriptors: Computation, Measurement, Experiential Learning, Theories

Subscores and Validity. Research Report. ETS RR-08-64

Peer reviewed

Haberman, Shelby J. – ETS Research Report Series, 2008

In educational testing, subscores may be provided based on a portion of the items from a larger test. One consideration in evaluation of such subscores is their ability to predict a criterion score. Two limitations on prediction exist. The first, which is well known, is that the coefficient of determination for linear prediction of the criterion…

Descriptors: Scores, Validity, Educational Testing, Correlation

Peer reviewed

Green, Samuel B.; Hershberger, Scott L. – Structural Equation Modeling, 2000

Proposes true score models that can account for correlated errors and their effect on coefficient alpha. These models allow random measurement errors on earlier items to affect directly or indirectly the scores on later items. Conditions under which coefficient alpha may yield spuriously high estimates of reliability are discussed. (SLD)

Descriptors: Correlation, Error of Measurement, Reliability, True Scores
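
As background for the abstract above, coefficient alpha in its usual textbook form can be computed as sketched below. The function name and the tiny data sets are hypothetical; the sketch shows only the standard formula, not the correlated-error models the article proposes.

```python
import statistics


def coefficient_alpha(item_scores):
    """Coefficient alpha for a respondents-by-items score table.

    `item_scores` is a list of rows, one per respondent, each holding
    the same number k of item scores. Uses
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score)).
    """
    k = len(item_scores[0])
    item_vars = [statistics.variance(col) for col in zip(*item_scores)]
    total_var = statistics.variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Two hypothetical items that rank respondents identically.
consistent = coefficient_alpha([[1, 1], [2, 2], [3, 3]])
# Two hypothetical items that partly disagree.
mixed = coefficient_alpha([[1, 2], [2, 1], [3, 3]])
```

Because the formula depends only on observed variances, correlated errors between items inflate the total-score variance and can push alpha upward, which is the situation the article examines.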
