ERIC - Search Results

Showing all 13 results

ETS Psychometric Contributions: Focus on Test Scores. Research Report. ETS RR-13-15. ETS R&D Scientific and Policy Contributions Series. ETS SPC-13-03

Peer reviewed

Moses, Tim – ETS Research Report Series, 2013

The purpose of this report is to review ETS psychometric contributions that focus on test scores. Two major sections review contributions based on assessing test scores' measurement characteristics and other contributions about using test scores as predictors in correlational and regression relationships. An additional section reviews additional…

Descriptors: Psychometrics, Scores, Correlation, Regression (Statistics)

Sources of Score Scale Inconsistency. Research Report. ETS RR-11-10

Haberman, Shelby J.; Dorans, Neil J. – Educational Testing Service, 2011

For testing programs that administer multiple forms within a year and across years, score equating is used to ensure that scores can be used interchangeably. In an ideal world, sample sizes are large and representative of populations that hardly change over time, and very reliable alternate test forms are built with nearly identical psychometric…

Descriptors: Scores, Reliability, Equated Scores, Test Construction

The Impact of Incorrect Responses to Reverse-Coded Survey Items

Peer reviewed

Hughes, Gail D. – Research in the Schools, 2009

The impacts of incorrect responses to reverse-coded survey items were examined in this simulation study by reversing responses to traditional Likert-format items from 700 administrators in randomly selected schools in a 7-county region in central Arkansas that were obtained from an archival dataset. Specifically, the number of reverse-coded items…

Descriptors: Surveys, Coding, Context Effect, Measures (Individuals)

Analytic Estimation of Standard Error and Confidence Interval for Scale Reliability.

Peer reviewed

Raykov, Tenko – Multivariate Behavioral Research, 2002

Proposes an analytic approach to standard error and confidence interval estimation of scale reliability with fixed congeneric measures. The method is based on a generally applicable estimator stability evaluation procedure, the delta method. The approach, which combines widespread point estimation of composite reliability in behavioral scale…

Descriptors: Error of Measurement, Estimation (Mathematics), Rating Scales, Reliability

E-Assessment within the Bologna Paradigm: Evidence from Portugal

Peer reviewed

Ferrao, Maria – Assessment & Evaluation in Higher Education, 2010

The Bologna Declaration brought reforms into higher education that imply changes in teaching methods, didactic materials and textbooks, infrastructures and laboratories, etc. Statistics and mathematics are disciplines that traditionally have the worst success rates, particularly in non-mathematics core curricula courses. This research project,…

Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Educational Assessment

Reliability Estimation When a Test Is Split into Two Parts of Unknown Effective Length.

Peer reviewed

Feldt, Leonard S. – Applied Measurement in Education, 2002

Considers the situation in which content or administrative considerations limit the way in which a test can be partitioned to estimate the internal consistency reliability of the total test score. Demonstrates that a single-valued estimate of the total score reliability is possible only if an assumption is made about the comparative size of the…

Descriptors: Error of Measurement, Reliability, Scores, Test Construction

Effect of Error of Measurement on the Power of Statistical Tests. Final Report.

Cleary, T.A.; Linn, Robert L. – 1967

The purpose of this research was to study the effect of error of measurement upon the power of statistical tests. Attention was focused on the F-test of the single factor analysis of variance. Formulas were derived to show the relationship between the noncentrality parameters for analyses using true scores and those using observed scores. The…

Descriptors: Analysis of Variance, Error of Measurement, Measurement Techniques, Psychological Testing

Generalizability Analysis for Performance Assessments of Student Achievement or School Effectiveness.

Peer reviewed

Cronbach, Lee J.; And Others – Educational and Psychological Measurement, 1997

Through the standard error, rather than a reliability coefficient, generalizability theory provides an indicator of the uncertainty attached to school and individual scores on performance assessments. Recommendations are made to apply generalizability theory to current performance assessments, emphasizing practices that differ from usual…

Descriptors: Academic Achievement, Error of Measurement, Generalizability Theory, Performance Based Assessment

The Effect of Sequential Dependence on the Sampling Distributions of KR-20, KR-21, and Split-Halves Reliabilities.

Sullins, Walter L. – 1971

Five hundred dichotomously scored response patterns were generated with sequentially independent (SI) items and 500 with dependent (SD) items for each of thirty-six combinations of sampling parameters (i.e., three test lengths, three sample sizes, and four item difficulty distributions). KR-20, KR-21, and Split-Half (S-H) reliabilities were…

Descriptors: Comparative Analysis, Correlation, Error of Measurement, Item Analysis

An Application of Generalizability Theory to the Validation of a Behaviorally Anchored Role-Play Measure.

Espelage, Dorothy L.; Quittner, Alexandra L.; Kamps, Jodi – 1998

Generalizability theory (g-theory) was used, as an alternative to classical test theory, to evaluate measurement error in a behaviorally anchored role-play measure, highlighting the usefulness of this theory in instrument development. G-theory partitions an observed score into the universe score and error scores associated with separate sources of…

Descriptors: Behavior Patterns, Eating Disorders, Error of Measurement, Females

Determining the Representation of Constructed Response Items in Mixed-Item Format Exams.

Sykes, Robert C.; Truskosky, Denise; White, Hillory – 2001

The purpose of this research was to study the effect of the three different ways of increasing the number of points contributed by constructed response (CR) items on the reliability of test scores from mixed-item-format tests. The assumption of unidimensionality that underlies the accuracy of item response theory model-based standard error…

Descriptors: Constructed Response, Elementary Education, Elementary School Students, Error of Measurement

The Rasch Model for Dichotomous Items: Theory, Applications and a Computer Program. No. 63.

Gustafsson, Jan-Eric – 1977

The Rasch model for test analysis is described and compared with two-parameter and three-parameter latent-trait models. Conditional maximum likelihood equations for estimating item parameters are derived, and estimates of person parameters are described together with their confidence intervals. Goodness of fit tests are discussed, including a…

Descriptors: Adaptive Testing, Computer Programs, Equated Scores, Error of Measurement

How To Sample in Surveys. The Survey Kit, Volume 6.

Fink, Arlene – 1995

The nine-volume Survey Kit is designed to help readers prepare and conduct surveys and become better users of survey results. All the books in the series contain instructional objectives, exercises and answers, examples of surveys in use, illustrations of survey questions, guidelines for action, checklists of "dos and don'ts," and…

Descriptors: Costs, Data Collection, Educational Research, Error of Measurement
