Chalmers, R. Philip – Educational and Psychological Measurement, 2018
This article discusses the theoretical and practical contributions of Zumbo, Gadermann, and Zeisser's family of ordinal reliability statistics. Implications, interpretation, recommendations, and practical applications regarding their ordinal measures, particularly ordinal alpha, are discussed. General misconceptions relating to this family of…
Descriptors: Misconceptions, Test Theory, Test Reliability, Statistics
Zumbo, Bruno D.; Kroc, Edward – Educational and Psychological Measurement, 2019
Chalmers recently published a critique of the use of ordinal alpha proposed in Zumbo et al. as a measure of test reliability in certain research settings. In this response, we take up the task of refuting Chalmers' critique. We identify three broad misconceptions that characterize Chalmers' criticisms: (1) confusing assumptions with…
Descriptors: Test Reliability, Statistical Analysis, Misconceptions, Mathematical Models
Nicewander, W. Alan – Educational and Psychological Measurement, 2019
This inquiry is focused on three indicators of the precision of measurement, conditional on fixed values of θ, the latent variable of item response theory (IRT). The indicators that are compared are (1) the traditional, conditional standard errors, σ(eX|θ) = CSEM; (2) the IRT-based conditional standard errors, σ[subscript irt](eX|θ)=C[subscript…
Descriptors: Measurement, Accuracy, Scores, Error of Measurement
Raykov, Tenko; Goldammer, Philippe; Marcoulides, George A.; Li, Tatyana; Menold, Natalja – Educational and Psychological Measurement, 2018
A readily applicable procedure is discussed that allows evaluation of the discrepancy between the popular coefficient alpha and the reliability coefficient of a scale with second-order factorial structure that is frequently of relevance in empirical educational and psychological research. The approach is developed within the framework of the…
Descriptors: Test Reliability, Factor Structure, Statistical Analysis, Computation

Wilcox, Rand R. – Educational and Psychological Measurement, 1981
This paper describes and compares procedures for estimating the reliability of proficiency tests that are scored with latent structure models. Results suggest that the predictive estimate is the most accurate of the procedures. (Author/BW)
Descriptors: Criterion Referenced Tests, Scoring, Test Reliability

Cicchetti, Domenic V.; And Others – Educational and Psychological Measurement, 1984
This program computes multiple judge reliability levels under the following conditions: (1) different sets of judges perform the ratings; (2) the number of judges is a constant; and (3) the scale of measurement is nominal. (Author)
Descriptors: Computer Software, Interrater Reliability, Judgment Analysis Technique, Test Reliability

Morris, John D. – Educational and Psychological Measurement, 1979
A computer program which creates and stores a cumulative item covariance matrix upon each administration of an instrument is described. Use of this program would facilitate keeping a constant log on reliability in situations in which the test is administered to different groups over a period of time. (Author/JKS)
Descriptors: Analysis of Covariance, Computer Programs, Correlation, Item Analysis
Raju, Nambury S.; Oshima, T.C. – Educational and Psychological Measurement, 2005
Two new prophecy formulas for estimating item response theory (IRT)-based reliability of a shortened or lengthened test are proposed. Some of the relationships between the two formulas, one of which is identical to the well-known Spearman-Brown prophecy formula, are examined and illustrated. The major assumptions underlying these formulas are…
Descriptors: Item Response Theory, Test Reliability, Evaluation Methods, Computation

Whitney, Douglas R.; And Others – Educational and Psychological Measurement, 1986
This paper summarizes much of the available information concerning the reliability and validity of the Tests of General Educational Development (GED Tests). The data suggest that the results are sufficiently reliable for continued use and that the validity evidence generally supports the intended uses of the tests. (Author/LMO)
Descriptors: Correlation, Equivalency Tests, Error of Measurement, Predictive Validity

Tobin, Kenneth G.; Capie, William – Educational and Psychological Measurement, 1981
The Test of Logical Thinking was designed to measure five modes of formal reasoning: controlling variables, proportional reasoning, combinatorial reasoning, probabilistic reasoning, and correlational reasoning. Analysis of data from students in grades 6-16 indicated high test reliability and confirmed that the test measures one underlying dimension,…
Descriptors: Factor Structure, Higher Education, Logical Thinking, Secondary Education
French, Brian F.; Oakes, William – Educational and Psychological Measurement, 2004
The Institutional Integration Scale is claimed to measure five facets of college student academic and social integration. The scale was based on Tinto's model of college student withdrawal. Psychometric properties of the scale were examined based on a sample of 1st-year college students. These results led to item revisions and additions. The scale…
Descriptors: Measures (Individuals), Psychometrics, Social Integration, Test Validity

Luecht, Richard M. – Educational and Psychological Measurement, 1987
Test Pac, a test scoring and analysis computer program for moderate-sized sample designs using dichotomous response items, performs comprehensive item analyses and multiple reliability estimates. It also performs single-facet generalizability analysis of variance, single-parameter item response theory analyses, test score reporting, and computer…
Descriptors: Computer Assisted Testing, Computer Software, Computer Software Reviews, Item Analysis
Torff, Bruce; Sessions, David; Byrnes, Katherine – Educational and Psychological Measurement, 2005
This article reports three studies in which a scale for assessing teachers' beliefs about professional-development initiatives was developed and its scores evaluated for reliability and validity. Results indicated that the Teachers' Attitudes About Professional Development (TAP) scale produced scores with high reliability, a stable one-factor…
Descriptors: Measures (Individuals), Test Reliability, Test Validity, Self Efficacy

Lunneborg, Patricia W. – Educational and Psychological Measurement, 1979
The development and validation of the Vocational Interest Inventory, a forced-choice guidance instrument, are described. It assists high school students whose interests are not well differentiated in making post-high school educational and vocational decisions. (JKS)
Descriptors: Factor Structure, Forced Choice Technique, Interest Inventories, Secondary Education
Wang, Wen-Chung; Chen, Hsueh-Chu – Educational and Psychological Measurement, 2004
As item response theory (IRT) becomes popular in educational and psychological testing, there is a need to report IRT-based effect size measures. In this study, we show how the standardized mean difference can be generalized into such a measure. A disattenuation procedure based on the IRT test reliability is proposed to correct the attenuation…
Descriptors: Test Reliability, Rating Scales, Sample Size, Error of Measurement