Liu, Xiaowen; Jane Rogers, H. – Educational and Psychological Measurement, 2022
Test fairness is critical to the validity of group comparisons involving gender, ethnicities, culture, or treatment conditions. Detection of differential item functioning (DIF) is one component of efforts to ensure test fairness. The current study compared four treatments for items that have been identified as showing DIF: deleting, ignoring,…
Descriptors: Item Analysis, Comparative Analysis, Culture Fair Tests, Test Validity
Zijlmans, Eva A. O.; Tijmstra, Jesper; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2018
Reliability is usually estimated for a total score, but it can also be estimated for item scores. Item-score reliability can be useful to assess the repeatability of an individual item score in a group. Three methods to estimate item-score reliability are discussed, known as method MS, method λ₆, and method CA. The item-score…
Descriptors: Test Items, Test Reliability, Correlation, Comparative Analysis
Fu, Yuanshu; Wen, Zhonglin; Wang, Yang – Educational and Psychological Measurement, 2018
The maximal reliability of a congeneric measure is achieved by weighting item scores to form the optimal linear combination as the total score; it is never lower than the composite reliability of the measure when measurement errors are uncorrelated. The statistical method that renders maximal reliability would also lead to maximal criterion…
Descriptors: Test Reliability, Test Validity, Comparative Analysis, Attitude Measures
Hamby, Tyler; Taylor, Wyn – Educational and Psychological Measurement, 2016
This study examined the predictors and psychometric outcomes of survey satisficing, wherein respondents provide quick, "good enough" answers (satisficing) rather than carefully considered answers (optimizing). We administered surveys to university students and respondents--half of whom held college degrees--from a for-pay survey website,…
Descriptors: Surveys, Test Reliability, Test Validity, Comparative Analysis
Balogh, Jennifer; Bernstein, Jared; Cheng, Jian; Van Moere, Alistair; Townshend, Brent; Suzuki, Masanori – Educational and Psychological Measurement, 2012
A two-part experiment is presented that validates a new measurement tool for scoring oral reading ability. Data collected by the U.S. government in a large-scale literacy assessment of adults were analyzed by a system called VersaReader that uses automatic speech recognition and speech processing technologies to score oral reading fluency. In the…
Descriptors: Reading Fluency, Measures (Individuals), Scoring, Reading Ability
Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010
This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…
Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores

Callender, John C.; Osburn, H. G. – Educational and Psychological Measurement, 1977
An efficient algorithm for maximizing split-half reliability coefficients is described. Coefficients derived by the algorithm were found to be generally larger than odd-even split-half coefficients or other internal consistency measures and nearly as large as the largest split-half coefficients. MSPLIT, Odd-Even, and Kuder-Richardson-20…
Descriptors: Comparative Analysis, Test Interpretation, Test Reliability
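
For orientation, the two baseline coefficients named in this abstract can be sketched as follows. This is a minimal illustration, not the MSPLIT algorithm itself, which the abstract does not specify. The function names and the small 0/1 score matrix are hypothetical, and population variances are assumed for KR-20.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def odd_even_split_half(scores):
    """Odd-even split-half reliability, stepped up with Spearman-Brown."""
    odd = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    r = pearson(odd, even)
    return 2 * r / (1 + r)


def kr20(scores):
    """Kuder-Richardson formula 20 for dichotomous (0/1) items."""
    k = len(scores[0])
    total_var = pvariance([sum(row) for row in scores])
    sum_pq = 0.0
    for j in range(k):
        p = mean(row[j] for row in scores)  # proportion answering item j correctly
        sum_pq += p * (1 - p)
    return k / (k - 1) * (1 - sum_pq / total_var)


# Hypothetical data: 4 examinees (rows) by 4 dichotomous items (columns).
data = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
]
split_half = odd_even_split_half(data)
kr20_value = kr20(data)
```

A maximizing procedure would search over alternative item splits rather than fixing the odd-even one; that search is not shown here.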

Hofmann, Richard J. – Educational and Psychological Measurement, 1978
The Goodenough technique for determining scale error is compared to the Guttman technique and demonstrated to be more conservative than the Guttman technique. Implications with regard to Guttman's rule of thumb for evaluating reproducibility are noted. (Author)
Descriptors: Comparative Analysis, Rating Scales, Statistical Analysis, Test Reliability

Green, Samuel B. – Educational and Psychological Measurement, 1981
The proportion of agreement, G, and kappa indexes are shown to differ in how they correct for chance agreements between two observers. On the basis of the findings, it is suggested that no single agreement index is appropriate for all sets of data. (Author/BW)
Descriptors: Comparative Analysis, Measurement Techniques, Test Reliability, Testing Problems
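
The different chance corrections can be illustrated with a short sketch. It assumes two observers, two rating categories, the G index in its common two-category form (G = 2 * Po - 1), and Cohen's kappa with chance agreement taken from the observers' marginals. The function names and ratings are hypothetical and are not taken from the article.

```python
from collections import Counter


def proportion_agreement(a, b):
    """Share of cases on which the two observers give the same rating."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def g_index(a, b):
    """G index for two categories: assumes chance agreement of 0.5."""
    return 2 * proportion_agreement(a, b) - 1


def cohen_kappa(a, b):
    """Kappa: chance agreement estimated from each observer's marginals."""
    n = len(a)
    po = proportion_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    pe = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / n ** 2
    return (po - pe) / (1 - pe)


# Hypothetical ratings with skewed marginals (8 of 10 cases rated 1 by each observer).
obs_a = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
obs_b = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]

po = proportion_agreement(obs_a, obs_b)
g = g_index(obs_a, obs_b)
kappa = cohen_kappa(obs_a, obs_b)
```

With these skewed marginals the three indexes diverge (about 0.8, 0.6, and 0.375), which is the kind of difference the abstract refers to.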

Martois, John S. – Educational and Psychological Measurement, 1973
Copies of this program may be obtained from the author at the University of Southern California, School of Pharmacy, University Park, Los Angeles 90007. (CB)
Descriptors: Comparative Analysis, Computer Programs, Input Output, Statistical Analysis

Wagner, Edwin E.; And Others – Educational and Psychological Measurement, 1990
Maximized correlation as an internal reliability estimate for tests with few items was investigated. An actual sampling distribution of maximum correlation--"r" max--was empirically derived from 100 samples of 50 cases each from Rorschach test data and compared with those of alpha and an odd/even split, using 2,020 Rorschach protocols.…
Descriptors: Comparative Analysis, Correlation, Estimation (Mathematics), Sample Size

Raju, Nambury S. – Educational and Psychological Measurement, 1977
A rederivation of Lord's formula for estimating variance in multiple matrix sampling is presented as well as the ways Cronbach's coefficient alpha and the Spearman-Brown prophecy formula are related in this context. (Author/JKS)
Descriptors: Analysis of Variance, Comparative Analysis, Item Sampling, Mathematical Models
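
As general background, and not the article's matrix-sampling derivation, a well-known link between the two formulas can be checked numerically: for a test split into two halves with equal variances, coefficient alpha computed on the halves equals the half-test correlation stepped up by Spearman-Brown. The function names and half scores below are hypothetical, and population variances are assumed.

```python
from statistics import mean, pvariance


def cronbach_alpha(rows):
    """Coefficient alpha for a persons-by-parts score matrix."""
    k = len(rows[0])
    part_vars = sum(pvariance([row[j] for row in rows]) for j in range(k))
    total_var = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - part_vars / total_var)


def spearman_brown(r, factor):
    """Predicted reliability when test length is multiplied by `factor`."""
    return factor * r / (1 + (factor - 1) * r)


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


# Hypothetical half-test scores for 4 examinees; both halves have equal variance.
halves = [(2, 2), (1, 1), (2, 0), (0, 0)]
first = [h[0] for h in halves]
second = [h[1] for h in halves]

alpha = cronbach_alpha(halves)
stepped_up = spearman_brown(pearson(first, second), 2)
```

When the two halves have unequal variances, alpha on the halves falls below the Spearman-Brown value, so the equality shown here depends on that assumption.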

Goolsby, Thomas M., Jr. – Educational and Psychological Measurement, 1971
Descriptors: Achievement Tests, Comparative Analysis, Standardized Tests, Test Reliability

Wesner, Chester E. – Educational and Psychological Measurement, 1973
Results indicate that because there is not an equivalent relationship between the WISC and WAIS, classification or retardation level and prognostic formulation using these tests should be made cautiously. (Author/CB)
Descriptors: Adolescents, Comparative Analysis, Intelligence Quotient, Intelligence Tests

Nickel, Ted – Educational and Psychological Measurement, 1971
Directions are provided for the construction of a reduced size Rod and Frame Test. Simpler and less expensive, the proposed apparatus has criterion validity parallel to that of the full-sized. (GS)
Descriptors: Comparative Analysis, Psychological Studies, Sex Differences, Statistical Analysis