ERIC - Search Results

Showing 1 to 15 of 31 results

Treatments of Differential Item Functioning: A Comparison of Four Methods

Peer reviewed

Liu, Xiaowen; Jane Rogers, H. – Educational and Psychological Measurement, 2022

Test fairness is critical to the validity of group comparisons involving gender, ethnicities, culture, or treatment conditions. Detection of differential item functioning (DIF) is one component of efforts to ensure test fairness. The current study compared four treatments for items that have been identified as showing DIF: deleting, ignoring,…

Descriptors: Item Analysis, Comparative Analysis, Culture Fair Tests, Test Validity

Item-Score Reliability in Empirical-Data Sets and Its Relationship with Other Item Indices

Peer reviewed

Zijlmans, Eva A. O.; Tijmstra, Jesper; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2018

Reliability is usually estimated for a total score, but it can also be estimated for item scores. Item-score reliability can be useful to assess the repeatability of an individual item score in a group. Three methods to estimate item-score reliability are discussed, known as method MS, method λ₆, and method CA. The item-score…

Descriptors: Test Items, Test Reliability, Correlation, Comparative Analysis

The Total Score with Maximal Reliability and Maximal Criterion Validity: An Illustration Using a Career Satisfaction Measure

Peer reviewed

Fu, Yuanshu; Wen, Zhonglin; Wang, Yang – Educational and Psychological Measurement, 2018

The maximal reliability of a congeneric measure is achieved by weighting item scores to form the optimal linear combination as the total score; it is never lower than the composite reliability of the measure when measurement errors are uncorrelated. The statistical method that renders maximal reliability would also lead to maximal criterion…

Descriptors: Test Reliability, Test Validity, Comparative Analysis, Attitude Measures

Survey Satisficing Inflates Reliability and Validity Measures: An Experimental Comparison of College and Amazon Mechanical Turk Samples

Peer reviewed

Hamby, Tyler; Taylor, Wyn – Educational and Psychological Measurement, 2016

This study examined the predictors and psychometric outcomes of survey satisficing, wherein respondents provide quick, "good enough" answers (satisficing) rather than carefully considered answers (optimizing). We administered surveys to university students and respondents--half of whom held college degrees--from a for-pay survey website,…

Descriptors: Surveys, Test Reliability, Test Validity, Comparative Analysis

Validation of Automated Scoring of Oral Reading

Peer reviewed

Balogh, Jennifer; Bernstein, Jared; Cheng, Jian; Van Moere, Alistair; Townshend, Brent; Suzuki, Masanori – Educational and Psychological Measurement, 2012

A two-part experiment is presented that validates a new measurement tool for scoring oral reading ability. Data collected by the U.S. government in a large-scale literacy assessment of adults were analyzed by a system called VersaReader that uses automatic speech recognition and speech processing technologies to score oral reading fluency. In the…

Descriptors: Reading Fluency, Measures (Individuals), Scoring, Reading Ability

A Comparison of Approaches for Improving the Reliability of Objective Level Scores

Peer reviewed

Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010

This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…

Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores

A Method for Maximizing Split-Half Reliability Coefficients

Peer reviewed

Callender, John C.; Osburn, H. G. – Educational and Psychological Measurement, 1977

An efficient algorithm for maximizing split-half reliability coefficients is described. Coefficients derived by the algorithm were found to be generally larger than odd-even split-half coefficients or other internal consistency measures and nearly as large as the largest split-half coefficients. MSPLIT, Odd-Even, and Kuder-Richardson-20…

Descriptors: Comparative Analysis, Test Interpretation, Test Reliability
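
For orientation, the two baseline coefficients named in this abstract can be sketched as follows. This is a minimal illustration, assuming dichotomous (0/1) item scores stored as one list per examinee; it computes the odd-even split-half coefficient with the Spearman-Brown correction and Kuder-Richardson-20. It is not the maximizing algorithm the article describes, and the function names and sample data are hypothetical.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def odd_even_split_half(scores):
    """Odd-even split-half reliability, stepped up with Spearman-Brown."""
    odd_half = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    r = pearson(odd_half, even_half)
    return 2 * r / (1 + r)


def kr20(scores):
    """Kuder-Richardson formula 20 for dichotomous (0/1) items."""
    n_people = len(scores)
    n_items = len(scores[0])
    totals = [sum(row) for row in scores]
    total_variance = pvariance(totals)  # population variance of total scores
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in scores) / n_people  # proportion correct
        sum_pq += p * (1 - p)
    return n_items / (n_items - 1) * (1 - sum_pq / total_variance)


# Hypothetical data: 5 examinees by 4 items, made up for illustration.
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(odd_even_split_half(sample), 2), round(kr20(sample), 2))
```

On this made-up matrix the odd-even coefficient comes out at 0.88 and KR-20 at 0.80, which shows how the choice of coefficient alone changes the reported reliability.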

A Modified Rule of Thumb for Evaluating Scale Reproducibilities Determined by Electronic Computers

Peer reviewed

Hofmann, Richard J. – Educational and Psychological Measurement, 1978

The Goodenough technique for determining scale error is compared to the Guttman technique and demonstrated to be more conservative than the Guttman technique. Implications with regard to Guttman's evaluative rule of thumb for evaluating a reproducibility are noted. (Author)

Descriptors: Comparative Analysis, Rating Scales, Statistical Analysis, Test Reliability

A Comparison of Three Indexes of Agreement between Observers: Proportion of Agreement, G-Index, and Kappa.

Peer reviewed

Green, Samuel B. – Educational and Psychological Measurement, 1981

The proportion of agreement, G, and kappa indexes are shown to differ in how they correct for chance agreements between two observers. On the basis of the findings, it is suggested that no single agreement index is appropriate for all sets of data. (Author/BW)

Descriptors: Comparative Analysis, Measurement Techniques, Test Reliability, Testing Problems
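
The difference in chance correction can be made concrete with a short sketch. It assumes the standard textbook definitions, which may differ in detail from the article's own treatment: the G index takes chance agreement to be 1/k for k categories, while Cohen's kappa derives chance agreement from each observer's marginal proportions. The function name and ratings are hypothetical.

```python
from collections import Counter


def agreement_indexes(rater_a, rater_b):
    """Return (proportion of agreement, G index, Cohen's kappa)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories must be observed")

    # Observed proportion of agreement (no chance correction).
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # G index: chance agreement assumed uniform across the k categories.
    g = (p_o - 1 / k) / (1 - 1 / k)

    # Kappa: chance agreement taken from each observer's marginals.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((count_a[c] / n) * (count_b[c] / n) for c in categories)
    kappa = (p_o - p_e) / (1 - p_e)
    return p_o, g, kappa


# Hypothetical ratings from two observers with skewed marginals.
obs_a = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
obs_b = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]
p_o, g, kappa = agreement_indexes(obs_a, obs_b)
print(round(p_o, 3), round(g, 3), round(kappa, 3))
```

With these skewed ratings the same data yield 0.8, 0.6, and 0.375 respectively, consistent with the abstract's point that the indexes diverge depending on how chance agreement is modeled.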

A Computer Program to Compute Kristof's Procedure for Testing Significance of the Differences between Reliability Coefficients

Peer reviewed

Martois, John S. – Educational and Psychological Measurement, 1973

Copies of this program may be obtained from the author at the University of Southern California, School of Pharmacy, University Park, Los Angeles 90007. (CB)

Descriptors: Comparative Analysis, Computer Programs, Input Output, Statistical Analysis

An Empirical Demonstration of the Stability of the Maximized Correlation as an Internal-Consistency Reliability Estimate for Tests of Small Item Size.

Peer reviewed

Wagner, Edwin E.; And Others – Educational and Psychological Measurement, 1990

Maximized correlation as an internal reliability estimate for tests with few items was investigated. An actual sampling distribution of maximum correlation--"r" max--was empirically derived from 100 samples of 50 cases each from Rorschach test data and compared with those of alpha and an odd/even split, using 2,020 Rorschach protocols.…

Descriptors: Comparative Analysis, Correlation, Estimation (Mathematics), Sample Size

On Estimating Test Variance in Multiple Matrix Sampling

Peer reviewed

Raju, Nambury S. – Educational and Psychological Measurement, 1977

A rederivation of Lord's formula for estimating variance in multiple matrix sampling is presented as well as the ways Cronbach's coefficient alpha and the Spearman-Brown prophecy formula are related in this context. (Author/JKS)

Descriptors: Analysis of Variance, Comparative Analysis, Item Sampling, Mathematical Models

Appropriateness of Subtests in Achievement Tests Selection

Peer reviewed

Goolsby, Thomas M., Jr. – Educational and Psychological Measurement, 1971

Descriptors: Achievement Tests, Comparative Analysis, Standardized Tests, Test Reliability

The Relationship Between WISC and WAIS IQs with Educable Mentally Retarded Adolescents

Peer reviewed

Wesner, Chester E. – Educational and Psychological Measurement, 1973

Results indicate that because there is not an equivalent relationship between the WISC and WAIS, classification or retardation level and prognostic formulation using these tests should be made cautiously. (Author/CB)

Descriptors: Adolescents, Comparative Analysis, Intelligence Quotient, Intelligence Tests

The Reduced Size Rod and Frame Test as a Measure of Psychological Differentiation

Peer reviewed

Nickel, Ted – Educational and Psychological Measurement, 1971

Directions are provided for the construction of a reduced size Rod and Frame Test. Simpler and less expensive, the proposed apparatus has criterion validity parallel to that of the full-sized. (GS)

Descriptors: Comparative Analysis, Psychological Studies, Sex Differences, Statistical Analysis
