ERIC - Search Results

Showing 1 to 15 of 20 results

A Critique of Raju and Oshima's Prophecy Formulas for Assessing the Reliability of Item Response Theory-Based Ability Estimates

Peer reviewed

Wang, Wen-Chung – Applied Psychological Measurement, 2008

Raju and Oshima (2005) proposed two prophecy formulas based on item response theory in order to predict the reliability of ability estimates for a test after a change in its length. The first prophecy formula is equivalent to the classical Spearman-Brown prophecy formula. The second prophecy formula is misleading because of an underlying false…

Descriptors: Test Reliability, Item Response Theory, Computation, Evaluation Methods
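
For readers unfamiliar with the classical Spearman-Brown prophecy formula named in the abstract above, a minimal sketch follows. It shows only the classical formula, not either of Raju and Oshima's IRT-based formulas or Wang's critique. The function name `spearman_brown` and its parameter names are illustrative assumptions, not taken from the article.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Classical Spearman-Brown prediction of reliability after a length change.

    reliability   -- reliability of the current test, between 0 and 1
    length_factor -- new length divided by current length (2.0 doubles the test)

    The names here are hypothetical; only the formula itself is standard:
        predicted = k * r / (1 + (k - 1) * r)
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    if length_factor <= 0.0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


# Doubling a test with reliability 0.60, then halving one with reliability 0.80.
doubled = spearman_brown(0.60, 2.0)
halved = spearman_brown(0.80, 0.5)
print(f"doubled: {doubled:.3f}, halved: {halved:.3f}")
```

The formula assumes the added or removed items are parallel to the existing ones, so the prediction is only as good as that assumption.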

Tolerance Intervals: Alternatives to Credibility Intervals in Validity Generalization Research.

Peer reviewed

Millsap, Roger E. – Applied Psychological Measurement, 1988

Two new methods for constructing a credibility interval (CI)--an interval containing a specified proportion of true validity description--are discussed, from a frequentist perspective. Tolerance intervals, unlike the current method of constructing the CI, have performance characteristics across repeated applications and may be useful in validity…

Descriptors: Bayesian Statistics, Meta Analysis, Statistical Analysis, Test Reliability

Suppose We Measured Height With Rating Scales Instead of Rulers

Peer reviewed

Dawes, Robyn M. – Applied Psychological Measurement, 1977

Staff members of the Psychology department at the University of Oregon rated each other's height on five rating scales representative of those found in social psychology. Average ratings proved to be very good estimates of height. (Author/JKS)

Descriptors: College Faculty, Height, Males, Measurement Techniques

An Empirical Investigation of the Stratified Adaptive Computerized Testing Model

Peer reviewed

Waters, Brian K. – Applied Psychological Measurement, 1977

The validity and utility of the stratified adaptive computerized testing model (stradaptive) developed by Weiss are empirically investigated. The model presents a tailored testing strategy based upon Binet IQ measurement theory and Lord's modern test theory. (Author/RC)

Descriptors: Ability, Adaptive Testing, Computer Oriented Programs, Item Banks

An Examination of the Construct Validity and Reliability of the Ghiselli Self-Description Inventory as a Measure of Self-Esteem

Peer reviewed

Raben, Charles S.; And Others – Applied Psychological Measurement, 1978

Two studies are reported which investigated the construct validity and reliability of the Ghiselli Self-Description Inventory as a measure of self-esteem. The first study, using a multitrait-multimethod matrix, found little evidence for the construct validity of the instrument. The second study found a significant, although low, reliability. (…

Descriptors: Achievement Need, Higher Education, Locus of Control, Self Concept Measures

Scoring Field Dependence: A Methodological Analysis of Five Rod-and-Frame Scoring Systems

Peer reviewed

McGarvey, Bill; And Others – Applied Psychological Measurement, 1977

The most consistently used scoring system for the rod-and-frame task has been the total number of degrees in error from the true vertical. Since a logical case can be made for at least four alternative scoring systems, a thorough comparison of all five systems was performed. (Author/CTM)

Descriptors: Analysis of Variance, Cognitive Style, Cognitive Tests, Elementary Education

A Paper-and-Pencil Inventory for the Assessment of Piaget's Tasks.

Peer reviewed

Patterson, Henry O.; Milakofsky, Louis – Applied Psychological Measurement, 1980

Adapting curricula to the cognitive developmental level of students has been hindered by the difficulty of assessing those levels in students. The reliability and validity of a paper-and-pencil Piagetian assessment are discussed. (Author/JKS)

Descriptors: Cognitive Development, Cognitive Measurement, Elementary Secondary Education, Grade 3

Measures for the Study of Maternal Teaching Strategies.

Peer reviewed

Laosa, Luis M. – Applied Psychological Measurement, 1980

A technique to measure maternal teaching strategies was developed for possible use in research and evaluation studies. Scores derived from the technique describe quality and quantity of behaviors used by mothers to teach cognitive-perceptual tasks to their own young children. Reliability and validity data are presented. (Author/JKS)

Descriptors: Cultural Differences, Measurement Techniques, Mothers, Observation

Construction Strategies for Multiscale Personality Inventories

Peer reviewed

Burisch, Matthias – Applied Psychological Measurement, 1978

Sets of inventory scales were constructed from a common item pool, using variants of what are here called the Inductive, Deductive, and External strategies. Peer ratings for 21 traits served as criteria. Very little variation in validity was attributable to construction strategies. (Author/CTM)

Descriptors: Deduction, Foreign Countries, Higher Education, Induction

Multidimensional Computerized Adaptive Testing in a Certification or Licensure Context.

Peer reviewed

Luecht, Richard M. – Applied Psychological Measurement, 1996

The example of a medical licensure test is used to demonstrate situations in which complex, integrated content must be balanced at the total test level for validity reasons, but items assigned to reportable subscore categories may be used under a multidimensional item response theory adaptive paradigm to improve subscore reliability. (SLD)

Descriptors: Adaptive Testing, Certification, Computer Assisted Testing, Licensing Examinations (Professions)

The Reliability and Validity of Objective Indices of Moral Development.

Peer reviewed

Davison, Mark L.; Robbins, Stephen – Applied Psychological Measurement, 1978

Empirically weighted scores for Rest's Defining Issues Test were found to be more reliable than the simple sum of scores, the theoretically weighted sum, or Rest's p scores. They also had slightly higher correlations with Kohlberg's interview scores. Empirically weighted scores also showed more significant change in two longitudinal studies. (CTM)

Descriptors: Higher Education, Longitudinal Studies, Moral Development, Moral Values

Development of a Self-Report Inventory for Assessing Individual Differences in Learning Processes

Peer reviewed

Schmeck, Ronald Ray; And Others – Applied Psychological Measurement, 1977

Five studies are presented describing the development of a self-report inventory for measuring individual differences in learning processes. Factor analysis of items yielded four scales: Synthesis-Analysis, Study Methods, Fact Retention, and Elaborative Processing. There were no sex differences, and the scales demonstrated acceptable reliabilities…

Descriptors: Factor Analysis, Higher Education, Learning Processes, Retention (Psychology)

An Assessment of the Role Construct Repertory Test.

Peer reviewed

Menasco, Michael B.; Curry, David J. – Applied Psychological Measurement, 1978

Scores on the Role Construct Repertory Test exhibited significant correlations with other forms of cognitive functioning, including American College Test scores in science and mathematics for a group of 79 college students. The Grid Form of the test was used. Test-retest reliability was low. (Author/CTM)

Descriptors: Achievement Tests, Cognitive Processes, Cognitive Style, Cognitive Tests

Empirical versus Random Item Selection in the Design of Intelligence Test Short Forms--The WISC-R Example.

Peer reviewed

Goh, David S. – Applied Psychological Measurement, 1979

The advantages of using psychometric theory to design short forms of intelligence tests are demonstrated by comparing such usage to a systematic random procedure that has previously been used. The Wechsler Intelligence Scale for Children-Revised (WISC-R) Short Form is presented as an example. (JKS)

Descriptors: Elementary Secondary Education, Intelligence Tests, Item Analysis, Psychometrics

An Examination of Methodological Issues Relevant to the Use and Interpretation of the Semantic Differential.

Peer reviewed

Mann, Irene T.; And Others – Applied Psychological Measurement, 1979

Several methodological problems (particularly the assumed bipolarity of scales, instructions regarding use of the midpoint, and concept-scale interaction) which may contribute to a lack of precision in the semantic differential technique were investigated. Results generally supported the use of the semantic differential. (Author/JKS)

Descriptors: Analysis of Variance, Computer Assisted Testing, Higher Education, Rating Scales
