ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	0
Since 2006 (last 20 years)	3

Descriptor

Test Reliability	15
Mathematical Models	9
Models	5
Error of Measurement	4
Psychometrics	4
Test Items	4
Equations (Mathematics)	3
Item Response Theory	3
Rating Scales	3
Sampling	3
Cognitive Processes	2
Construct Validity	2
Difficulty Level	2
Estimation (Mathematics)	2
Higher Education	2
Individual Differences	2
Item Analysis	2
Measurement	2
Power (Statistics)	2
Scores	2
Test Construction	2
Testing Problems	2
Weighted Scores	2
Adults	1
Analysis of Covariance	1
More ▼

Source

Applied Psychological…

Publication Type

Journal Articles	11
Reports - Evaluative	6
Reports - Descriptive	3
Opinion Papers	1
Reports - General	1
Reports - Research	1

Education Level

Higher Education	1
Postsecondary Education	1

Audience

Location

West Germany

Laws, Policies, & Programs

Assessments and Surveys

Graduate Record Examinations	1
Hidden Figures Test	1

What Works Clearinghouse Rating

Showing all 15 results Save | Export

Dynamic Problem Solving: A New Assessment Perspective

Peer reviewed

Direct link

Greiff, Samuel; Wustenberg, Sascha; Funke, Joachim – Applied Psychological Measurement, 2012

This article addresses two unsolved measurement issues in dynamic problem solving (DPS) research: (a) unsystematic construction of DPS tests making a comparison of results obtained in different studies difficult and (b) use of time-intensive single tasks leading to severe reliability problems. To solve these issues, the MicroDYN approach is…

Descriptors: Problem Solving, Tests, Measurement, Structural Equation Models

Multinomial and Compound Multinomial Error Models for Tests with Complex Item Scoring

Peer reviewed

Direct link

Lee, Won-Chan – Applied Psychological Measurement, 2007

This article introduces a multinomial error model, which models an examinee's test scores obtained over repeated measurements of an assessment that consists of polytomously scored items. A compound multinomial error model is also introduced for situations in which items are stratified according to content categories and/or prespecified numbers of…

Descriptors: Simulation, Error of Measurement, Scoring, Test Items

Comparison of the Null Distributions of Weighted Kappa and the C Ordinal Statistic

Peer reviewed

Cicchetti, Domenic V.; Fleiss, Joseph L. – Applied Psychological Measurement, 1977

The weighted kappa coefficient is a measure of interrater agreement when the relative seriousness of each possible disagreement can be quantified. This monte carlo study demonstrates the utility of the kappa coefficient for ordinal data. Sample size is also briefly discussed. (Author/JKS)

Descriptors: Mathematical Models, Rating Scales, Reliability, Sampling

Some Comments on the Relation between Reliability and Statistical Power.

Peer reviewed

Humphreys, Lloyd G.; Drasgow, Fritz – Applied Psychological Measurement, 1989

Issues arising from difference scores with zero reliability that nevertheless allow a powerful test of change are discussed. Issues include the appropriateness of underlying statistical models for psychological data and the relationship between difference scores and power. Increases in reliability always increase power for a fixed effect size.…

Descriptors: Goodness of Fit, Mathematical Models, Power (Statistics), Psychometrics

Exploring Local Item Dependence Using a Random-Effects Facet Model

Peer reviewed

Direct link

Wang, Wen-Chung; Wilson, Mark – Applied Psychological Measurement, 2005

The random-effects facet model that deals with local item dependence in many-facet contexts is presented. It can be viewed as a special case of the multidimensional random coefficients multinomial logit model (MRCMLM) so that the estimation procedures for the MRCMLM can be directly applied. Simulations were conducted to examine parameter recovery…

Descriptors: Test Reliability, Item Response Theory, Interrater Reliability, Rating Scales

Estimating Measurement Error on Highly Speeded Tests.

Peer reviewed

Whitely, Susan E. – Applied Psychological Measurement, 1979

A model which gives maximum likelihood estimates of measurement error within the context of a simplex model for practice effects is presented. The appropriateness of the model is tested for five traits, and error estimates are compared to the classical formula estimates. (Author/JKS)

Descriptors: Error of Measurement, Error Patterns, Higher Education, Mathematical Models

Contributions to the Method of Paired Comparisons.

Peer reviewed

Kaiser, Henry F.; Serlin, Ronald C. – Applied Psychological Measurement, 1978

A least-squares solution for the method of paired comparisons is given. The approach provokes a theorem regarding the amount of data necessary and sufficient for a solution to be obtained. A measure of the internal consistency of the least-squares fit is developed. (Author/CTM)

Descriptors: Higher Education, Least Squares Statistics, Mathematical Models, Measurement

The Effect of Guessing on Item Reliability under Answer-Until-Correct Scoring

Peer reviewed

Kane, Michael; Moloney, James – Applied Psychological Measurement, 1978

The answer-until-correct (AUC) procedure requires that examinees respond to a multi-choice item until they answer it correctly. Using a modified version of Horst's model for examinee behavior, this paper compares the effect of guessing on item reliability for the AUC procedure and the zero-one scoring procedure. (Author/CTM)

Descriptors: Guessing (Tests), Item Analysis, Mathematical Models, Multiple Choice Tests

Reliability of Measurement and Power of Significance Tests Based on Differences.

Peer reviewed

Zimmerman, Donald W.; And Others – Applied Psychological Measurement, 1993

Some of the methods originally used to find relationships between reliability and power associated with a single measurement are extended to difference scores. Results, based on explicit power calculations, show that augmenting the reliability of measurement by reducing error score variance can make significance tests of difference more powerful.…

Descriptors: Equations (Mathematics), Error of Measurement, Individual Differences, Mathematical Models

Planning an Experiment in the Company of Measurement Error

Peer reviewed

Levin, Joel R.; Subkoviak, Michael J. – Applied Psychological Measurement, 1977

Textbook calculations of statistical power or sample size follow from formulas that assume that the variables under consideration are measured without error. However, in the real world of behavioral research, errors of measurement cannot be neglected. The determination of sample size is discussed, and an example illustrates blocking strategy.…

Descriptors: Analysis of Covariance, Analysis of Variance, Error of Measurement, Hypothesis Testing

The Reliability of a Linear Composite of Nonequivalent Subtests.

Peer reviewed

Rozeboom, William W. – Applied Psychological Measurement, 1989

Formulas are provided for estimating the reliability of a linear composite of non-equivalent subtests given the reliabilities of component subtests. The reliability of the composite is compared to that of its components. An empirical example uses data from 170 children aged 4 through 8 years performing 34 Piagetian tasks. (SLD)

Descriptors: Elementary School Students, Equations (Mathematics), Estimation (Mathematics), Mathematical Models

Sequential Reliability Tests.

Peer reviewed

Eiting, Mindert H. – Applied Psychological Measurement, 1991

A method is proposed for sequential evaluation of reliability of psychometric instruments. Sample size is unfixed; a test statistic is computed after each person is sampled and a decision is made in each stage of the sampling process. Results from a series of Monte-Carlo experiments establish the method's efficiency. (SLD)

Descriptors: Computer Simulation, Equations (Mathematics), Estimation (Mathematics), Mathematical Models

Problems with Individual Difference Measures Based on Some Componential Cognitive Paradigms.

Peer reviewed

Dunlap, William P.; And Others – Applied Psychological Measurement, 1989

The reliability of derived measures from 4 cognitive paradigms was studied using 19 Navy enlisted men (aged between 18 and 24 years). The paradigms were: graphemic and phonemic analysis; semantic memory retrieval; lexical decision making; and letter classification. Results indicate that derived scores may have low reliability. (SLD)

Descriptors: Adults, Armed Forces, Cognitive Measurement, Cognitive Processes

A Generative Approach to the Modeling of Isomorphic Hidden-Figure Items.

Peer reviewed

Bejar, Isaac I.; Yocom, Peter – Applied Psychological Measurement, 1991

An approach to test modeling is illustrated that encompasses both response consistency and response difficulty. This generative approach makes validation an ongoing process. An analysis of hidden figure items with 60 high school students supports the feasibility of the method. (SLD)

Descriptors: Construct Validity, Difficulty Level, Evaluation Methods, High School Students

Item Difficulty Modeling of Paragraph Comprehension Items

Peer reviewed

Direct link

Gorin, Joanna S.; Embretson, Susan E. – Applied Psychological Measurement, 2006

Recent assessment research joining cognitive psychology and psychometric theory has introduced a new technology, item generation. In algorithmic item generation, items are systematically created based on specific combinations of features that underlie the processing required to correctly solve a problem. Reading comprehension items have been more…

Descriptors: Difficulty Level, Test Items, Modeling (Psychology), Paragraph Composition

Bejar, Isaac I.	1
Cicchetti, Domenic V.	1
Drasgow, Fritz	1
Dunlap, William P.	1
Eiting, Mindert H.	1
Embretson, Susan E.	1
Fleiss, Joseph L.	1
Funke, Joachim	1
Gorin, Joanna S.	1
Greiff, Samuel	1
Humphreys, Lloyd G.	1
Kaiser, Henry F.	1
Kane, Michael	1
Lee, Won-Chan	1
Levin, Joel R.	1
Moloney, James	1
Rozeboom, William W.	1
Serlin, Ronald C.	1
Subkoviak, Michael J.	1
Wang, Wen-Chung	1
Whitely, Susan E.	1
Wilson, Mark	1
Wustenberg, Sascha	1
Yocom, Peter	1
More ▼