ERIC - Search Results

Showing 1 to 15 of 20 results

Assessment of Interrater and Intermethod Agreement in the Kinesiology Literature

Peer reviewed

Looney, Marilyn A. – Measurement in Physical Education and Exercise Science, 2018

The purpose of this article was twofold: (1) to provide an overview of the commonly reported and under-reported absolute agreement indices in the kinesiology literature for continuous data; and (2) to present examples of these indices for hypothetical data along with recommendations for future use. It is recommended that three types of information be…

Descriptors: Interrater Reliability, Evaluation Methods, Kinetics, Indexes

Psychometric Consequences of Subpopulation Item Parameter Drift

Peer reviewed

Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2017

This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…

Descriptors: Psychometrics, Test Items, Item Response Theory, Hypothesis Testing

The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating

Peer reviewed

Öztürk-Gübes, Nese; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2016

The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…

Descriptors: Test Format, Item Response Theory, True Scores, Equated Scores

An Investigation of Sample Size Splitting on ATFIND and DIMTEST

Peer reviewed

Socha, Alan; DeMars, Christine E. – Educational and Psychological Measurement, 2013

Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…

Descriptors: Sample Size, Test Length, Correlation, Test Format

The Gains from Vertical Scaling

Peer reviewed

Briggs, Derek C.; Domingue, Ben – Journal of Educational and Behavioral Statistics, 2013

It is often assumed that a vertical scale is necessary when value-added models depend upon the gain scores of students across two or more points in time. This article examines the conditions under which the scale transformations associated with the vertical scaling process would be expected to have a significant impact on normative interpretations…

Descriptors: Evaluation Methods, Scaling, Scores, Achievement Tests

Developing Assessment Scales for Large-Scale Speaking Tests: A Multiple-Method Approach

Peer reviewed

Galaczi, Evelina D.; ffrench, Angela; Hubbard, Chris; Green, Anthony – Assessment in Education: Principles, Policy & Practice, 2011

The process of constructing assessment scales for performance testing is complex and multi-dimensional. As a result, a number of different approaches, both empirically and intuitively based, are open to developers. In this paper we outline the approach taken in the revision of a set of assessment scales used with speaking tests, and present the…

Descriptors: Speech Communication, Research Methodology, Foreign Countries, Statistical Analysis

A Reliability Analysis of Goal Attainment Scaling (GAS) Weights

Peer reviewed

Marson, Stephen M.; Wei, Guo; Wasserman, Deborah – American Journal of Evaluation, 2009

Goal attainment scaling (GAS) has been considered to be one of the most versatile and appealing evaluation protocols available for human services. Aspects of the protocol that make the method so appealing to practitioners--that is, collaboratively working with individual clients to identify and assign weights to goals they will work to…

Descriptors: Human Services, Scaling, Test Reliability, Interrater Reliability

Psychophysical Analysis of Audiovisual Judgments of Speech Naturalness of Nonstutterers and Stutterers.

Peer reviewed

Schiavetti, Nicholas; And Others – Journal of Speech and Hearing Research, 1994

This study determined through psychophysical comparison of scaling data that speech naturalness judgments of stutterers and nonstutterers from audiovisual recordings form a metathetic (or qualitative) rather than prothetic (or quantitative) continuum. Both direct magnitude estimation and equal-appearing interval scaling were valid, but interval…

Descriptors: Evaluation Methods, Multidimensional Scaling, Scaling, Speech Evaluation

A Q3 Statistic for Unfolding Item Response Theory Models: Assessment of Unidimensionality with Two Factors and Simple Structure

Peer reviewed

Habing, Brian; Finch, Holmes; Roberts, James S. – Applied Psychological Measurement, 2005

Although there are many methods available for dimensionality assessment for items with monotone item response functions, there are few methods available for unfolding item response theory models. In this study, a modification of Yen's Q3 statistic is proposed for the case of these nonmonotone item response models. Through a simulation study, the…

Descriptors: Data Analysis, Simulation, Multidimensional Scaling, Item Response Theory

Using Dimensionality-Based DIF Analyses to Identify and Interpret Constructs That Elicit Group Differences

Peer reviewed

Gierl, Mark J. – Educational Measurement: Issues and Practice, 2005

In this paper I describe and illustrate the Roussos-Stout (1996) multidimensionality-based DIF analysis paradigm, with emphasis on its implication for the selection of a matching and studied subtest for DIF analyses. Standard DIF practice encourages an exploratory search for matching subtest items based on purely statistical criteria, such as a…

Descriptors: Models, Test Items, Test Bias, Statistical Analysis

Nonmetric Multidimensional Scaling: An Evaluation of Three Data Collection Methods.

Subkoviak, Michael J.; Roecks, Alan L. – 1975

The present study examined three different methods of data collection in which subjects judged proximity between object pairs. One method required subjects to partition objects into homogeneous subsets; the second entailed rating object pairs on a similarity-dissimilarity continuum; and the third involved comparing inter-object proximities to a…

Descriptors: Data Collection, Distance, Evaluation Methods, Higher Education

Utilization of Evaluation Information: A Case Study Approach Investigating Factors Related to Evaluation Utilization in a Large State Agency.

Barrios, Nina B.; Foster, Garrett R. – 1987

The use of evaluation information in a large state agency was investigated through a case study approach. Utilization was measured through a scaling procedure considering the status of recommendations within a report and the extent of influence of the evaluation of the status. The scaling procedure, the equal-appearing interval method, used…

Descriptors: Case Studies, Evaluation Methods, Evaluation Utilization, Information Utilization

The Relation of the Method of Reciprocal Averages to Guttman's Internal Consistency Scaling Model.

Baker, Frank B.; Hoyt, Cyril J. – 1972

A scaling technique known as the Method of Reciprocal Averages has been in use since the early 1930's. This technique yields a set of item response weights for a psychological inventory which maximizes the internal consistency of the inventory for a group of subjects. Although the technique has been used for many years, its mathematical…

Descriptors: Analysis of Variance, Correlation, Evaluation Methods, Item Analysis

A Monte Carlo Evaluation of Three Nonmetric Multidimensional Scaling Algorithms

Peer reviewed

Spence, Ian – Psychometrika, 1972

Discusses the different strategies employed by three practical nonmetric multidimensional scaling algorithms using Monte Carlo techniques. (Author/RK)

Descriptors: Algorithms, Computer Programs, Error of Measurement, Evaluation Methods

Evaluation of the Construction of the Subscales for the Piers-Harris and Tennessee Inventories.

Thomas, Julia Anne – 1985

A sample of 234 fifth- and 259 sixth-grade students scaled the items of the Piers-Harris, Tennessee, Coopersmith, and Lipsett self-concept measures. The scaling of the Piers-Harris and the Tennessee inventories was examined in reference to their subscales. The present technique placed items on a bivariate plane of two orthogonal dimensions…

Descriptors: Evaluation Methods, Factor Structure, Intermediate Grades, Orthogonal Rotation
