ERIC - Search Results

Showing all 10 results

Accuracy and Sensitivity of Coefficient Alpha and Its Alternatives with Unidimensional and Contaminated Scales

Peer reviewed

Xiao, Leifeng; Hau, Kit-Tai – Applied Measurement in Education, 2023

We compared coefficient alpha with five alternatives (omega total, omega RT, omega h, GLB, and coefficient H) in two simulation studies. Results showed for unidimensional scales, (a) all indices except omega h performed similarly well for most conditions; (b) alpha is still good; (c) GLB and coefficient H overestimated reliability with small…

Descriptors: Test Theory, Test Reliability, Factor Analysis, Test Length

An Extension of IRT-Based Equating to the Dichotomous Testlet Response Theory Model

Peer reviewed

Tao, Wei; Cao, Yi – Applied Measurement in Education, 2016

Current procedures for equating number-correct scores using traditional item response theory (IRT) methods assume local independence. However, when tests are constructed using testlets, one concern is the violation of the local item independence assumption. The testlet response theory (TRT) model is one way to accommodate local item dependence.…

Descriptors: Item Response Theory, Equated Scores, Test Format, Models

A Comparison of Teacher Effectiveness Measures Calculated Using Three Multilevel Models for Raters Effects

Peer reviewed

Murphy, Daniel L.; Beretvas, S. Natasha – Applied Measurement in Education, 2015

This study examines the use of cross-classified random effects models (CCrem) and cross-classified multiple membership random effects models (CCMMrem) to model rater bias and estimate teacher effectiveness. Effect estimates are compared using CTT versus item response theory (IRT) scaling methods and three models (i.e., conventional multilevel…

Descriptors: Teacher Effectiveness, Comparative Analysis, Hierarchical Linear Modeling, Test Theory

Generalizability Theory and Classical Test Theory

Peer reviewed

Brennan, Robert L. – Applied Measurement in Education, 2011

Broadly conceived, reliability involves quantifying the consistencies and inconsistencies in observed scores. Generalizability theory, or G theory, is particularly well suited to addressing such matters in that it enables an investigator to quantify and distinguish the sources of inconsistencies in observed scores that arise, or could arise, over…

Descriptors: Generalizability Theory, Test Theory, Test Reliability, Item Response Theory

The Utility of Augmented Subscores in a Licensure Exam: An Evaluation of Methods Using Empirical Data

Peer reviewed

Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010

Will subscores provide information beyond what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…

Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods

Test Reliability and Homogeneity from the Perspective of the Ordinal Test Theory.

Peer reviewed

Krus, David J.; Blackman, Harold S. – Applied Measurement in Education, 1988

Test homogeneity and internal consistency reliability indices were developed on the basis of theoretical considerations of properties of hierarchical structures of data matrices. This reconceptualization, in terms of ordinal test theory, has potential for explication of the mutual relationship of test reliability and homogeneity. (TJH)

Descriptors: Equations (Mathematics), Statistics, Test Reliability, Test Theory

Can Validity Rise When Reliability Declines?

Peer reviewed

Feldt, Leonard S. – Applied Measurement in Education, 1997

It has often been asserted that the reliability of a measure places an upper limit on its validity. This article demonstrates in theory that validity can rise when reliability declines, even when validity evidence is a correlation with an acceptable criterion. Whether empirical examples can actually be found is an open question. (SLD)

Descriptors: Correlation, Criteria, Reliability, Test Construction

Implications of Item Response Theory for the Measurement Practitioner.

Peer reviewed

Loyd, Brenda H. – Applied Measurement in Education, 1988

The impact of item response theory (IRT) on the measurement practitioner is discussed, with a review of potential benefits. The complexity of IRT theory and procedures and the lack of robustness of IRT procedures to violation of assumptions must be recognized for the measurement practitioner to realize its advantages. (SLD)

Descriptors: Educational Researchers, Evaluation Methods, Evaluators, Latent Trait Theory

A Taxonomy of Multiple-Choice Item-Writing Rules.

Peer reviewed

Haladyna, Thomas M.; Downing, Steven M. – Applied Measurement in Education, 1989

A taxonomy of 43 rules for writing multiple-choice test items is presented, based on a consensus of 46 textbooks. These guidelines are presented as complete and authoritative, with solid consensus apparent for 33 of the rules. Four rules lack consensus, and 5 rules were cited fewer than 10 times. (SLD)

Descriptors: Classification, Interrater Reliability, Multiple Choice Tests, Objective Tests

Applied Measurement in the Oscar Buros Tradition: Current Implications.

Peer reviewed

Mitchell, James V., Jr. – Applied Measurement in Education, 1988

Applications of Oscar K. Buros' values and convictions to current developments in measurement are considered. Biographical information and Buros' personal philosophy on applied measurement are discussed. The Buros tradition refocuses evaluators' attention on the implications of their work for the end users of measurement results--test users and…

Descriptors: Computer Assisted Testing, Educational Assessment, Educational Philosophy, Educational Researchers
