ERIC - Search Results

Showing 1 to 15 of 63 results

Added Value of Subscores for Tests with Polytomous Items

Peer reviewed

Gorney, Kylie; Sinharay, Sandip – Educational and Psychological Measurement, 2025

Test-takers, policymakers, teachers, and institutions are increasingly demanding that testing programs provide more detailed feedback regarding test performance. As a result, there has been a growing interest in the reporting of subscores that potentially provide such detailed feedback. Haberman developed a method based on classical test theory…

Descriptors: Scores, Test Theory, Test Items, Testing

Using Differential Item Functioning to Test for Interrater Reliability in Constructed Response Items

Peer reviewed

Walker, Cindy M.; Göçer Sahin, Sakine – Educational and Psychological Measurement, 2020

The purpose of this study was to investigate a new way of evaluating interrater reliability that can allow one to determine if two raters differ with respect to their rating on a polytomous rating scale or constructed response item. Specifically, differential item functioning (DIF) analyses were used to assess interrater reliability and compared…

Descriptors: Test Bias, Interrater Reliability, Responses, Correlation

On Misconceptions and the Limited Usefulness of Ordinal Alpha

Peer reviewed

Chalmers, R. Philip – Educational and Psychological Measurement, 2018

This article discusses the theoretical and practical contributions of Zumbo, Gadermann, and Zeisser's family of ordinal reliability statistics. Implications, interpretation, recommendations, and practical applications regarding their ordinal measures, particularly ordinal alpha, are discussed. General misconceptions relating to this family of…

Descriptors: Misconceptions, Test Theory, Test Reliability, Statistics

A Simple Model to Determine the Efficient Duration of Exams

Peer reviewed

Ellis, Jules L. – Educational and Psychological Measurement, 2021

This study develops a theoretical model for the costs of an exam as a function of its duration. Two kinds of costs are distinguished: (1) the costs of measurement errors and (2) the costs of the measurement. Both costs are expressed in terms of the student's time. Based on a classical test theory model, enriched with assumptions on the context, the costs…

Descriptors: Test Length, Models, Error of Measurement, Measurement

A Measurement Is a Choice and Stevens' Scales of Measurement Do Not Help Make It: A Response to Chalmers

Peer reviewed

Zumbo, Bruno D.; Kroc, Edward – Educational and Psychological Measurement, 2019

Chalmers recently published a critique of the use of ordinal alpha proposed in Zumbo et al. as a measure of test reliability in certain research settings. In this response, we take up the task of refuting Chalmers' critique. We identify three broad misconceptions that characterize Chalmers' criticisms: (1) confusing assumptions with…

Descriptors: Test Reliability, Statistical Analysis, Misconceptions, Mathematical Models

On True Score Evaluation Using Item Response Theory Modeling

Peer reviewed

Raykov, Tenko; Dimitrov, Dimiter M.; Marcoulides, George A.; Harrison, Michael – Educational and Psychological Measurement, 2019

Building on prior research on the relationships between key concepts in item response theory and classical test theory, this note contributes to highlighting their important and useful links. A readily and widely applicable latent variable modeling procedure is discussed that can be used for point and interval estimation of the individual person…

Descriptors: True Scores, Item Response Theory, Test Items, Test Theory

Modifying Spearman's Attenuation Equation to Yield Partial Corrections for Measurement Error--With Application to Sample Size Calculations

Peer reviewed

Nicewander, W. Alan – Educational and Psychological Measurement, 2018

Spearman's correction for attenuation (measurement error) corrects a correlation coefficient for measurement errors in either or both of two variables, and follows from the assumptions of classical test theory. Spearman's equation removes all measurement error from a correlation coefficient, which translates into "increasing the reliability of…

Descriptors: Error of Measurement, Correlation, Sample Size, Computation

The Importance of the Assumption of Uncorrelated Errors in Psychometric Theory

Peer reviewed

Raykov, Tenko; Marcoulides, George A.; Patelis, Thanos – Educational and Psychological Measurement, 2015

A critical discussion of the assumption of uncorrelated errors in classical psychometric theory and its applications is provided. It is pointed out that this assumption is essential for a number of fundamental results and underlies the concept of parallel tests, the Spearman-Brown prophecy and correction for attenuation formulas as well as…

Descriptors: Psychometrics, Correlation, Validity, Reliability

Relationships among Classical Test Theory and Item Response Theory Frameworks via Factor Analytic Models

Peer reviewed

Kohli, Nidhi; Koran, Jennifer; Henn, Lisa – Educational and Psychological Measurement, 2015

There are well-defined theoretical differences between the classical test theory (CTT) and item response theory (IRT) frameworks. It is understood that in the CTT framework, person and item statistics are test- and sample-dependent. This is not the perception with IRT. For this reason, the IRT framework is considered to be theoretically superior…

Descriptors: Test Theory, Item Response Theory, Factor Analysis, Models

On the Relationship between Classical Test Theory and Item Response Theory: From One to the Other and Back

Peer reviewed

Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2016

The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete…

Descriptors: Test Theory, Item Response Theory, Models, Correlation

Maximum Likelihood Item Easiness Models for Test Theory without an Answer Key

Peer reviewed

France, Stephen L.; Batchelder, William H. – Educational and Psychological Measurement, 2015

Cultural consensus theory (CCT) is a data aggregation technique with many applications in the social and behavioral sciences. We describe the intuition and theory behind a set of CCT models for continuous type data using maximum likelihood inference methodology. We describe how bias parameters can be incorporated into these models. We introduce…

Descriptors: Maximum Likelihood Statistics, Test Items, Difficulty Level, Test Theory

Measurement Error Correction Formula for Cluster-Level Group Differences in Cluster Randomized and Observational Studies

Peer reviewed

Cho, Sun-Joo; Preacher, Kristopher J. – Educational and Psychological Measurement, 2016

Multilevel modeling (MLM) is frequently used to detect cluster-level group differences in cluster randomized trials and observational studies. Group differences on the outcomes (posttest scores) are detected by controlling for the covariate (pretest scores) as a proxy variable for unobserved factors that predict future attributes. The pretest and…

Descriptors: Error of Measurement, Error Correction, Multivariate Analysis, Hierarchical Linear Modeling

Using IRT Trait Estimates versus Summated Scores in Predicting Outcomes

Peer reviewed

Xu, Ting; Stone, Clement A. – Educational and Psychological Measurement, 2012

It has been argued that item response theory trait estimates should be used in analyses rather than number right (NR) or summated scale (SS) scores. Thissen and Orlando postulated that IRT scaling tends to produce trait estimates that are linearly related to the underlying trait being measured. Therefore, IRT trait estimates can be more useful…

Descriptors: Educational Research, Monte Carlo Methods, Measures (Individuals), Item Response Theory

Sources of Validity Evidence for Educational and Psychological Tests

Peer reviewed

Cizek, Gregory J.; Rosenberg, Sharyn L.; Koons, Heather H. – Educational and Psychological Measurement, 2008

This study investigates aspects of validity reflected in a large and diverse sample of published measures used in educational and psychological testing contexts. The current edition of "Mental Measurements Yearbook" served as the data source for this study. The validity aspects investigated included perspective on validity represented, number and…

Descriptors: Psychological Testing, Test Validity, Testing, Test Theory

Polytomous Differential Item Functioning and Violations of Ordering of the Expected Latent Trait by the Raw Score

Peer reviewed

DeMars, Christine E. – Educational and Psychological Measurement, 2008

The graded response (GR) and generalized partial credit (GPC) models do not imply that examinees ordered by raw observed score will necessarily be ordered on the expected value of the latent trait (OEL). Factors were manipulated to assess whether increased violations of OEL also produced increased Type I error rates in differential item…

Descriptors: Test Items, Raw Scores, Test Theory, Error of Measurement
