Guo, Jinxin; Xu, Xin; Xin, Tao – Journal of Educational Measurement, 2023
Missingness due to not-reached items and omitted items has received much attention in the recent psychometric literature. Such missingness, if not handled properly, can bias parameter estimation, distort inferences about examinees, and ultimately erode the validity of the test. This paper reviews some commonly used IRT-based…
Descriptors: Psychometrics, Bias, Error of Measurement, Test Validity

Hu, Shun-Fu; Wu, Amery D.; Stone, Jake – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction

Langenfeld, Thomas; Thomas, Jay; Zhu, Rongchun; Morris, Carrie A. – Journal of Educational Measurement, 2020
An assessment of graphic literacy was developed by articulating and subsequently validating a skills-based cognitive model intended to substantiate the plausibility of score interpretations. Model validation involved use of multiple sources of evidence derived from large-scale field testing and cognitive labs studies. Data from large-scale field…
Descriptors: Evidence, Scores, Eye Movements, Psychometrics

Harik, Polina; Clauser, Brian E.; Grabovsky, Irina; Baldwin, Peter; Margolis, Melissa J.; Bucak, Deniz; Jodoin, Michael; Walsh, William; Haist, Steven – Journal of Educational Measurement, 2018
Test administrators are appropriately concerned about the potential for time constraints to impact the validity of score interpretations; psychometric efforts to evaluate the impact of speededness date back more than half a century. The widespread move to computerized test delivery has led to the development of new approaches to evaluating how…
Descriptors: Comparative Analysis, Observation, Medical Education, Licensing Examinations (Professions)

Chen, Jinsong; de la Torre, Jimmy; Zhang, Zao – Journal of Educational Measurement, 2013
As with any psychometric model, the validity of inferences from cognitive diagnosis models (CDMs) determines the extent to which these models can be useful. For inferences from CDMs to be valid, it is crucial that the fit of the model to the data is ascertained. Based on a simulation study, this study investigated the sensitivity of various fit…
Descriptors: Models, Psychometrics, Goodness of Fit, Statistical Analysis

Kahraman, Nilufer; Thompson, Tony – Journal of Educational Measurement, 2011
A practical concern for many existing tests is that subscore test lengths are too short to provide reliable and meaningful measurement. A possible method of improving the subscale reliability and validity would be to make use of collateral information provided by items from other subscales of the same test. To this end, the purpose of this article…
Descriptors: Test Length, Test Items, Alignment (Education), Models

Zwick, Rebecca; Himelfarb, Igor – Journal of Educational Measurement, 2011
Research has often found that, when high school grades and SAT scores are used to predict first-year college grade-point average (FGPA) via regression analysis, African-American and Latino students are, on average, predicted to earn higher FGPAs than they actually do. Under various plausible models, this phenomenon can be explained in terms of…
Descriptors: Socioeconomic Status, Grades (Scholastic), Error of Measurement, White Students

Roussos, Louis A.; Templin, Jonathan L.; Henson, Robert A. – Journal of Educational Measurement, 2007
This article describes a latent trait approach to skills diagnosis based on a particular variety of latent class models that employ item response functions (IRFs) as in typical item response theory (IRT) models. To enable and encourage comparisons with other approaches, this description is provided in terms of the main components of any…
Descriptors: Validity, Identification, Psychometrics, Item Response Theory

Cronbach, Lee J. – Journal of Educational Measurement, 1976
The Petersen-Novick paper dealing with culture fair selection (TM 502 259) is the basis for this article. The author proposes a perspective in which ideas can be lined up for comparison and suggests solutions to the problems of selection in employment. (DEP)
Descriptors: Bias, Employment Opportunities, Matrices, Models

Schwartz, Steven A. – Journal of Educational Measurement, 1978
A method for the construction of scales which combines the rational (or intuitive) approach with an empirical (item analysis) approach is presented. A step-by-step procedure is provided. (Author/JKS)
Descriptors: Factor Analysis, Item Analysis, Measurement, Psychological Testing

Wise, Steven L.; DeMars, Christine E. – Journal of Educational Measurement, 2006
The validity of inferences based on achievement test scores is dependent on the amount of effort that examinees put forth while taking the test. With low-stakes tests, for which this problem is particularly prevalent, there is a consequent need for psychometric models that can take into account differing levels of examinee effort. This article…
Descriptors: Guessing (Tests), Psychometrics, Inferences, Reaction Time

Wang, Tianyou; Kolen, Michael J. – Journal of Educational Measurement, 2001
Reviews research literature on comparability issues in computerized adaptive testing (CAT) and synthesizes issues specific to comparability and test security. Develops a framework for evaluating comparability that contains three categories of criteria: (1) validity; (2) psychometric property/reliability; and (3) statistical assumption/test…
Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Criteria

Valencia, Richard R.; Rankin, Richard J. – Journal of Educational Measurement, 1986
Factor analyses of the Kaufman Assessment Battery for Children (K-ABC) were performed on separate groups of Anglo (n=100) and Mexican-American (n=100) fifth-grade children to determine the comparability of underlying structures and to examine the existence of possible bias in construct validity of the K-ABC for each group. (Author/LMO)
Descriptors: Achievement Tests, Cognitive Processes, Elementary Education, Factor Analysis

Embretson, Susan; And Others – Journal of Educational Measurement, 1986
This study examined the influence of processing strategies, and the metacomponents that determine when to apply them, on the construct validity of a verbal reasoning test. A rule-oriented strategy, an association strategy, and a partial rule strategy were examined. All three strategies contributed to individual differences in verbal reasoning.…
Descriptors: Cognitive Processes, Elementary Secondary Education, Error of Measurement, Latent Trait Theory

Wainer, Howard – Journal of Educational Measurement, 1993
Focusing on educational measurement that suggests an action and has an outcome, 16 problem areas are defined and grouped into the following classes: (1) validity; (2) issues of statistical adjustment; (3) data insufficiencies; (4) other issues related to standardized testing and constructed responses; and (5) technical issues of psychometrics.…
Descriptors: Comparative Analysis, Computer Uses in Education, Constructed Response, Educational Assessment