Guo, Jinxin; Xu, Xin; Xin, Tao – Journal of Educational Measurement, 2023
Missingness due to not-reached items and omitted items has received much attention in the recent psychometric literature. Such missingness, if not handled properly, can bias parameter estimation, distort inferences about examinees, and ultimately erode the validity of the test. This paper reviews some commonly used IRT-based…
Descriptors: Psychometrics, Bias, Error of Measurement, Test Validity

Hu, Shun-Fu; Wu, Amery D.; Stone, Jake – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction

Langenfeld, Thomas; Thomas, Jay; Zhu, Rongchun; Morris, Carrie A. – Journal of Educational Measurement, 2020
An assessment of graphic literacy was developed by articulating and subsequently validating a skills-based cognitive model intended to substantiate the plausibility of score interpretations. Model validation involved use of multiple sources of evidence derived from large-scale field testing and cognitive labs studies. Data from large-scale field…
Descriptors: Evidence, Scores, Eye Movements, Psychometrics

Harik, Polina; Clauser, Brian E.; Grabovsky, Irina; Baldwin, Peter; Margolis, Melissa J.; Bucak, Deniz; Jodoin, Michael; Walsh, William; Haist, Steven – Journal of Educational Measurement, 2018
Test administrators are appropriately concerned about the potential for time constraints to impact the validity of score interpretations; psychometric efforts to evaluate the impact of speededness date back more than half a century. The widespread move to computerized test delivery has led to the development of new approaches to evaluating how…
Descriptors: Comparative Analysis, Observation, Medical Education, Licensing Examinations (Professions)

Chen, Jinsong; de la Torre, Jimmy; Zhang, Zao – Journal of Educational Measurement, 2013
As with any psychometric model, the validity of inferences from cognitive diagnosis models (CDMs) determines the extent to which these models can be useful. For inferences from CDMs to be valid, it is crucial that the fit of the model to the data is ascertained. Based on a simulation study, this study investigated the sensitivity of various fit…
Descriptors: Models, Psychometrics, Goodness of Fit, Statistical Analysis

Kahraman, Nilufer; Thompson, Tony – Journal of Educational Measurement, 2011
A practical concern for many existing tests is that subscore test lengths are too short to provide reliable and meaningful measurement. A possible method of improving the subscale reliability and validity would be to make use of collateral information provided by items from other subscales of the same test. To this end, the purpose of this article…
Descriptors: Test Length, Test Items, Alignment (Education), Models

Zwick, Rebecca; Himelfarb, Igor – Journal of Educational Measurement, 2011
Research has often found that, when high school grades and SAT scores are used to predict first-year college grade-point average (FGPA) via regression analysis, African-American and Latino students are, on average, predicted to earn higher FGPAs than they actually do. Under various plausible models, this phenomenon can be explained in terms of…
Descriptors: Socioeconomic Status, Grades (Scholastic), Error of Measurement, White Students

Roussos, Louis A.; Templin, Jonathan L.; Henson, Robert A. – Journal of Educational Measurement, 2007
This article describes a latent trait approach to skills diagnosis based on a particular variety of latent class models that employ item response functions (IRFs) as in typical item response theory (IRT) models. To enable and encourage comparisons with other approaches, this description is provided in terms of the main components of any…
Descriptors: Validity, Identification, Psychometrics, Item Response Theory

Cronbach, Lee J. – Journal of Educational Measurement, 1976
The Petersen-Novick paper dealing with culture fair selection (TM 502 259) is the basis for this article. The author proposes a perspective in which ideas can be lined up for comparison and suggests solutions to the problems of selection in employment. (DEP)
Descriptors: Bias, Employment Opportunities, Matrices, Models

Schwartz, Steven A. – Journal of Educational Measurement, 1978
A method for the construction of scales which combines the rational (or intuitive) approach with an empirical (item analysis) approach is presented. A step-by-step procedure is provided. (Author/JKS)
Descriptors: Factor Analysis, Item Analysis, Measurement, Psychological Testing

Wise, Steven L.; DeMars, Christine E. – Journal of Educational Measurement, 2006
The validity of inferences based on achievement test scores is dependent on the amount of effort that examinees put forth while taking the test. With low-stakes tests, for which this problem is particularly prevalent, there is a consequent need for psychometric models that can take into account differing levels of examinee effort. This article…
Descriptors: Guessing (Tests), Psychometrics, Inferences, Reaction Time

Wang, Tianyou; Kolen, Michael J. – Journal of Educational Measurement, 2001
Reviews research literature on comparability issues in computerized adaptive testing (CAT) and synthesizes issues specific to comparability and test security. Develops a framework for evaluating comparability that contains three categories of criteria: (1) validity; (2) psychometric property/reliability; and (3) statistical assumption/test…
Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Criteria

Valencia, Richard R.; Rankin, Richard J. – Journal of Educational Measurement, 1986
Factor analyses of the Kaufman Assessment Battery for Children (K-ABC) were performed on separate groups of Anglo (n=100) and Mexican-American (n=100) fifth-grade children to determine the comparability of underlying structures and to examine the existence of possible bias in construct validity of the K-ABC for each group. (Author/LMO)
Descriptors: Achievement Tests, Cognitive Processes, Elementary Education, Factor Analysis

Embretson, Susan; And Others – Journal of Educational Measurement, 1986
This study examined the influence of processing strategies, and the metacomponents that determine when to apply them, on the construct validity of a verbal reasoning test. A rule-oriented strategy, an association strategy, and a partial rule strategy were examined. All three strategies contributed to individual differences in verbal reasoning.…
Descriptors: Cognitive Processes, Elementary Secondary Education, Error of Measurement, Latent Trait Theory

Wainer, Howard – Journal of Educational Measurement, 1993
Focusing on educational measurement that suggests an action and has an outcome, 16 problem areas are defined and grouped into the following classes: (1) validity; (2) issues of statistical adjustment; (3) data insufficiencies; (4) other issues related to standardized testing and constructed responses; and (5) technical issues of psychometrics.…
Descriptors: Comparative Analysis, Computer Uses in Education, Constructed Response, Educational Assessment