Publication Date
  In 2025: 0
  Since 2024: 1
  Since 2021 (last 5 years): 1
  Since 2016 (last 10 years): 4
  Since 2006 (last 20 years): 7
Descriptor
  Validity: 14
  Models: 12
  Psychometrics: 5
  Comparative Analysis: 3
  Item Response Theory: 3
  Mathematics: 3
  Statistical Analysis: 3
  Bayesian Statistics: 2
  Bias: 2
  Computer Assisted Testing: 2
  Data Analysis: 2
Source
  Journal of Educational Measurement: 14
Author
  de la Torre, Jimmy: 2
  Airasian, Peter W.: 1
  Bart, William M.: 1
  Bejar, Isaac I.: 1
  Carstensen, Claus H.: 1
  Chen, Jinsong: 1
  Choi, In-Hee: 1
  Cronbach, Lee J.: 1
  Gross, Alan L.: 1
  Hanna, Gila: 1
  Henson, Robert A.: 1
Publication Type
  Journal Articles: 11
  Reports - Research: 7
  Reports - Descriptive: 2
  Information Analyses: 1
  Reports - Evaluative: 1
Education Level
  Middle Schools: 2
  Junior High Schools: 1
  Secondary Education: 1
Audience
  Researchers: 1

Sooyong Lee; Suhwa Han; Seung W. Choi – Journal of Educational Measurement, 2024
Research has shown that multiple-indicator multiple-cause (MIMIC) models can result in inflated Type I error rates in detecting differential item functioning (DIF) when the assumption of equal latent variance is violated. This study explains how the violation of the equal variance assumption adversely impacts the detection of nonuniform DIF and…
Descriptors: Factor Analysis, Bayesian Statistics, Test Bias, Item Response Theory
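For orientation, the MIMIC DIF setup referenced in this abstract can be sketched in generic notation (the interaction parameterization for nonuniform DIF shown here is one common choice, not necessarily the article's own):

\[ y_i^* = \lambda_i \eta + \beta_i z + \omega_i (\eta \cdot z) + \varepsilon_i, \qquad \eta = \gamma z + \zeta, \]

where \(z\) codes group membership, \(\beta_i \neq 0\) indicates uniform DIF, \(\omega_i \neq 0\) indicates nonuniform DIF, and \(\gamma\) carries group impact on the latent trait. The usual tests of \(\beta_i = 0\) and \(\omega_i = 0\) are derived under \(\mathrm{Var}(\zeta \mid z = 0) = \mathrm{Var}(\zeta \mid z = 1)\); the study concerns how Type I error behaves when that equal-variance assumption is violated.
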
Langenfeld, Thomas; Thomas, Jay; Zhu, Rongchun; Morris, Carrie A. – Journal of Educational Measurement, 2020
An assessment of graphic literacy was developed by articulating and subsequently validating a skills-based cognitive model intended to substantiate the plausibility of score interpretations. Model validation involved use of multiple sources of evidence derived from large-scale field testing and cognitive lab studies. Data from large-scale field…
Descriptors: Evidence, Scores, Eye Movements, Psychometrics
Köhler, Carmen; Pohl, Steffi; Carstensen, Claus H. – Journal of Educational Measurement, 2017
Competence data from low-stakes educational large-scale assessment studies allow for evaluating relationships between competencies and other variables. The impact of item-level nonresponse has not been investigated with regard to statistics that determine the size of these relationships (e.g., correlations, regression coefficients). Classical…
Descriptors: Test Items, Cognitive Measurement, Testing Problems, Regression (Statistics)
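To make the concern concrete, the following toy simulation sketch compares how two common treatments of omitted items shift an estimated correlation with an external variable. Every modeling choice below (Rasch-like responses, a missingness mechanism tied to competence, the two scoring rules) is an illustrative assumption, not the authors' design.

# Toy sketch: effect of item-level nonresponse treatment on a correlation estimate.
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items = 2000, 20

theta = rng.normal(size=n_persons)                              # latent competence
covariate = 0.6 * theta + rng.normal(scale=0.8, size=n_persons) # external variable

b = rng.normal(size=n_items)                                    # item difficulties
p_correct = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
resp = (rng.uniform(size=p_correct.shape) < p_correct).astype(float)

# Omission is more likely for low-competence persons (illustrative mechanism).
p_miss = np.clip(1.0 / (1.0 + np.exp(1.5 + theta)), 0.0, 0.5)
miss = rng.uniform(size=resp.shape) < p_miss[:, None]
resp_obs = resp.copy()
resp_obs[miss] = np.nan

prop_ignore = np.nanmean(resp_obs, axis=1)                      # omitted items ignored
prop_wrong = np.nan_to_num(resp_obs).sum(axis=1) / n_items      # omitted scored as wrong

ok = ~np.isnan(prop_ignore)   # guard against persons with no observed items
print("r (omissions ignored):     ", np.corrcoef(prop_ignore[ok], covariate[ok])[0, 1])
print("r (omissions scored wrong):", np.corrcoef(prop_wrong[ok], covariate[ok])[0, 1])
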
Shin, Hyo Jeong; Wilson, Mark; Choi, In-Hee – Journal of Educational Measurement, 2017
This study proposes a structured constructs model (SCM) to examine measurement in the context of a multidimensional learning progression (LP). The LP is assumed to have features that go beyond a typical multidimensional IRT model, in that there are hypothesized to be certain cross-dimensional linkages that correspond to requirements between the…
Descriptors: Middle School Students, Student Evaluation, Measurement Techniques, Learning Processes
Chen, Jinsong; de la Torre, Jimmy; Zhang, Zao – Journal of Educational Measurement, 2013
As with any psychometric model, the validity of inferences from cognitive diagnosis models (CDMs) determines the extent to which these models can be useful. For inferences from CDMs to be valid, it is crucial that the fit of the model to the data is ascertained. Based on a simulation study, this study investigated the sensitivity of various fit…
Descriptors: Models, Psychometrics, Goodness of Fit, Statistical Analysis
de la Torre, Jimmy – Journal of Educational Measurement, 2008
Most model fit analyses in cognitive diagnosis assume that a Q matrix is correct after it has been constructed, without verifying its appropriateness. Consequently, any model misfit attributable to the Q matrix cannot be addressed and remedied. To address this concern, this paper proposes an empirically based method of validating a Q matrix used…
Descriptors: Matrices, Validity, Models, Evaluation Methods
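As background for what a Q matrix encodes, here is a small illustrative sketch; the items, attributes, and matrix entries are invented, and this is not the empirical validation method the paper proposes.

# Illustrative Q matrix and DINA-style ideal responses (invented example).
import numpy as np

# Q matrix: rows = items, columns = attributes; 1 means the attribute is required.
Q = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
])

# Attribute mastery profiles for three hypothetical examinees.
alpha = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
])

# DINA ideal response: 1 iff the examinee masters every attribute the item requires,
# i.e. eta_ij = prod_k alpha_ik ** q_jk.
eta = np.all(alpha[:, None, :] >= Q[None, :, :], axis=2).astype(int)
print(eta)   # rows = examinees, columns = items

A misspecified entry in Q changes which examinees are expected to succeed on an item, and it is this kind of discrepancy between expected and observed responses that empirically based Q-matrix validation seeks to detect.
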
Roussos, Louis A.; Templin, Jonathan L.; Henson, Robert A. – Journal of Educational Measurement, 2007
This article describes a latent trait approach to skills diagnosis based on a particular variety of latent class models that employ item response functions (IRFs) as in typical item response theory (IRT) models. To enable and encourage comparisons with other approaches, this description is provided in terms of the main components of any…
Descriptors: Validity, Identification, Psychometrics, Item Response Theory

Novick, Melvin R.; Lindley, Dennis V. – Journal of Educational Measurement, 1978
The use of some very simple loss or utility functions in educational evaluation has recently been advocated by Gross and Su, Petersen and Novick, and Petersen. This paper demonstrates that more realistic utility functions can easily be used and may be preferable in some applications. (Author/CTM)
Descriptors: Bayesian Statistics, Cost Effectiveness, Mathematical Models, Statistical Analysis
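As background, the decision-theoretic setup such papers work within can be sketched in generic notation (standard Bayesian decision theory, not the authors' specific utility functions): a decision \(d\) (e.g., pass/fail) is chosen to maximize posterior expected utility,

\[ \bar{U}(d \mid x) = \int u(d, \theta)\, p(\theta \mid x)\, d\theta , \]

where \(p(\theta \mid x)\) is the posterior distribution of the examinee's ability given the observed score \(x\). A simple threshold utility assigns a constant gain when the decision agrees with whether \(\theta\) exceeds a cut score \(\theta_0\) and a constant loss otherwise; "more realistic" alternatives let \(u(d, \theta)\) vary smoothly with \(\theta\) (for instance, linearly or through an ogive near \(\theta_0\)), so that near-cut cases are penalized less than gross misclassifications.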

Cronbach, Lee J. – Journal of Educational Measurement, 1976
The Petersen-Novick paper dealing with culture fair selection (TM 502 259) is the basis for this article. The author proposes a perspective in which ideas can be lined up for comparison and suggests solutions to the problems of selection in employment. (DEP)
Descriptors: Bias, Employment Opportunities, Matrices, Models

Airasian, Peter W.; Bart, William M. – Journal of Educational Measurement, 1975
Validation studies of learning hierarchies usually examine whether task relationships posited a priori are confirmed by student learning data. This method was compared with a non-posited approach in which all possible task relationships were generated and investigated. A learning hierarchy in a seventh grade mathematics study reported by…
Descriptors: Difficulty Level, Intellectual Development, Junior High Schools, Learning Theories

Williamson, David M.; Bejar, Isaac I.; Hone, Anne S. – Journal of Educational Measurement, 1999
Contrasts "mental models" used by automated scoring for the simulation division of the computerized Architect Registration Examination with those used by experienced human graders for 3,613 candidate solutions. Discusses differences in the models used and the potential of automated scoring to enhance the validity evidence of scores. (SLD)
Descriptors: Architects, Comparative Analysis, Computer Assisted Testing, Judges

Gross, Alan L.; Shulman, Vivian – Journal of Educational Measurement, 1980
The suitability of the beta binomial test model for criterion referenced testing was investigated, first by considering whether underlying assumptions are realistic, and second, by examining the robustness of the model. Results suggest that the model may have practical value. (Author/RD)
Descriptors: Criterion Referenced Tests, Goodness of Fit, Higher Education, Item Sampling
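One common way the beta-binomial model is applied to criterion-referenced (mastery) decisions can be sketched as follows; the prior parameters, test length, and cut score below are invented for illustration, and this is not the robustness analysis the article reports.

# Beta-binomial sketch for a criterion-referenced test (illustrative numbers only).
from scipy.stats import beta, betabinom

a, b = 4.0, 2.0   # Beta prior on an examinee's true proportion-correct
n, x = 20, 16     # test length and observed number-correct score
cut = 0.70        # mastery standard on the true-score scale

# Marginal (predictive) probability of the observed score under the model.
print("P(X = 16):", betabinom.pmf(x, n, a, b))

# By conjugacy the posterior true score is Beta(a + x, b + n - x); mastery probability:
posterior = beta(a + x, b + n - x)
print("P(true score > cut | data):", posterior.sf(cut))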

Hanna, Gila – Journal of Educational Measurement, 1984
The validity of a comparison of mean test scores for two groups and of a longitudinal comparison of means within each group is assessed. Using LISREL, factor analyses are used to test the hypotheses of similar factor patterns, equal units of measurement, and equal measurement accuracy between groups and across time. (Author/DWH)
Descriptors: Achievement Tests, Comparative Analysis, Data Analysis, Factor Analysis
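The hypotheses described correspond to standard multi-group (or multi-occasion) factor-analytic invariance constraints, which can be sketched in generic notation (not necessarily the article's own):

\[ x_g = \Lambda_g \xi_g + \delta_g , \qquad g = 1, 2, \]

with, in increasing order of restrictiveness, (1) the same configuration of fixed and free loadings in \(\Lambda_1\) and \(\Lambda_2\) (similar factor patterns), (2) \(\Lambda_1 = \Lambda_2\) (equal units of measurement), and (3) \(\Theta_{\delta_1} = \Theta_{\delta_2}\) (equal measurement error variances, i.e., equal measurement accuracy). Only when such constraints are tenable can group or over-time differences in means be interpreted as differences on the same scale.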

Wang, Tianyou; Kolen, Michael J. – Journal of Educational Measurement, 2001
Reviews research literature on comparability issues in computerized adaptive testing (CAT) and synthesizes issues specific to comparability and test security. Develops a framework for evaluating comparability that contains three categories of criteria: (1) validity; (2) psychometric property/reliability; and (3) statistical assumption/test…
Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Criteria