Showing 1 to 15 of 20 results
Sooyong Lee; Suhwa Han; Seung W. Choi – Journal of Educational Measurement, 2024
Research has shown that multiple-indicator multiple-cause (MIMIC) models can result in inflated Type I error rates in detecting differential item functioning (DIF) when the assumption of equal latent variance is violated. This study explains how the violation of the equal variance assumption adversely impacts the detection of nonuniform DIF and…
Descriptors: Factor Analysis, Bayesian Statistics, Test Bias, Item Response Theory
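As background for the entry above: under a two-parameter logistic (2PL) model, nonuniform DIF means the item characteristic curves of the two groups differ in discrimination and therefore cross. The sketch below is a minimal, hypothetical illustration of that pattern in Python; the parameter values are invented, and it does not implement the MIMIC procedure studied in the paper.

```python
import numpy as np

def p_correct(theta, a, b):
    """2PL item response function: P(X=1 | theta) = 1 / (1 + exp(-a*(theta - b)))."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical item parameters: same difficulty, different discrimination
# across groups, which produces nonuniform DIF (the curves cross at theta = b).
a_ref, a_focal, b = 1.5, 0.8, 0.0

for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    p_r = p_correct(theta, a_ref, b)
    p_f = p_correct(theta, a_focal, b)
    print(f"theta={theta:+.1f}  P(ref)={p_r:.3f}  P(focal)={p_f:.3f}  diff={p_r - p_f:+.3f}")
```

The group difference changes sign across the ability range, which is the defining feature of nonuniform DIF.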
Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025
While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…
Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity
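For readers unfamiliar with equating, the snippet below sketches conventional linear equating of observed scores (matching means and standard deviations across forms). It is a generic, hypothetical illustration with made-up data; the equating method the paper proposes for rater-mediated assessments is not reproduced here.

```python
import numpy as np

def linear_equate(x, scores_x, scores_y):
    """Map a score x on form X to the form-Y scale by matching means and SDs:
    y = mu_Y + (sigma_Y / sigma_X) * (x - mu_X)."""
    mu_x, sd_x = np.mean(scores_x), np.std(scores_x, ddof=1)
    mu_y, sd_y = np.mean(scores_y), np.std(scores_y, ddof=1)
    return mu_y + (sd_y / sd_x) * (x - mu_x)

# Made-up score distributions for two test forms
rng = np.random.default_rng(0)
form_x = rng.normal(30, 6, size=500)   # hypothetical form X scores
form_y = rng.normal(32, 5, size=500)   # hypothetical form Y scores

print(round(linear_equate(36.0, form_x, form_y), 2))  # a form-X score of 36 on the Y scale
```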
Langenfeld, Thomas; Thomas, Jay; Zhu, Rongchun; Morris, Carrie A. – Journal of Educational Measurement, 2020
An assessment of graphic literacy was developed by articulating and subsequently validating a skills-based cognitive model intended to substantiate the plausibility of score interpretations. Model validation involved the use of multiple sources of evidence derived from large-scale field testing and cognitive lab studies. Data from large-scale field…
Descriptors: Evidence, Scores, Eye Movements, Psychometrics
Köhler, Carmen; Pohl, Steffi; Carstensen, Claus H. – Journal of Educational Measurement, 2017
Competence data from low-stakes educational large-scale assessment studies allow for evaluating relationships between competencies and other variables. The impact of item-level nonresponse has not been investigated with regard to statistics that determine the size of these relationships (e.g., correlations, regression coefficients). Classical…
Descriptors: Test Items, Cognitive Measurement, Testing Problems, Regression (Statistics)
Shin, Hyo Jeong; Wilson, Mark; Choi, In-Hee – Journal of Educational Measurement, 2017
This study proposes a structured constructs model (SCM) to examine measurement in the context of a multidimensional learning progression (LP). The LP is assumed to have features that go beyond a typical multidimensional IRT model, in that there are hypothesized to be certain cross-dimensional linkages that correspond to requirements between the…
Descriptors: Middle School Students, Student Evaluation, Measurement Techniques, Learning Processes
Chen, Jinsong; de la Torre, Jimmy; Zhang, Zao – Journal of Educational Measurement, 2013
As with any psychometric model, the validity of inferences from cognitive diagnosis models (CDMs) determines the extent to which these models can be useful. For inferences from CDMs to be valid, it is crucial that the fit of the model to the data is ascertained. Based on a simulation study, this study investigated the sensitivity of various fit…
Descriptors: Models, Psychometrics, Goodness of Fit, Statistical Analysis
Kahraman, Nilufer; Thompson, Tony – Journal of Educational Measurement, 2011
A practical concern for many existing tests is that subscore test lengths are too short to provide reliable and meaningful measurement. A possible method of improving the subscale reliability and validity would be to make use of collateral information provided by items from other subscales of the same test. To this end, the purpose of this article…
Descriptors: Test Length, Test Items, Alignment (Education), Models
de la Torre, Jimmy – Journal of Educational Measurement, 2008
Most model fit analyses in cognitive diagnosis assume that a Q matrix is correct after it has been constructed, without verifying its appropriateness. Consequently, any model misfit attributable to the Q matrix cannot be addressed and remedied. To address this concern, this paper proposes an empirically based method of validating a Q matrix used…
Descriptors: Matrices, Validity, Models, Evaluation Methods
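For context on the entry above: a Q matrix is a binary item-by-attribute matrix, with entry q(i, k) = 1 when item i requires attribute k. The sketch below shows a small, invented Q matrix and the ideal responses it implies under a conjunctive (DINA-type) rule; it is background illustration only, not the empirical validation method the paper proposes.

```python
import numpy as np

# Invented Q matrix: 4 items (rows) by 3 attributes (columns);
# Q[i, k] = 1 means item i requires attribute k.
Q = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
])

# One examinee's attribute-mastery pattern (also invented).
alpha = np.array([1, 1, 0])

# Conjunctive (DINA-type) ideal response: an item is answered correctly
# only if every attribute it requires is mastered.
eta = np.all(alpha >= Q, axis=1).astype(int)
print(eta)  # -> [1 1 1 0]: the last item needs attribute 3, which is not mastered
```

A misspecified Q matrix changes these ideal-response patterns, which is why model misfit traceable to the Q matrix matters for the inferences drawn from a CDM.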
Roussos, Louis A.; Templin, Jonathan L.; Henson, Robert A. – Journal of Educational Measurement, 2007
This article describes a latent trait approach to skills diagnosis based on a particular variety of latent class models that employ item response functions (IRFs) as in typical item response theory (IRT) models. To enable and encourage comparisons with other approaches, this description is provided in terms of the main components of any…
Descriptors: Validity, Identification, Psychometrics, Item Response Theory
Cronbach, Lee J. – Journal of Educational Measurement, 1976
The Petersen-Novick paper dealing with culture-fair selection (TM 502 259) is the basis for this article. The author proposes a perspective in which ideas can be lined up for comparison and suggests solutions to the problems of selection in employment. (DEP)
Descriptors: Bias, Employment Opportunities, Matrices, Models
Airasian, Peter W.; Bart, William M. – Journal of Educational Measurement, 1975
Validation studies of learning hierarchies usually examine whether task relationships posited a priori are confirmed by student learning data. This method was compared with a non-posited approach in which all possible task relationships were generated and investigated. A learning hierarchy in a seventh-grade mathematics study reported by…
Descriptors: Difficulty Level, Intellectual Development, Junior High Schools, Learning Theories
Williamson, David M.; Bejar, Isaac I.; Hone, Anne S. – Journal of Educational Measurement, 1999
Contrasts "mental models" used by automated scoring for the simulation division of the computerized Architect Registration Examination with those used by experienced human graders for 3,613 candidate solutions. Discusses differences in the models used and the potential of automated scoring to enhance the validity evidence of scores. (SLD)
Descriptors: Architects, Comparative Analysis, Computer Assisted Testing, Judges
Wise, Steven L.; DeMars, Christine E. – Journal of Educational Measurement, 2006
The validity of inferences based on achievement test scores is dependent on the amount of effort that examinees put forth while taking the test. With low-stakes tests, for which this problem is particularly prevalent, there is a consequent need for psychometric models that can take into account differing levels of examinee effort. This article…
Descriptors: Guessing (Tests), Psychometrics, Inferences, Reaction Time
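Effort-sensitive approaches of this kind typically use item response times to separate rapid guessing from solution behavior, treating flagged responses as random guessing rather than ability-driven responding. The snippet below is a minimal, hypothetical sketch of the flagging step with invented data and an assumed fixed threshold; it is not the authors' model, whose details the abstract does not give.

```python
import numpy as np

# Invented response times (seconds) for one examinee across 8 items.
response_times = np.array([14.2, 2.1, 19.8, 1.4, 22.5, 16.0, 2.8, 11.3])

# Assumed threshold separating rapid guessing from solution behavior.
# In practice a threshold would be set per item (e.g., from its response-time distribution).
THRESHOLD = 5.0

rapid_guess = response_times < THRESHOLD
response_time_effort = 1.0 - rapid_guess.mean()  # proportion of solution-behavior responses

print(rapid_guess)                      # per-item rapid-guessing flags
print(round(response_time_effort, 2))   # 0.62 for these invented data
```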
Linn, Robert L. – Journal of Educational Measurement, 1984
The common approach to studies of predictive bias is analyzed within the context of a conceptual model in which predictors and criterion measures are viewed as fallible indicators of idealized qualifications. (Author/PN)
Descriptors: Certification, Models, Predictive Measurement, Predictive Validity
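The "common approach" referred to above is usually a regression comparison: the criterion is regressed on the predictor within each group, and differences in slopes or intercepts are read as evidence of predictive bias. The sketch below illustrates that comparison with simulated data; it is generic background, not the conceptual model the article develops.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    slope, intercept = np.polyfit(x, y, 1)
    return intercept, slope

# Simulated predictor (e.g., test score) and criterion (e.g., later performance)
# for two groups sharing the same underlying regression, plus noise.
x1 = rng.normal(0, 1, 300)
y1 = 0.5 * x1 + rng.normal(0, 1, 300)
x2 = rng.normal(0, 1, 300)
y2 = 0.5 * x2 + rng.normal(0, 1, 300)

b0_1, b1_1 = fit_line(x1, y1)
b0_2, b1_2 = fit_line(x2, y2)
print(f"group 1: intercept={b0_1:.2f}, slope={b1_1:.2f}")
print(f"group 2: intercept={b0_2:.2f}, slope={b1_2:.2f}")
# Similar intercepts and slopes are usually taken as absence of predictive bias;
# the article's conceptual model treats both measures as fallible indicators,
# which complicates that interpretation.
```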
Wardrop, James L.; And Others – Journal of Educational Measurement, 1982
A structure for describing different approaches to testing is generated by identifying five dimensions along which tests differ: test uses, item generation, item revision, assessment of precision, and validation. These dimensions are used to profile tests of reading comprehension. Only norm-referenced achievement tests had an inference system…
Descriptors: Achievement Tests, Comparative Analysis, Educational Testing, Models