Showing all 12 results
Peer reviewed
Jang, Yoona; Hong, Sehee – Educational and Psychological Measurement, 2023
The purpose of this study was to evaluate the degree of classification quality in the basic latent class model when covariates are either included or are not included in the model. To accomplish this task, Monte Carlo simulations were conducted in which the results of models with and without a covariate were compared. Based on these simulations,…
Descriptors: Classification, Models, Prediction, Sample Size
Peer reviewed
Raykov, Tenko; Marcoulides, George A.; Li, Tenglong – Educational and Psychological Measurement, 2016
A method for evaluating the validity of multicomponent measurement instruments in heterogeneous populations is discussed. The procedure can be used for point and interval estimation of criterion validity of linear composites in populations representing mixtures of an unknown number of latent classes. The approach permits also the evaluation of…
Descriptors: Validity, Measures (Individuals), Classification, Evaluation Methods
Peer reviewed
Yang, Yanyun; Xia, Yan – Educational and Psychological Measurement, 2019
When item scores are ordered categorical, categorical omega can be computed based on the parameter estimates from a factor analysis model using frequentist estimators such as diagonally weighted least squares. When the sample size is relatively small and thresholds are different across items, using diagonally weighted least squares can yield a…
Descriptors: Scores, Sample Size, Bayesian Statistics, Item Analysis
Peer reviewed
Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2017
Assessing global interrater agreement is difficult as most published indices are affected by the presence of mixtures of agreements and disagreements. A previously proposed method was shown to be specifically sensitive to global agreement, excluding mixtures, but also negatively biased. Here, we propose two alternatives in an attempt to find what…
Descriptors: Interrater Reliability, Evaluation Methods, Statistical Bias, Accuracy
Peer reviewed
Lamprianou, Iasonas – Educational and Psychological Measurement, 2018
It is common practice for assessment programs to organize qualifying sessions during which the raters (often known as "markers" or "judges") demonstrate their consistency before operational rating commences. Because of the high-stakes nature of many rating activities, the research community tends to continuously explore new…
Descriptors: Social Networks, Network Analysis, Comparative Analysis, Innovation
Peer reviewed
Andrich, David – Educational and Psychological Measurement, 2013
Assessments in response formats with ordered categories are ubiquitous in the social and health sciences. Although the assumption that the ordering of the categories is working as intended is central to any interpretation that arises from such assessments, testing that this assumption is valid is not standard in psychometrics. This is surprising…
Descriptors: Item Response Theory, Classification, Statistical Analysis, Models
Peer reviewed
Jiao, Hong; Liu, Junhui; Haynie, Kathleen; Woo, Ada; Gorham, Jerry – Educational and Psychological Measurement, 2012
This study explored the impact of partial credit scoring of one type of innovative items (multiple-response items) in a computerized adaptive version of a large-scale licensure pretest and operational test settings. The impacts of partial credit scoring on the estimation of the ability parameters and classification decisions in operational test…
Descriptors: Test Items, Computer Assisted Testing, Measures (Individuals), Scoring
Peer reviewed
Gnambs, Timo; Batinic, Bernad – Educational and Psychological Measurement, 2011
Computer-adaptive classification tests focus on classifying respondents in different proficiency groups (e.g., for pass/fail decisions). To date, adaptive classification testing has been dominated by research on dichotomous response formats and classifications in two groups. This article extends this line of research to polytomous classification…
Descriptors: Test Length, Computer Assisted Testing, Classification, Test Items
Peer reviewed
Holden, Jocelyn E.; Kelley, Ken – Educational and Psychological Measurement, 2010
Classification procedures are common and useful in behavioral, educational, social, and managerial research. Supervised classification techniques such as discriminant function analysis assume training data are perfectly classified when estimating parameters or classifying. In contrast, unsupervised classification techniques such as finite mixture…
Descriptors: Discriminant Analysis, Classification, Computation, Behavioral Science Research
Peer reviewed
Choi, Namok; Fuqua, Dale R.; Newman, Jody L. – Educational and Psychological Measurement, 2008
Pedhazur and Tetenbaum speculated that factor structures from self-ratings of the Bem Sex-Role Inventory (BSRI) personality traits would be different from factor structures from desirability ratings of the same traits. To explore this hypothesis, both desirability ratings of BSRI traits (both "for a man" and "for a woman") and…
Descriptors: Personality Traits, Sex Role, Gender Discrimination, Self Evaluation (Individuals)
Peer reviewed
Schriesheim, Chester A.; And Others – Educational and Psychological Measurement, 1989
Three studies explored the effects of grouping versus randomized items in questionnaires on internal consistency and test-retest reliability with samples of 80, 80, and 100, respectively, university students and undergraduates. The 2 correlational and 1 experimental studies were reasonably consistent in demonstrating that neither format was…
Descriptors: Classification, College Students, Evaluation Methods, Higher Education
Peer reviewed
Birenbaum, Menucha; And Others – Educational and Psychological Measurement, 1997
The agreement of diagnostic classifications from two parallel subtests assessing a mathematics skill with three levels of scoring was studied with 431 Arab Israeli 10th graders. Results indicate that, even when parallel form reliability is high, less agreement is apparent when performance is evaluated at the micro level. (SLD)
Descriptors: Arabs, Classification, Diagnostic Tests, Evaluation Methods