Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 5 |
Since 2006 (last 20 years) | 10 |
Descriptor
Classification | 12 |
Evaluation Methods | 12 |
Computation | 3 |
Models | 3 |
Statistical Analysis | 3 |
Test Items | 3 |
Comparative Analysis | 2 |
Computer Assisted Testing | 2 |
Data Analysis | 2 |
Discriminant Analysis | 2 |
Factor Analysis | 2 |
Source
Educational and Psychological Measurement | 12 |
Author
Andrich, David | 1 |
Batinic, Bernad | 1 |
Birenbaum, Menucha | 1 |
Choi, Namok | 1 |
Cousineau, Denis | 1 |
Fuqua, Dale R. | 1 |
Gnambs, Timo | 1 |
Gorham, Jerry | 1 |
Haynie, Kathleen | 1 |
Holden, Jocelyn E. | 1 |
Hong, Sehee | 1 |
Publication Type
Journal Articles | 12 |
Reports - Research | 10 |
Reports - Evaluative | 2 |
Location
Israel | 1 |
Assessments and Surveys
Bem Sex Role Inventory | 1 |
Jang, Yoona; Hong, Sehee – Educational and Psychological Measurement, 2023
The purpose of this study was to evaluate the degree of classification quality in the basic latent class model when covariates are either included or are not included in the model. To accomplish this task, Monte Carlo simulations were conducted in which the results of models with and without a covariate were compared. Based on these simulations,…
Descriptors: Classification, Models, Prediction, Sample Size
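The abstract does not say which classification-quality metric the simulations used; a common choice for latent class models is the entropy-based relative entropy index. The toy sketch below (Python, NumPy only, all values invented) computes that index from a matrix of posterior class-membership probabilities, illustrating how sharper posteriors translate into higher classification quality.

import numpy as np

def relative_entropy(posteriors):
    # posteriors: N x K matrix of posterior class-membership probabilities.
    # Returns the entropy-based quality index (1 = perfectly separated classes).
    p = np.clip(posteriors, 1e-12, 1.0)        # guard against log(0)
    n, k = p.shape
    return 1.0 - (-(p * np.log(p)).sum()) / (n * np.log(k))

rng = np.random.default_rng(0)
sharp = rng.dirichlet([0.2, 0.2], size=500)    # posteriors mostly near 0 or 1
flat = rng.dirichlet([2.0, 2.0], size=500)     # posteriors closer to 0.5
print(relative_entropy(sharp))                 # high classification quality
print(relative_entropy(flat))                  # noticeably lower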
Raykov, Tenko; Marcoulides, George A.; Li, Tenglong – Educational and Psychological Measurement, 2016
A method for evaluating the validity of multicomponent measurement instruments in heterogeneous populations is discussed. The procedure can be used for point and interval estimation of criterion validity of linear composites in populations representing mixtures of an unknown number of latent classes. The approach also permits the evaluation of…
Descriptors: Validity, Measures (Individuals), Classification, Evaluation Methods
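The latent-class machinery of the article is not reproduced here; as a simple point of reference, the criterion validity of a unit-weighted linear composite is its correlation with the criterion. The hypothetical sketch below estimates that correlation separately within two subpopulations whose membership is treated as known, which is roughly the quantity the mixture approach targets when class membership is unobserved (the data and correlations are invented).

import numpy as np

def composite_criterion_validity(items, criterion):
    # Correlation between a unit-weighted composite (row sum) and the criterion.
    return np.corrcoef(items.sum(axis=1), criterion)[0, 1]

rng = np.random.default_rng(1)
for label, rho in (("class A", 0.6), ("class B", 0.2)):
    items = rng.normal(size=(400, 5))
    noise = rng.normal(size=400)
    # Construct a criterion whose true correlation with the composite is about rho.
    criterion = rho * items.mean(axis=1) * np.sqrt(5) + np.sqrt(1 - rho**2) * noise
    print(label, round(composite_criterion_validity(items, criterion), 2))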
Yang, Yanyun; Xia, Yan – Educational and Psychological Measurement, 2019
When item scores are ordered categorical, categorical omega can be computed based on the parameter estimates from a factor analysis model using frequentist estimators such as diagonally weighted least squares. When the sample size is relatively small and thresholds are different across items, using diagonally weighted least squares can yield a…
Descriptors: Scores, Sample Size, Bayesian Statistics, Item Analysis
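The categorical omega discussed here (Green and Yang's formulation) additionally uses item thresholds and the normal CDF; the sketch below shows only the simpler continuous-case coefficient omega computed from factor loadings and residual variances, as a hedged illustration of the kind of model-based reliability that the DWLS and Bayesian estimates feed into (the loading values are made up).

import numpy as np

def coefficient_omega(loadings, residual_variances):
    # Continuous-case omega: (sum of loadings)^2 over total composite variance,
    # assuming a one-factor model with uncorrelated residuals.
    lam = np.asarray(loadings)
    theta = np.asarray(residual_variances)
    return lam.sum() ** 2 / (lam.sum() ** 2 + theta.sum())

# Hypothetical one-factor solution for six standardized items.
loadings = [0.7, 0.65, 0.8, 0.6, 0.75, 0.7]
residuals = [1 - l**2 for l in loadings]
print(round(coefficient_omega(loadings, residuals), 3))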
Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2017
Assessing global interrater agreement is difficult as most published indices are affected by the presence of mixtures of agreements and disagreements. A previously proposed method was shown to be specifically sensitive to global agreement, excluding mixtures, but also negatively biased. Here, we propose two alternatives in an attempt to find what…
Descriptors: Interrater Reliability, Evaluation Methods, Statistical Bias, Accuracy
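The two proposed indices are not spelled out in the abstract; as a baseline, the sketch below computes the plain exact-agreement proportion averaged over all rater pairs for a ratings matrix, which is exactly the kind of global index that mixtures of agreements and disagreements can distort (the ratings are invented).

import numpy as np
from itertools import combinations

def pairwise_exact_agreement(ratings):
    # ratings: items x raters matrix of categorical codes.
    # Returns the mean proportion of items on which each rater pair agrees exactly.
    r = np.asarray(ratings)
    pairs = list(combinations(range(r.shape[1]), 2))
    return np.mean([(r[:, i] == r[:, j]).mean() for i, j in pairs])

ratings = np.array([
    [1, 1, 1],   # full agreement
    [2, 2, 3],   # partial agreement
    [1, 2, 3],   # full disagreement
    [3, 3, 3],
])
print(round(pairwise_exact_agreement(ratings), 3))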
Lamprianou, Iasonas – Educational and Psychological Measurement, 2018
It is common practice for assessment programs to organize qualifying sessions during which the raters (often known as "markers" or "judges") demonstrate their consistency before operational rating commences. Because of the high-stakes nature of many rating activities, the research community tends to continuously explore new…
Descriptors: Social Networks, Network Analysis, Comparative Analysis, Innovation
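The abstract points to social network analysis of rater behavior without giving details; assuming pairwise agreement rates as edge weights, a minimal networkx sketch of such a rater network might look like the following (rater names and agreement values are invented, and this is not the article's actual procedure).

import networkx as nx

# Hypothetical pairwise agreement rates from a qualifying session.
agreements = {
    ("rater_A", "rater_B"): 0.92,
    ("rater_A", "rater_C"): 0.55,
    ("rater_B", "rater_C"): 0.60,
    ("rater_C", "rater_D"): 0.88,
}

G = nx.Graph()
for (u, v), w in agreements.items():
    G.add_edge(u, v, weight=w)

# Weighted degree flags raters who are least connected by agreement to the panel.
strength = dict(G.degree(weight="weight"))
print(min(strength, key=strength.get))   # the most "isolated" rater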
Andrich, David – Educational and Psychological Measurement, 2013
Assessments in response formats with ordered categories are ubiquitous in the social and health sciences. Although the assumption that the ordering of the categories is working as intended is central to any interpretation that arises from such assessments, testing that this assumption is valid is not standard in psychometrics. This is surprising…
Descriptors: Item Response Theory, Classification, Statistical Analysis, Models
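In the polytomous Rasch framework Andrich works in, the ordering assumption is typically examined through the estimated thresholds between adjacent categories. As a hedged illustration only, the sketch below checks whether a vector of estimated thresholds is monotonically increasing (the threshold values are invented).

import numpy as np

def thresholds_ordered(thresholds):
    # True if every threshold exceeds the one before it,
    # i.e. the ordered categories are working as intended.
    t = np.asarray(thresholds, dtype=float)
    return bool(np.all(np.diff(t) > 0))

print(thresholds_ordered([-1.4, -0.3, 0.8, 1.9]))   # ordered: True
print(thresholds_ordered([-1.4, 0.8, -0.3, 1.9]))   # disordered: False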
Jiao, Hong; Liu, Junhui; Haynie, Kathleen; Woo, Ada; Gorham, Jerry – Educational and Psychological Measurement, 2012
This study explored the impact of partial credit scoring of one type of innovative items (multiple-response items) in a computerized adaptive version of a large-scale licensure pretest and operational test settings. The impacts of partial credit scoring on the estimation of the ability parameters and classification decisions in operational test…
Descriptors: Test Items, Computer Assisted Testing, Measures (Individuals), Scoring
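The abstract does not specify the partial credit rule; one common convention awards credit for each correct option selected and deducts for each incorrect selection, floored at zero. The sketch below implements that rule as an assumption, not as the study's actual scoring scheme.

def partial_credit(selected, keyed):
    # selected, keyed: sets of option labels chosen by the examinee / keyed as correct.
    # One point per correct selection, minus one per incorrect selection, floored at 0,
    # then rescaled to the 0-1 range used by polytomous IRT models.
    raw = len(selected & keyed) - len(selected - keyed)
    return max(raw, 0) / len(keyed)

print(partial_credit({"A", "C"}, {"A", "C", "D"}))       # 2/3 credit
print(partial_credit({"A", "B", "C"}, {"A", "C", "D"}))  # one wrong pick: 1/3
print(partial_credit({"B", "E"}, {"A", "C", "D"}))       # 0.0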
Gnambs, Timo; Batinic, Bernad – Educational and Psychological Measurement, 2011
Computer-adaptive classification tests focus on classifying respondents in different proficiency groups (e.g., for pass/fail decisions). To date, adaptive classification testing has been dominated by research on dichotomous response formats and classifications in two groups. This article extends this line of research to polytomous classification…
Descriptors: Test Length, Computer Assisted Testing, Classification, Test Items
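Pass/fail adaptive classification testing is often driven by a sequential probability ratio test; the dichotomous Rasch-based sketch below shows that decision rule in its simplest form, and the polytomous extension discussed in the article would swap in the corresponding item likelihood (all numeric settings here are assumptions).

import math

def sprt_decision(responses, difficulties, theta_fail=-0.3, theta_pass=0.3,
                  alpha=0.05, beta=0.05):
    # Sequential probability ratio test for a pass/fail cut under a Rasch model.
    def p_correct(theta, b):
        return 1.0 / (1.0 + math.exp(-(theta - b)))
    log_lr = 0.0
    for x, b in zip(responses, difficulties):
        p_hi, p_lo = p_correct(theta_pass, b), p_correct(theta_fail, b)
        log_lr += math.log(p_hi / p_lo) if x == 1 else math.log((1 - p_hi) / (1 - p_lo))
    upper = math.log((1 - beta) / alpha)
    lower = math.log(beta / (1 - alpha))
    if log_lr >= upper:
        return "pass"
    if log_lr <= lower:
        return "fail"
    return "continue testing"

print(sprt_decision([1] * 12, [0.0] * 12))           # enough evidence: "pass"
print(sprt_decision([1, 0, 1, 0, 1, 0], [0.0] * 6))  # still ambiguous: "continue testing"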
Holden, Jocelyn E.; Kelley, Ken – Educational and Psychological Measurement, 2010
Classification procedures are common and useful in behavioral, educational, social, and managerial research. Supervised classification techniques such as discriminant function analysis assume training data are perfectly classified when estimating parameters or classifying. In contrast, unsupervised classification techniques such as finite mixture…
Descriptors: Discriminant Analysis, Classification, Computation, Behavioral Science Research
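The contrast the article builds on, supervised discriminant analysis trained on labeled data versus an unsupervised finite mixture fitted without labels, can be made concrete with scikit-learn; the sketch below fits both to the same simulated two-group data purely as an illustration of that distinction.

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
n = 200
X = np.vstack([rng.normal(0.0, 1.0, size=(n, 2)),      # group 0
               rng.normal(2.5, 1.0, size=(n, 2))])     # group 1
y = np.repeat([0, 1], n)

# Supervised: discriminant analysis uses the known labels to train.
lda = LinearDiscriminantAnalysis().fit(X, y)
print("LDA accuracy:", (lda.predict(X) == y).mean())

# Unsupervised: the finite mixture recovers groups without ever seeing labels.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
clusters = gmm.predict(X)
agreement = max((clusters == y).mean(), (clusters != y).mean())  # handle label switching
print("Mixture agreement with truth:", agreement)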
Choi, Namok; Fuqua, Dale R.; Newman, Jody L. – Educational and Psychological Measurement, 2008
Pedhazur and Tetenbaum speculated that factor structures from self-ratings of the Bem Sex-Role Inventory (BSRI) personality traits would be different from factor structures from desirability ratings of the same traits. To explore this hypothesis, both desirability ratings of BSRI traits (both "for a man" and "for a woman") and…
Descriptors: Personality Traits, Sex Role, Gender Discrimination, Self Evaluation (Individuals)
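The comparison rests on standard exploratory factor analysis of trait ratings; as a generic, hedged illustration (simulated data, not the BSRI), the sketch below extracts two factors from ten rating items with scikit-learn's FactorAnalysis.

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)
n = 300
# Two simulated latent dimensions, each driving five of the ten rating items.
f = rng.normal(size=(n, 2))
loadings = np.zeros((2, 10))
loadings[0, :5] = 0.8
loadings[1, 5:] = 0.8
ratings = f @ loadings + rng.normal(scale=0.6, size=(n, 10))

fa = FactorAnalysis(n_components=2).fit(ratings)
# Each row of components_ holds one factor's loadings across the ten items.
print(np.round(fa.components_, 2))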

Schriesheim, Chester A.; And Others – Educational and Psychological Measurement, 1989
Three studies explored the effects of grouped versus randomized questionnaire items on internal consistency and test-retest reliability, with samples of 80, 80, and 100 university students and undergraduates, respectively. The two correlational studies and the one experimental study were reasonably consistent in demonstrating that neither format was…
Descriptors: Classification, College Students, Evaluation Methods, Higher Education
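The two reliability criteria compared in these studies are standard quantities; the sketch below computes Cronbach's alpha for a single administration and a test-retest correlation of total scores on simulated item data, purely to make the comparison concrete (the data are not the authors').

import numpy as np

def cronbach_alpha(items):
    # items: persons x items matrix of scores.
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(4)
true_score = rng.normal(size=(100, 1))
time1 = true_score + rng.normal(scale=0.7, size=(100, 8))   # 8 items, occasion 1
time2 = true_score + rng.normal(scale=0.7, size=(100, 8))   # same persons, occasion 2

print("alpha:", round(cronbach_alpha(time1), 2))
print("test-retest r:", round(np.corrcoef(time1.sum(axis=1), time2.sum(axis=1))[0, 1], 2))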

Birenbaum, Menucha; And Others – Educational and Psychological Measurement, 1997
The agreement of diagnostic classifications from two parallel subtests assessing a mathematics skill with three levels of scoring was studied with 431 Arab Israeli 10th graders. Results indicate that, even when parallel form reliability is high, less agreement is apparent when performance is evaluated at the micro level. (SLD)
Descriptors: Arabs, Classification, Diagnostic Tests, Evaluation Methods
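Agreement between the diagnostic classifications produced by two parallel subtests is commonly summarized with raw percent agreement and Cohen's kappa; the sketch below computes both for invented classifications as a minimal illustration of the micro-level agreement the study examined.

from sklearn.metrics import cohen_kappa_score

# Hypothetical diagnostic categories assigned by two parallel subtests.
form_a = ["mastery", "partial", "non-mastery", "mastery", "partial", "partial",
          "non-mastery", "mastery", "partial", "mastery"]
form_b = ["mastery", "partial", "partial", "mastery", "non-mastery", "partial",
          "non-mastery", "partial", "partial", "mastery"]

exact = sum(a == b for a, b in zip(form_a, form_b)) / len(form_a)
print("percent agreement:", exact)
print("Cohen's kappa:", round(cohen_kappa_score(form_a, form_b), 2))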