Ramsey Lee Cardwell – ProQuest LLC, 2022
The emergence of digital-first assessments is prompting reconsideration of, and innovation in, aspects of psychometrics, test validation, and test use. Using the Duolingo English Test (DET) as an example, this three-paper series seeks to address issues concerning the estimation of classification consistency and the reporting of results for such…
Descriptors: Classification, Reliability, Language Proficiency, Computer Assisted Testing
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
Madison, Matthew J.; Bradshaw, Laine P. – Educational and Psychological Measurement, 2015
Diagnostic classification models are psychometric models that aim to classify examinees according to their mastery or non-mastery of specified latent characteristics. These models are well-suited for providing diagnostic feedback on educational assessments because of their practical efficiency and increased reliability when compared with other…
Descriptors: Classification, Accuracy, Models, Psychometrics
Mirmoghtadaie, Zohresadat; Keshavarz, Mohsen; Mohammadimehr, Mojgan; Rasouli, Davood – International Review of Research in Open and Distributed Learning, 2023
In peer observation of teaching, an experienced colleague in the educational environment of a faculty member observes the educational performance of that faculty member and provides appropriate feedback. The use of peer review as an alternative source of evidence of teaching effectiveness is increasing. However, no research has been done in the…
Descriptors: Learning Management Systems, Academic Achievement, Peer Evaluation, Teacher Evaluation
Chen, Fu; Zhang, Shanshan; Guo, Yanfang; Xin, Tao – Research in Science Education, 2017
We used the Rule Space Model, a cognitive diagnostic model, to measure the learning progression for thermochemistry for senior high school students. We extracted five attributes and proposed their hierarchical relationships to model the construct of thermochemistry at four levels using a hypothesized learning progression. For this study, we…
Descriptors: Chemistry, High School Students, Secondary School Science, Correlation
Lee, Jihyun; Paek, Insu – Journal of Psychoeducational Assessment, 2014
Likert-type rating scales are still the most widely used method when measuring psychoeducational constructs. The present study investigates a long-standing issue of identifying the optimal number of response categories. A special emphasis is given to categorical data, which were generated by the Item Response Theory (IRT) Graded-Response Modeling…
Descriptors: Likert Scales, Responses, Item Response Theory, Classification
Pawlak, Miroslaw – Studies in Second Language Learning and Teaching, 2018
Despite all the progress that has been made in research on language learning strategies since the publication of Rubin's (1975) seminal paper on good language learners, there are areas that have been neglected by strategy experts. Perhaps the most blatant manifestation of this neglect is the paucity of research into grammar learning strategies…
Descriptors: Grammar, Learning Strategies, Teaching Methods, Second Language Learning
Kim, Sooyeon; Moses, Tim – International Journal of Testing, 2013
The major purpose of this study is to assess the conditions under which single scoring for constructed-response (CR) items is as effective as double scoring in the licensure testing context. We used both empirical datasets of five mixed-format licensure tests collected in actual operational settings and simulated datasets that allowed for the…
Descriptors: Scoring, Test Format, Licensing Examinations (Professions), Test Items
Li, Li; Titsworth, Scott – American Journal of Distance Education, 2015
The current program of research included two studies that developed the Student Online Misbehaviors (SOMs) scale and explored relationships between the SOMs and various classroom communication processes and outcomes. The first study inductively developed initial SOM typologies and tested factor structure via an exploratory factor analysis.…
Descriptors: Student Behavior, Behavior Problems, Online Courses, Electronic Learning
Papageorgiou, Spiros; Morgan, Rick; Becker, Valerie – International Journal of Testing, 2015
The purpose of this study was to enhance the meaning of the scores of an English-language test by developing performance levels and descriptors for reporting overall test performance. The levels and descriptors were intended to accompany the total scale scores of TOEFL Junior® Standard, an international test of English as a second/foreign…
Descriptors: Language Proficiency, Language Tests, English (Second Language), Second Language Learning
Iacobucci, Dawn – Journal of Marketing Education, 2013
This research investigates the reliability and validity of three major publications' rankings of MBA programs. Each set of rankings showed reasonable consistency over time, both at the level of the overall rankings and for most of the facets from which the rankings are derived. Each set of rankings also showed some levels of convergent and…
Descriptors: Psychometrics, Business Administration Education, Reliability, Validity
Matson, Johnny L.; Mahan, Sara; Hess, Julie A.; Fodstad, Jill C.; Neal, Daniene – Research in Autism Spectrum Disorders, 2010
Previous studies analyzed the reliability as well as sensitivity and specificity of the Autism Spectrum Disorder-Diagnostic for Children (ASD-DC). This study further examines the psychometric properties of the ASD-DC by assessing whether the ASD-DC has convergent validity against a psychometrically sound observational instrument for Autistic…
Descriptors: Verbal Communication, Nonverbal Communication, Autism, Validity
Barone, Lavinia; Del Giudice, Marco; Fossati, Andrea; Manaresi, Francesca; Perinetti, Barbara Actis; Colle, Livia; Veglia, Fabio – International Journal of Behavioral Development, 2009
The paper describes a multicentre study of the psychometric properties of the Manchester Child Attachment Story Task in a sample of 230 Italian children aged 4 to 8 years. The task's internal consistency and inter-rater reliability were investigated; in addition, multiple discriminant analysis was used to explore the contribution of individual…
Descriptors: Measures (Individuals), Young Children, Attachment Behavior, Reliability
Bramley, Tom – Educational Research, 2010
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
Descriptors: National Curriculum, Educational Research, Testing, Measurement
Vigneau, Francois; Bors, Douglas A. – Intelligence, 2008
Various taxonomies of Raven's Advanced Progressive Matrices (APM) items have been proposed in the literature to account for performance on the test. In the present article, three such taxonomies based on information processing, namely Carpenter, Just and Shell's [Carpenter, P.A., Just, M.A., & Shell, P., (1990). What one intelligence test…
Descriptors: Intelligence, Intelligence Tests, Factor Analysis, Classification