ERIC - Search Results

Showing 1 to 15 of 22 results

Classification Consistency and Results Reporting of a Digital-First Computer-Adaptive Language Proficiency Test

Ramsey Lee Cardwell – ProQuest LLC, 2022

The emergence of digital-first assessments is prompting reconsideration of, and innovation in, aspects of psychometrics, test validation, and test use. Using the Duolingo English Test (DET) as an example, this three-paper series seeks to address issues concerning the estimation of classification consistency and the reporting of results for such…

Descriptors: Classification, Reliability, Language Proficiency, Computer Assisted Testing

IRT Approaches to Modeling Scores on Mixed-Format Tests

Peer reviewed

Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020

This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…

Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests

The Effects of Q-Matrix Design on Classification Accuracy in the Log-Linear Cognitive Diagnosis Model

Peer reviewed

Madison, Matthew J.; Bradshaw, Laine P. – Educational and Psychological Measurement, 2015

Diagnostic classification models are psychometric models that aim to classify examinees according to their mastery or non-mastery of specified latent characteristics. These models are well-suited for providing diagnostic feedback on educational assessments because of their practical efficiency and increased reliability when compared with other…

Descriptors: Classification, Accuracy, Models, Psychometrics

The Design and Psychometric Properties of a Peer Observation Tool for Use in LMS-Based Classrooms in Medical Sciences

Peer reviewed

Mirmoghtadaie, Zohresadat; Keshavarz, Mohsen; Mohammadimehr, Mojgan; Rasouli, Davood – International Review of Research in Open and Distributed Learning, 2023

In peer observation of teaching, an experienced colleague in the educational environment of a faculty member observes the educational performance of that faculty member and provides appropriate feedback. The use of peer review as an alternative source of evidence of teaching effectiveness is increasing. However, no research has been done in the…

Descriptors: Learning Management Systems, Academic Achievement, Peer Evaluation, Teacher Evaluation

Applying the Rule Space Model to Develop a Learning Progression for Thermochemistry

Peer reviewed

Chen, Fu; Zhang, Shanshan; Guo, Yanfang; Xin, Tao – Research in Science Education, 2017

We used the Rule Space Model, a cognitive diagnostic model, to measure the learning progression for thermochemistry for senior high school students. We extracted five attributes and proposed their hierarchical relationships to model the construct of thermochemistry at four levels using a hypothesized learning progression. For this study, we…

Descriptors: Chemistry, High School Students, Secondary School Science, Correlation

In Search of the Optimal Number of Response Categories in a Rating Scale

Peer reviewed

Lee, Jihyun; Paek, Insu – Journal of Psychoeducational Assessment, 2014

Likert-type rating scales are still the most widely used method when measuring psychoeducational constructs. The present study investigates a long-standing issue of identifying the optimal number of response categories. A special emphasis is given to categorical data, which were generated by the Item Response Theory (IRT) Graded-Response Modeling…

Descriptors: Likert Scales, Responses, Item Response Theory, Classification

"Grammar Learning Strategy Inventory" (GLSI): Another Look

Peer reviewed

Pawlak, Miroslaw – Studies in Second Language Learning and Teaching, 2018

Despite all the progress that has been made in research on language learning strategies since the publication of Rubin's (1975) seminal paper on good language learners, there are areas that have been neglected by strategy experts. Perhaps the most blatant manifestation of this neglect is the paucity of research into grammar learning strategies…

Descriptors: Grammar, Learning Strategies, Teaching Methods, Second Language Learning

Determining When Single Scoring for Constructed-Response Items Is as Effective as Double Scoring in Mixed-Format Licensure Tests

Peer reviewed

Kim, Sooyeon; Moses, Tim – International Journal of Testing, 2013

The major purpose of this study is to assess the conditions under which single scoring for constructed-response (CR) items is as effective as double scoring in the licensure testing context. We used both empirical datasets of five mixed-format licensure tests collected in actual operational settings and simulated datasets that allowed for the…

Descriptors: Scoring, Test Format, Licensing Examinations (Professions), Test Items

Student Misbehaviors in Online Classrooms: Scale Development and Validation

Peer reviewed

Li, Li; Titsworth, Scott – American Journal of Distance Education, 2015

The current program of research included two studies that developed the Student Online Misbehaviors (SOMs) scale and explored relationships between the SOMs and various classroom communication processes and outcomes. The first study inductively developed initial SOM typologies and tested factor structure via an exploratory factor analysis.…

Descriptors: Student Behavior, Behavior Problems, Online Courses, Electronic Learning

Enhancing the Interpretability of the Overall Results of an International Test of English-Language Proficiency

Peer reviewed

Papageorgiou, Spiros; Morgan, Rick; Becker, Valerie – International Journal of Testing, 2015

The purpose of this study was to enhance the meaning of the scores of an English-language test by developing performance levels and descriptors for reporting overall test performance. The levels and descriptors were intended to accompany the total scale scores of TOEFL Junior® Standard, an international test of English as a second/foreign…

Descriptors: Language Proficiency, Language Tests, English (Second Language), Second Language Learning

A Psychometric Assessment of the "Businessweek," "U.S. News & World Report," and "Financial Times" Rankings of Business Schools' MBA Programs

Peer reviewed

Iacobucci, Dawn – Journal of Marketing Education, 2013

This research investigates the reliability and validity of three major publications' rankings of MBA programs. Each set of rankings showed reasonable consistency over time, both at the level of the overall rankings and for most of the facets from which the rankings are derived. Each set of rankings also showed some levels of convergent and…

Descriptors: Psychometrics, Business Administration Education, Reliability, Validity

Convergent Validity of the Autism Spectrum Disorder-Diagnostic for Children (ASD-DC) and Childhood Autism Rating Scales (CARS)

Peer reviewed

Matson, Johnny L.; Mahan, Sara; Hess, Julie A.; Fodstad, Jill C.; Neal, Daniene – Research in Autism Spectrum Disorders, 2010

Previous studies analyzed the reliability as well as sensitivity and specificity of the Autism Spectrum Disorder-Diagnostic for Children (ASD-DC). This study further examines the psychometric properties of the ASD-DC by assessing whether the ASD-DC has convergent validity against a psychometrically sound observational instrument for Autistic…

Descriptors: Verbal Communication, Nonverbal Communication, Autism, Validity

Psychometric Properties of the Manchester Child Attachment Story Task: An Italian Multicentre Study

Peer reviewed

Barone, Lavinia; Del Giudice, Marco; Fossati, Andrea; Manaresi, Francesca; Perinetti, Barbara Actis; Colle, Livia; Veglia, Fabio – International Journal of Behavioral Development, 2009

The paper describes a multicentre study of the psychometric properties of the Manchester Child Attachment Story Task in a sample of 230 Italian children aged 4 to 8 years. The task's internal consistency and inter-rater reliability were investigated; in addition, multiple discriminant analysis was used to explore the contribution of individual…

Descriptors: Measures (Individuals), Young Children, Attachment Behavior, Reliability

A Response to an Article Published in "Educational Research"'s Special Issue on Assessment (June 2009). What Can Be Inferred about Classification Accuracy from Classification Consistency?

Peer reviewed

Bramley, Tom – Educational Research, 2010

Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…

Descriptors: National Curriculum, Educational Research, Testing, Measurement

The Quest for Item Types Based on Information Processing: An Analysis of Raven's Advanced Progressive Matrices, with a Consideration of Gender Differences

Peer reviewed

Vigneau, Francois; Bors, Douglas A. – Intelligence, 2008

Various taxonomies of Raven's Advanced Progressive Matrices (APM) items have been proposed in the literature to account for performance on the test. In the present article, three such taxonomies based on information processing, namely Carpenter, Just and Shell's [Carpenter, P.A., Just, M.A., & Shell, P., (1990). What one intelligence test…

Descriptors: Intelligence, Intelligence Tests, Factor Analysis, Classification
