ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	8
Since 2006 (last 20 years)	18

Descriptor

Classification	19
Foreign Countries	7
Models	7
Reliability	6
Test Items	6
Psychometrics	5
Scores	5
Mathematics Tests	4
Statistical Analysis	4
Test Construction	4
Accuracy	3
Computer Software	3
Correlation	3
Goodness of Fit	3
Measurement	3
Regression (Statistics)	3
Student Characteristics	3
Test Bias	3
Computer Assisted Testing	2
Construct Validity	2
Cross Cultural Studies	2
Difficulty Level	2
Educational Assessment	2
Elementary School Students	2
Equated Scores	2

Source

International Journal of…

Publication Type

Journal Articles	19
Reports - Research	9
Reports - Evaluative	5
Reports - Descriptive	4
Guides - Non-Classroom	1
Information Analyses	1

Education Level

Elementary Education	2
Grade 4	2
Higher Education	2
Postsecondary Education	2
Elementary Secondary Education	1
Grade 12	1
Grade 8	1
High Schools	1
Intermediate Grades	1
Secondary Education	1

Audience

Practitioners	1
Researchers	1

Location

United States	2
Australia	1
Belgium	1
Canada	1
China	1
France	1
Germany	1
Iowa	1
Israel	1
Malawi	1
Philippines	1
Taiwan	1
Tunisia	1
Zimbabwe	1

Assessments and Surveys

Progress in International…	1
Test of English as a Foreign…	1

Showing 1 to 15 of 19 results

Identification and Validation of Severity Standards for the Academic Anxiety Scale

Peer reviewed

Finch, W. Holmes; Cassady, Jerrell C.; Helsper, C. Addison – International Journal of Testing, 2024

The Academic Anxiety Scale (AAS; Cassady, 2022; Cassady et al., 2019) is a measure of the construct academic anxiety, which is a generalized representation of anxieties experienced by learners in educational settings. Academic anxiety has been identified as a preclinical indicator of anxiety that provides important predictive utility to clinical…

Descriptors: Validity, Anxiety, Academic Achievement, Behavior Rating Scales

Diagnostic Classification Models: Recent Developments, Practical Issues, and Prospects

Peer reviewed

Ravand, Hamdollah; Baghaei, Purya – International Journal of Testing, 2020

More than three decades after their introduction, diagnostic classification models (DCM) do not seem to have been implemented in educational systems for the purposes they were devised. Most DCM research is either methodological for model development and refinement or retrofitting to existing nondiagnostic tests and, in the latter case, basically…

Descriptors: Classification, Models, Diagnostic Tests, Test Construction

Item Parameter Drift in Computer Adaptive Testing Due to Lack of Content Knowledge

Peer reviewed

Aksu Dunya, Beyza – International Journal of Testing, 2018

This study was conducted to analyze potential item parameter drift (IPD) impact on person ability estimates and classification accuracy when drift affects an examinee subgroup. Using a series of simulations, three factors were manipulated: (a) percentage of IPD items in the CAT exam, (b) percentage of examinees affected by IPD, and (c) item pool…

Descriptors: Adaptive Testing, Classification, Accuracy, Computer Assisted Testing

Invariance Properties for General Diagnostic Classification Models

Peer reviewed

Bradshaw, Laine P.; Madison, Matthew J. – International Journal of Testing, 2016

In item response theory (IRT), the invariance property states that item parameter estimates are independent of the examinee sample, and examinee ability estimates are independent of the test items. While this property has long been established and understood by the measurement community for IRT models, the same cannot be said for diagnostic…

Descriptors: Classification, Models, Simulation, Psychometrics

Challenges to the Use of Artificial Neural Networks for Diagnostic Classifications with Student Test Data

Peer reviewed

Briggs, Derek C.; Circi, Ruhan – International Journal of Testing, 2017

Artificial Neural Networks (ANNs) have been proposed as a promising approach for the classification of students into different levels of a psychological attribute hierarchy. Unfortunately, because such classifications typically rely upon internally produced item response patterns that have not been externally validated, the instability of ANN…

Descriptors: Artificial Intelligence, Classification, Student Evaluation, Tests

Incremental Validity of Multidimensional Proficiency Scores from Diagnostic Classification Models: An Illustration for Elementary School Mathematics

Peer reviewed

Kunina-Habenicht, Olga; Rupp, André A.; Wilhelm, Oliver – International Journal of Testing, 2017

Diagnostic classification models (DCMs) hold great potential for applications in summative and formative assessment by providing discrete multivariate proficiency scores that yield statistically driven classifications of students. Using data from a newly developed diagnostic arithmetic assessment that was administered to 2032 fourth-grade students…

Descriptors: Grade 4, Foreign Countries, Classification, Mathematics Tests

Fitting the Reduced RUM with Mplus: A Tutorial

Peer reviewed

Chiu, Chia-Yi; Köhn, Hans-Friedrich; Wu, Huey-Min – International Journal of Testing, 2016

The Reduced Reparameterized Unified Model (Reduced RUM) is a diagnostic classification model for educational assessment that has received considerable attention among psychometricians. However, the computational options for researchers and practitioners who wish to use the Reduced RUM in their work, but do not feel comfortable writing their own…

Descriptors: Educational Diagnosis, Classification, Models, Educational Assessment

An Illustration of Diagnostic Classification Modeling in Student Learning Outcomes Assessment

Peer reviewed

Jurich, Daniel P.; Bradshaw, Laine P. – International Journal of Testing, 2014

The assessment of higher-education student learning outcomes is an important component in understanding the strengths and weaknesses of academic and general education programs. This study illustrates the application of diagnostic classification models, a burgeoning set of statistical models, in assessing student learning outcomes. To facilitate…

Descriptors: College Outcomes Assessment, Classification, Statistical Analysis, Models

Determining When Single Scoring for Constructed-Response Items Is as Effective as Double Scoring in Mixed-Format Licensure Tests

Peer reviewed

Kim, Sooyeon; Moses, Tim – International Journal of Testing, 2013

The major purpose of this study is to assess the conditions under which single scoring for constructed-response (CR) items is as effective as double scoring in the licensure testing context. We used both empirical datasets of five mixed-format licensure tests collected in actual operational settings and simulated datasets that allowed for the…

Descriptors: Scoring, Test Format, Licensing Examinations (Professions), Test Items

Recursive Partitioning to Identify Potential Causes of Differential Item Functioning in Cross-National Data

Peer reviewed

Finch, W. Holmes; Hernández Finch, Maria E.; French, Brian F. – International Journal of Testing, 2016

Differential item functioning (DIF) assessment is key in score validation. When DIF is present scores may not accurately reflect the construct of interest for some groups of examinees, leading to incorrect conclusions from the scores. Given rising immigration, and the increased reliance of educational policymakers on cross-national assessments…

Descriptors: Test Bias, Scores, Native Language, Language Usage

Enhancing the Interpretability of the Overall Results of an International Test of English-Language Proficiency

Peer reviewed

Papageorgiou, Spiros; Morgan, Rick; Becker, Valerie – International Journal of Testing, 2015

The purpose of this study was to enhance the meaning of the scores of an English-language test by developing performance levels and descriptors for reporting overall test performance. The levels and descriptors were intended to accompany the total scale scores of TOEFL Junior® Standard, an international test of English as a second/foreign…

Descriptors: Language Proficiency, Language Tests, English (Second Language), Second Language Learning

The Role of Item Models in Automatic Item Generation

Peer reviewed

Gierl, Mark J.; Lai, Hollis – International Journal of Testing, 2012

Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…

Descriptors: Foreign Countries, Psychometrics, Test Construction, Test Items

Importance of Equating High-Stakes Educational Measurements

Peer reviewed

Chulu, Bob Wajizigha; Sireci, Stephen G. – International Journal of Testing, 2011

Many examination agencies, policy makers, media houses, and the public at large make high-stakes decisions based on test scores. Unfortunately, in some cases educational tests are not statistically equated to account for test differences over time, which leads to inappropriate interpretations of students' performance. In this study we illustrate…

Descriptors: Classification, Foreign Countries, Item Response Theory, High Stakes Tests

Correcting Fallacies in Validity, Reliability, and Classification

Peer reviewed

Sijtsma, Klaas – International Journal of Testing, 2009

This article reviews three topics from test theory that continue to raise discussion and controversy and capture test theorists' and constructors' interest. The first topic concerns the discussion of the methodology of investigating and establishing construct validity; the second topic concerns reliability and its misuse, alternative definitions…

Descriptors: Construct Validity, Reliability, Classification, Test Theory

Ranking Groups' Abilities: Is It Always Reliable?

Peer reviewed

Schechtman, Edna; Yitzhaki, Shlomo – International Journal of Testing, 2009

The huge technological improvement in data processing and the globalization have increased the demand for and the supply of indices that quantify the consequences of a policy. However, there are certain cases in which quantification may be misleading in the sense that it gives the impression of an accurate measurement while in reality it is not.…

Descriptors: Ability, Measurement, Classification, Students

Author

Bradshaw, Laine P.	2
Aksu Dunya, Beyza	1
Baghaei, Purya	1
Becker, Valerie	1
Briggs, Derek C.	1
C. Addison Helsper	1
Charnas, Jocelyn W.	1
Chen, Yi-Hsin	1
Chiu, Chia-Yi	1
Chulu, Bob Wajizigha	1
Circi, Ruhan	1
DeFife, Jared A.	1
Elosua, Paula	1
Eudell-Simmons, Erin M.	1
Finch, W. Holmes	1
French, Brian F.	1
Gierl, Mark J.	1
Glasgow, Ken	1
Gorin, Joanna S.	1
Hernández Finch, Maria E.	1
Hilsenroth, Mark J.	1
Jerrell C. Cassady	1
Jumel, Bernard	1
Jurich, Daniel P.	1
Kim, Sooyeon	1