Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 2
Since 2016 (last 10 years): 7
Since 2006 (last 20 years): 32
Descriptor
Classification: 33
Item Response Theory: 33
Statistical Analysis: 33
Models: 15
Simulation: 10
Test Items: 10
Computation: 9
Sample Size: 8
Accuracy: 7
Comparative Analysis: 6
Evaluation Methods: 6
Author
Rupp, Andre A.: 3
von Davier, Matthias: 2
Alahmadi, Sarah: 1
Albano, Anthony D.: 1
Ames, Allison: 1
Andrich, David: 1
Babcock, Ben: 1
Barnes, Tiffany, Ed.: 1
Barry, Carol L.: 1
Bashkov, Bozhidar M.: 1
Cai, Yuyang: 1
Publication Type
Journal Articles: 27
Reports - Research: 22
Reports - Evaluative: 6
Collected Works - Proceedings: 2
Dissertations/Theses -…: 2
Non-Print Media: 1
Reference Materials - General: 1
Speeches/Meeting Papers: 1
Education Level
Higher Education: 8
Postsecondary Education: 7
Secondary Education: 3
Junior High Schools: 2
Middle Schools: 2
Elementary Education: 1
Grade 6: 1
High Schools: 1
Intermediate Grades: 1
Assessments and Surveys
ACT Assessment: 2
Program for International…: 1
SAT (College Admission Test): 1
Weese, James D.; Turner, Ronna C.; Liang, Xinya; Ames, Allison; Crawford, Brandon – Educational and Psychological Measurement, 2023
A study was conducted to implement the use of a standardized effect size and corresponding classification guidelines for polytomous data with the POLYSIBTEST procedure and compare those guidelines with prior recommendations. Two simulation studies were included. The first identifies new unstandardized test heuristics for classifying moderate and…
Descriptors: Effect Size, Classification, Guidelines, Statistical Analysis
Alahmadi, Sarah; Jones, Andrew T.; Barry, Carol L.; Ibáñez, Beatriz – Applied Measurement in Education, 2023
Rasch common-item equating is often used in high-stakes testing to maintain equivalent passing standards across test administrations. If unaddressed, item parameter drift poses a major threat to the accuracy of Rasch common-item equating. We compared the performance of well-established and newly developed drift detection methods in small and large…
Descriptors: Equated Scores, Item Response Theory, Sample Size, Test Items
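The drift-detection comparison above concerns flags for item parameter drift among common items. As a hedged illustration of one well-established flag (not necessarily the article's exact implementation), the sketch below applies a robust z check to the difficulty shifts of anchor items between two administrations; the 2.7 cutoff and the item values are assumptions made for the example.
```python
# A sketch of a robust z drift check for Rasch common-item equating.
# Items whose |z| exceeds the cutoff would be flagged and dropped from the anchor set.
import numpy as np

def robust_z(b_old, b_new, cutoff=2.7):
    d = np.asarray(b_new) - np.asarray(b_old)      # difficulty shift per common item
    iqr = np.subtract(*np.percentile(d, [75, 25])) # interquartile range of the shifts
    z = (d - np.median(d)) / (0.74 * iqr)          # robust standardization
    return np.abs(z) > cutoff                      # True = flagged for drift

b_old = [-1.2, -0.4, 0.0, 0.5, 1.1, 1.8]
b_new = [-1.1, -0.5, 0.9, 0.4, 1.2, 1.7]           # third item has drifted upward
print(robust_z(b_old, b_new))                      # -> [False False  True False False False]
```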
Bashkov, Bozhidar M.; Clauser, Jerome C. – Practical Assessment, Research & Evaluation, 2019
Successful testing programs rely on high-quality test items to produce reliable scores and defensible exams. However, determining what statistical screening criteria are most appropriate to support these goals can be daunting. This study describes and demonstrates cost-benefit analysis as an empirical approach to determining appropriate screening…
Descriptors: Test Items, Test Reliability, Evaluation Criteria, Accuracy
von Davier, Matthias – ETS Research Report Series, 2016
This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response…
Descriptors: Psychometrics, Mathematics, Models, Statistical Analysis
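The report above concerns parallelizing the E and M steps of an EM algorithm for latent variable models. The sketch below is a minimal, assumed illustration of the general idea rather than the report's code: the E step of a two-parameter IRT model is split over worker processes by blocks of respondents, and each block returns the sufficient statistics the M step would need.
```python
# Minimal sketch of a parallel E step for a 2PL IRT model (illustrative only).
import numpy as np
from concurrent.futures import ProcessPoolExecutor

QUAD = np.linspace(-4, 4, 21)                                # quadrature nodes for theta
WEIGHTS = np.exp(-0.5 * QUAD**2)
WEIGHTS /= WEIGHTS.sum()                                     # standard-normal prior weights

def e_step_chunk(args):
    """Posterior expected counts for one block of respondents."""
    responses, a, b = args                                   # responses: (n, items) 0/1 matrix
    p = 1.0 / (1.0 + np.exp(-a * (QUAD[:, None] - b)))       # (quad, items) response probabilities
    like = np.exp(responses @ np.log(p).T + (1 - responses) @ np.log(1 - p).T)
    post = like * WEIGHTS                                    # (n, quad) posterior over nodes
    post /= post.sum(axis=1, keepdims=True)
    expected_n = post.sum(axis=0)                            # expected respondents per node
    expected_r = post.T @ responses                          # expected correct per node/item
    return expected_n, expected_r

def parallel_e_step(responses, a, b, n_workers=4):
    chunks = np.array_split(responses, n_workers)            # split respondents across workers
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(e_step_chunk, [(c, a, b) for c in chunks]))
    return sum(p[0] for p in parts), sum(p[1] for p in parts)  # sufficient statistics for the M step

if __name__ == "__main__":                                   # guard required for process pools
    rng = np.random.default_rng(0)
    resp = rng.integers(0, 2, size=(500, 20)).astype(float)
    print(parallel_e_step(resp, a=np.ones(20), b=np.zeros(20))[0][:5])
```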
Sari, Halil Ibrahim; Huggins, Anne Corinne – Educational and Psychological Measurement, 2015
This study compares two methods of defining groups for the detection of differential item functioning (DIF): (a) pairwise comparisons and (b) composite group comparisons. We aim to emphasize and empirically support the notion that the choice of pairwise versus composite group definitions in DIF is a reflection of how one defines fairness in DIF…
Descriptors: Test Bias, Comparative Analysis, Statistical Analysis, College Entrance Examinations
Suh, Youngsuk – Journal of Educational Measurement, 2016
This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…
Descriptors: Effect Size, Goodness of Fit, Statistical Analysis, Statistical Significance
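As a toy illustration of the two effect size measures named above, the snippet below computes a signed and an unsigned weighted P-difference for a single dichotomous item over a theta grid. The grid, the focal-group weighting, and the logistic item curves are assumptions for the example and may differ from the article's exact formulation.
```python
# Signed and unsigned weighted P-difference for one item (illustrative).
import numpy as np

def weighted_p_difference(p_ref, p_foc, focal_density):
    w = focal_density / focal_density.sum()         # weight by assumed focal-group density
    signed = np.sum(w * (p_ref - p_foc))            # offsetting DIF can cancel
    unsigned = np.sum(w * np.abs(p_ref - p_foc))    # magnitude regardless of direction
    return signed, unsigned

theta = np.linspace(-4, 4, 81)
p_ref = 1 / (1 + np.exp(-1.2 * (theta - 0.0)))      # reference-group item curve
p_foc = 1 / (1 + np.exp(-1.2 * (theta - 0.3)))      # focal-group curve with shifted difficulty
density = np.exp(-0.5 * theta**2)
print(weighted_p_difference(p_ref, p_foc, density))
```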
Eckes, Thomas – Language Testing, 2017
This paper presents an approach to standard setting that combines the prototype group method (PGM; Eckes, 2012) with a receiver operating characteristic (ROC) analysis. The combined PGM-ROC approach is applied to setting cut scores on a placement test of English as a foreign language (EFL). To implement the PGM, experts first named learners whom…
Descriptors: English (Second Language), Language Tests, Cutting Scores, Standard Setting (Scoring)
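The ROC portion of the combined PGM-ROC approach can be illustrated with a small, assumed example: expert nominations of prototypical "masters" serve as the reference classification, and the cut score is taken where Youden's J is maximized. The simulated data, the use of scikit-learn's roc_curve, and the J criterion are illustrative choices, not a claim about Eckes's exact procedure.
```python
# ROC-based cut score selection via Youden's J (illustrative sketch).
import numpy as np
from sklearn.metrics import roc_curve

def roc_cut_score(is_master, scores):
    fpr, tpr, thresholds = roc_curve(is_master, scores)
    j = tpr - fpr                               # Youden's J at each candidate threshold
    return thresholds[np.argmax(j)]             # score that best separates the two groups

rng = np.random.default_rng(1)
scores = np.concatenate([rng.normal(45, 8, 60), rng.normal(60, 8, 40)])
is_master = np.concatenate([np.zeros(60), np.ones(40)])   # expert prototype judgments
print(roc_cut_score(is_master, scores))
```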
Jones, W. Paul – Educational and Psychological Measurement, 2014
A study in a university clinic/laboratory investigated adaptive Bayesian scaling as a supplement to interpretation of scores on the Mini-IPIP. A "probability of belonging" in categories of low, medium, or high on each of the Big Five traits was calculated after each item response and continued until all items had been used or until a…
Descriptors: Personality Traits, Personality Measures, Bayesian Statistics, Clinics
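A hedged sketch of the sequential idea described above: after each item response the posterior probability of belonging to a low, medium, or high category is updated, and administration stops early once one category is sufficiently probable. The likelihood table, uniform prior, and 0.90 stopping threshold are invented for illustration and are not the Mini-IPIP calibration used in the study.
```python
# Sequential Bayesian "probability of belonging" classification (illustrative).
import numpy as np

CATEGORIES = ["low", "medium", "high"]
LIKELIHOOD = np.array([          # P(endorse item | category) for 4 made-up items
    [0.20, 0.50, 0.80],
    [0.30, 0.60, 0.90],
    [0.10, 0.40, 0.70],
    [0.25, 0.55, 0.85],
])

def classify(responses, threshold=0.90):
    posterior = np.ones(3) / 3                       # uniform prior over the three categories
    for item, endorsed in enumerate(responses):
        p = LIKELIHOOD[item] if endorsed else 1 - LIKELIHOOD[item]
        posterior = posterior * p                    # Bayes update after this response
        posterior /= posterior.sum()
        if posterior.max() >= threshold:             # stop early once confident enough
            break
    return CATEGORIES[int(np.argmax(posterior))], posterior

print(classify([1, 1, 0, 1]))
```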
Andrich, David – Educational and Psychological Measurement, 2013
Assessments in response formats with ordered categories are ubiquitous in the social and health sciences. Although the assumption that the ordering of the categories is working as intended is central to any interpretation that arises from such assessments, testing that this assumption is valid is not standard in psychometrics. This is surprising…
Descriptors: Item Response Theory, Classification, Statistical Analysis, Models
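One empirical symptom often discussed alongside this issue in the Rasch literature is disordered category thresholds. The short check below is an assumed illustration of that idea, not Andrich's proposed test, and the threshold values are made up.
```python
# Check whether estimated category thresholds are ordered as intended (illustrative).
def disordered_thresholds(taus):
    """Return indices where a threshold is not greater than its predecessor."""
    return [i for i in range(1, len(taus)) if taus[i] <= taus[i - 1]]

item_thresholds = {
    "item_1": [-1.5, -0.2, 1.1],    # ordered as intended
    "item_2": [-0.8, 0.9, 0.4],     # third threshold falls below the second
}
for item, taus in item_thresholds.items():
    print(item, disordered_thresholds(taus) or "ordered")
```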
Chiu, Chia-Yi – Applied Psychological Measurement, 2013
Most methods for fitting cognitive diagnosis models to educational test data and assigning examinees to proficiency classes require the Q-matrix that associates each item in a test with the cognitive skills (attributes) needed to answer it correctly. In most cases, the Q-matrix is not known but is constructed from the (fallible) judgments of…
Descriptors: Cognitive Tests, Diagnostic Tests, Models, Statistical Analysis
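The Q-matrix mentioned above is simply an items-by-attributes binary matrix. The small example below shows one invented Q-matrix and a DINA-style ideal response computed from an examinee's attribute profile; it illustrates how the matrix is used, not the article's Q-matrix estimation method.
```python
# A toy Q-matrix and DINA-style ideal responses (illustrative data).
import numpy as np

Q = np.array([        # 4 items x 3 attributes; 1 = item requires the attribute
    [1, 0, 0],        # item 1 requires attribute A only
    [1, 1, 0],        # item 2 requires A and B
    [0, 0, 1],        # item 3 requires C only
    [1, 0, 1],        # item 4 requires A and C
])

def ideal_responses(alpha, Q):
    """alpha: one examinee's binary attribute-mastery profile."""
    return (alpha >= Q).all(axis=1).astype(int)   # correct only if all required attributes mastered

print(ideal_responses(np.array([1, 0, 1]), Q))    # -> [1 0 1 1]
```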
Svetina, Dubravka – Educational and Psychological Measurement, 2013
The purpose of this study was to investigate the effect of complex structure on dimensionality assessment in noncompensatory multidimensional item response models using dimensionality assessment procedures based on DETECT (dimensionality evaluation to enumerate contributing traits) and NOHARM (normal ogive harmonic analysis robust method). Five…
Descriptors: Item Response Theory, Statistical Analysis, Computation, Test Length
Wang, Zijian Gerald – ProQuest LLC, 2012
A latent class signal detection (SDT) model was recently introduced as an alternative to traditional item response theory (IRT) methods in the analysis of constructed response data. This class of models can be represented as restricted latent class models and differs from the IRT approach in the way the latent construct is conceptualized. One…
Descriptors: Item Response Theory, Statistical Analysis, Models, Test Items
Babcock, Ben; Albano, Anthony D. – Applied Psychological Measurement, 2012
Testing programs often rely on common-item equating to maintain a single measurement scale across multiple test administrations and multiple years. Changes over time, in the item parameters and the latent trait underlying the scale, can lead to inaccurate score comparisons and misclassifications of examinees. This study examined how instability in…
Descriptors: Test Items, Measurement, Item Response Theory, Predictor Variables
Kaplan, David; Depaoli, Sarah – Structural Equation Modeling: A Multidisciplinary Journal, 2011
This article examines the problem of specification error in 2 models for categorical latent variables: the latent class model and the latent Markov model. Specification error in the latent class model focuses on the impact of incorrectly specifying the number of latent classes of the categorical latent variable on measures of model adequacy as…
Descriptors: Markov Processes, Longitudinal Studies, Probability, Item Response Theory
Nix, John-Michael L.; Tseng, Wen-Ta – International Journal of Listening, 2014
The present research aims to identify the underlying English listening belief structure of English-as-a-foreign-language (EFL) learners, thereby informing methodologies for subsequent analysis of beliefs with respect to listening achievement. Development of a measurement model of English listening learning beliefs entailed the creation of an…
Descriptors: Item Response Theory, English (Second Language), Second Language Learning, Listening Skills