Weicong Lyu; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Data harmonization is an emerging approach to strategically combining data from multiple independent studies, enabling researchers to address new research questions that are not answerable by any single contributing study. A fundamental psychometric challenge for data harmonization is to create commensurate measures for the constructs of interest across…
Descriptors: Data Analysis, Test Items, Psychometrics, Item Response Theory
Emily A. Brown – ProQuest LLC, 2024
Previous research has been limited regarding the measurement of computational thinking, particularly as a learning progression in K-12. This study proposes to apply a multidimensional item response theory (IRT) model to a newly developed measure of computational thinking utilizing both selected response and open-ended polytomous items to establish…
Descriptors: Models, Computation, Thinking Skills, Item Response Theory
Nicewander, W. Alan – Educational and Psychological Measurement, 2019
This inquiry is focused on three indicators of the precision of measurement, conditional on fixed values of θ, the latent variable of item response theory (IRT). The indicators that are compared are (1) the traditional conditional standard errors, s(e_X|θ) = CSEM; (2) the IRT-based conditional standard errors, s_irt(e_X|θ) = C…
Descriptors: Measurement, Accuracy, Scores, Error of Measurement
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
Ip, Edward Hak-Sing; Chen, Shyh-Huei – Applied Psychological Measurement, 2012
The problem of fitting unidimensional item-response models to potentially multidimensional data has been extensively studied. The focus of this article is on response data that contains a major dimension of interest but that may also contain minor nuisance dimensions. Because fitting a unidimensional model to multidimensional data results in…
Descriptors: Measurement, Item Response Theory, Scores, Computation
Holster, Trevor A.; Lake, J. – Language Assessment Quarterly, 2016
Stewart questioned Beglar's use of Rasch analysis of the Vocabulary Size Test (VST) and advocated the use of 3-parameter logistic item response theory (3PLIRT) on the basis that it models a non-zero lower asymptote for items, often called a "guessing" parameter. In support of this theory, Stewart presented fit statistics derived from…
Descriptors: Guessing (Tests), Item Response Theory, Vocabulary, Language Tests
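The non-zero lower asymptote at issue in this exchange is the c parameter of the standard 3PL item response function, P(θ) = c + (1 − c)/(1 + exp(−a(θ − b))). A minimal sketch with invented parameter values:

```python
import math

def p_3pl(theta, a, b, c):
    """3PL item response function; the lower asymptote c is the
    'guessing' parameter. With c = 0 and a = 1 this reduces to a
    Rasch-type curve."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Even a very low-ability examinee answers correctly with probability ~ c
print(p_3pl(-6.0, a=1.0, b=0.0, c=0.25))   # close to 0.25
print(p_3pl(6.0, a=1.0, b=0.0, c=0.25))    # close to 1.0
```

Setting c = 0.25 reflects a four-option multiple-choice item; the Rasch model the debate contrasts it with forces this asymptote to zero.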
Sachse, Karoline A.; Roppelt, Alexander; Haag, Nicole – Journal of Educational Measurement, 2016
Trend estimation in international comparative large-scale assessments relies on measurement invariance between countries. However, cross-national differential item functioning (DIF) has been repeatedly documented. We ran a simulation study using national item parameters, which required trends to be computed separately for each country, to compare…
Descriptors: Comparative Analysis, Measurement, Test Bias, Simulation
McDonald, Roderick P. – Psychometrika, 2011
A distinction is proposed between measures and predictors of latent variables. The discussion addresses the consequences of the distinction for the true-score model, the linear factor model, Structural Equation Models, longitudinal and multilevel models, and item-response models. A distribution-free treatment of calibration and…
Descriptors: Measurement, Structural Equation Models, Item Response Theory, Error of Measurement
Geiser, Christian; Lockhart, Ginger – Psychological Methods, 2012
Latent state-trait (LST) analysis is frequently applied in psychological research to determine the degree to which observed scores reflect stable person-specific effects, effects of situations and/or person-situation interactions, and random measurement error. Most LST applications use multiple repeatedly measured observed variables as indicators…
Descriptors: Psychological Studies, Simulation, Measurement, Error of Measurement
Cole, David A.; Cai, Li; Martin, Nina C.; Findling, Robert L.; Youngstrom, Eric A.; Garber, Judy; Curry, John F.; Hyde, Janet S.; Essex, Marilyn J.; Compas, Bruce E.; Goodyer, Ian M.; Rohde, Paul; Stark, Kevin D.; Slattery, Marcia J.; Forehand, Rex – Psychological Assessment, 2011
Our goals in this article were to use item response theory (IRT) to assess the relation of depressive symptoms to the underlying dimension of depression and to demonstrate how IRT-based measurement strategies can yield more reliable data about depression severity than conventional symptom counts. Participants were 3,403 children and adolescents…
Descriptors: Schizophrenia, Measurement, Error of Measurement, Severity (of Disability)
Cai, Li; Monroe, Scott – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2014
We propose a new limited-information goodness of fit test statistic C₂ for ordinal IRT models. The construction of the new statistic lies formally between the M₂ statistic of Maydeu-Olivares and Joe (2006), which utilizes first and second order marginal probabilities, and the M₂* statistic of Cai and Hansen…
Descriptors: Item Response Theory, Models, Goodness of Fit, Probability
Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012
Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…
Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis
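The form-difficulty adjustment this abstract describes can be illustrated, under strong simplifying assumptions, by linear (mean-sigma) equating: map a form X score onto the form Y scale by matching means and standard deviations. The score samples below are invented; the article's own equating methodology is not reproduced here.

```python
from statistics import mean, stdev

def linear_equate(scores_x, scores_y):
    """Return a function mapping a form X score to the form Y scale by
    matching the mean and standard deviation of the two score samples
    (mean-sigma linear equating)."""
    mx, sx = mean(scores_x), stdev(scores_x)
    my, sy = mean(scores_y), stdev(scores_y)
    return lambda x: my + (sy / sx) * (x - mx)

# Hypothetical samples from two forms of unequal difficulty
form_x = [10, 12, 14, 16, 18]
form_y = [12, 14, 16, 18, 20]
to_y = linear_equate(form_x, form_y)
print(to_y(14.0))   # the mean of form X maps to the mean of form Y
```

Population invariance, the article's focus, asks whether this conversion stays (nearly) the same when estimated within different subpopulations.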
Guo, Hongwen; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2011
Nonparametric or kernel regression estimation of item response curves (IRCs) is often used in item analysis in testing programs. These estimates are biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. Accuracy of this estimation is a concern theoretically and operationally.…
Descriptors: Testing Programs, Measurement, Item Analysis, Error of Measurement
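The nonparametric estimate of an item response curve referred to here can be sketched as a Nadaraya-Watson kernel regression of item correctness on a score regressor. The data below are invented, and the sketch uses the observed score directly, so it exhibits exactly the measurement-error bias the article analyzes rather than correcting it.

```python
import math

def nw_irc(x0, scores, correct, bandwidth=1.0):
    """Nadaraya-Watson estimate of P(item correct | score = x0)
    using a Gaussian kernel over the score regressor."""
    weights = [math.exp(-0.5 * ((s - x0) / bandwidth) ** 2) for s in scores]
    return sum(w * y for w, y in zip(weights, correct)) / sum(weights)

# Hypothetical total scores and 0/1 responses to one item
scores = [1, 2, 3, 4, 5, 6, 7, 8]
correct = [0, 0, 0, 1, 0, 1, 1, 1]
print(nw_irc(2.0, scores, correct))   # low score region: low probability
print(nw_irc(7.0, scores, correct))   # high score region: high probability
```

The bandwidth controls the usual smoothness/bias trade-off; operational item-analysis programs choose it by cross-validation or a plug-in rule.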
Chang, Yuan-chin Ivan; Lu, Hung-Yi – Psychometrika, 2010
Item calibration is an essential issue in modern item response theory based psychological or educational testing. Due to the popularity of computerized adaptive testing, methods to efficiently calibrate new items have become more important than in the era when paper-and-pencil test administration was the norm. There are many calibration…
Descriptors: Test Items, Educational Testing, Adaptive Testing, Measurement
Bramley, Tom – Educational Research, 2010
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
Descriptors: National Curriculum, Educational Research, Testing, Measurement