Weicong Lyu; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Data harmonization is an emerging approach to strategically combining data from multiple independent studies, enabling researchers to address new research questions that are not answerable by any single contributing study. A fundamental psychometric challenge for data harmonization is to create commensurate measures for the constructs of interest across…
Descriptors: Data Analysis, Test Items, Psychometrics, Item Response Theory
Emily A. Brown – ProQuest LLC, 2024
Previous research has been limited regarding the measurement of computational thinking, particularly as a learning progression in K-12. This study proposes to apply a multidimensional item response theory (IRT) model to a newly developed measure of computational thinking utilizing both selected response and open-ended polytomous items to establish…
Descriptors: Models, Computation, Thinking Skills, Item Response Theory
Nicewander, W. Alan – Educational and Psychological Measurement, 2019
This inquiry is focused on three indicators of the precision of measurement, conditional on fixed values of θ, the latent variable of item response theory (IRT). The indicators that are compared are (1) the traditional conditional standard errors, s(e_X|θ) = CSEM; (2) the IRT-based conditional standard errors, s_irt(e_X|θ) = C…
Descriptors: Measurement, Accuracy, Scores, Error of Measurement
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
Ip, Edward Hak-Sing; Chen, Shyh-Huei – Applied Psychological Measurement, 2012
The problem of fitting unidimensional item-response models to potentially multidimensional data has been extensively studied. The focus of this article is on response data that contains a major dimension of interest but that may also contain minor nuisance dimensions. Because fitting a unidimensional model to multidimensional data results in…
Descriptors: Measurement, Item Response Theory, Scores, Computation
Holster, Trevor A.; Lake, J. – Language Assessment Quarterly, 2016
Stewart questioned Beglar's use of Rasch analysis of the Vocabulary Size Test (VST) and advocated the use of 3-parameter logistic item response theory (3PLIRT) on the basis that it models a non-zero lower asymptote for items, often called a "guessing" parameter. In support of this theory, Stewart presented fit statistics derived from…
Descriptors: Guessing (Tests), Item Response Theory, Vocabulary, Language Tests
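The non-zero lower asymptote at issue in this exchange is the c parameter of the standard 3PL item response function, P(θ) = c + (1 − c)/(1 + exp(−a(θ − b))). A minimal sketch with invented parameter values:

```python
import math

def p_3pl(theta, a, b, c):
    """3PL item response function; the lower asymptote c is the
    'guessing' parameter. With c = 0 and a = 1 this reduces to a
    Rasch-type curve."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Even a very low-ability examinee answers correctly with probability ~ c
print(p_3pl(-6.0, a=1.0, b=0.0, c=0.25))   # close to 0.25
print(p_3pl(6.0, a=1.0, b=0.0, c=0.25))    # close to 1.0
```

Setting c = 0.25 reflects a four-option multiple-choice item; the Rasch model the debate contrasts it with forces this asymptote to zero.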
Sachse, Karoline A.; Roppelt, Alexander; Haag, Nicole – Journal of Educational Measurement, 2016
Trend estimation in international comparative large-scale assessments relies on measurement invariance between countries. However, cross-national differential item functioning (DIF) has been repeatedly documented. We ran a simulation study using national item parameters, which required trends to be computed separately for each country, to compare…
Descriptors: Comparative Analysis, Measurement, Test Bias, Simulation
McDonald, Roderick P. – Psychometrika, 2011
A distinction is proposed between measures and predictors of latent variables. The discussion addresses the consequences of the distinction for the true-score model, the linear factor model, Structural Equation Models, longitudinal and multilevel models, and item-response models. A distribution-free treatment of calibration and…
Descriptors: Measurement, Structural Equation Models, Item Response Theory, Error of Measurement
Geiser, Christian; Lockhart, Ginger – Psychological Methods, 2012
Latent state-trait (LST) analysis is frequently applied in psychological research to determine the degree to which observed scores reflect stable person-specific effects, effects of situations and/or person-situation interactions, and random measurement error. Most LST applications use multiple repeatedly measured observed variables as indicators…
Descriptors: Psychological Studies, Simulation, Measurement, Error of Measurement
Cole, David A.; Cai, Li; Martin, Nina C.; Findling, Robert L.; Youngstrom, Eric A.; Garber, Judy; Curry, John F.; Hyde, Janet S.; Essex, Marilyn J.; Compas, Bruce E.; Goodyer, Ian M.; Rohde, Paul; Stark, Kevin D.; Slattery, Marcia J.; Forehand, Rex – Psychological Assessment, 2011
Our goals in this article were to use item response theory (IRT) to assess the relation of depressive symptoms to the underlying dimension of depression and to demonstrate how IRT-based measurement strategies can yield more reliable data about depression severity than conventional symptom counts. Participants were 3,403 children and adolescents…
Descriptors: Schizophrenia, Measurement, Error of Measurement, Severity (of Disability)
Cai, Li; Monroe, Scott – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2014
We propose a new limited-information goodness of fit test statistic C₂ for ordinal IRT models. The construction of the new statistic lies formally between the M₂ statistic of Maydeu-Olivares and Joe (2006), which utilizes first and second order marginal probabilities, and the M₂* statistic of Cai and Hansen…
Descriptors: Item Response Theory, Models, Goodness of Fit, Probability
Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012
Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…
Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis
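The form-difficulty adjustment this abstract describes can be illustrated, under strong simplifying assumptions, by linear (mean-sigma) equating: map a form X score onto the form Y scale by matching means and standard deviations. The score samples below are invented; the article's own equating methodology is not reproduced here.

```python
from statistics import mean, stdev

def linear_equate(scores_x, scores_y):
    """Return a function mapping a form X score to the form Y scale by
    matching the mean and standard deviation of the two score samples
    (mean-sigma linear equating)."""
    mx, sx = mean(scores_x), stdev(scores_x)
    my, sy = mean(scores_y), stdev(scores_y)
    return lambda x: my + (sy / sx) * (x - mx)

# Hypothetical samples from two forms of unequal difficulty
form_x = [10, 12, 14, 16, 18]
form_y = [12, 14, 16, 18, 20]
to_y = linear_equate(form_x, form_y)
print(to_y(14.0))   # the mean of form X maps to the mean of form Y
```

Population invariance, the article's focus, asks whether this conversion stays (nearly) the same when estimated within different subpopulations.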
Guo, Hongwen; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2011
Nonparametric or kernel regression estimation of item response curves (IRCs) is often used in item analysis in testing programs. These estimates are biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. Accuracy of this estimation is a concern theoretically and operationally.…
Descriptors: Testing Programs, Measurement, Item Analysis, Error of Measurement
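The nonparametric estimate of an item response curve referred to here can be sketched as a Nadaraya-Watson kernel regression of item correctness on a score regressor. The data below are invented, and the sketch uses the observed score directly, so it exhibits exactly the measurement-error bias the article analyzes rather than correcting it.

```python
import math

def nw_irc(x0, scores, correct, bandwidth=1.0):
    """Nadaraya-Watson estimate of P(item correct | score = x0)
    using a Gaussian kernel over the score regressor."""
    weights = [math.exp(-0.5 * ((s - x0) / bandwidth) ** 2) for s in scores]
    return sum(w * y for w, y in zip(weights, correct)) / sum(weights)

# Hypothetical total scores and 0/1 responses to one item
scores = [1, 2, 3, 4, 5, 6, 7, 8]
correct = [0, 0, 0, 1, 0, 1, 1, 1]
print(nw_irc(2.0, scores, correct))   # low score region: low probability
print(nw_irc(7.0, scores, correct))   # high score region: high probability
```

The bandwidth controls the usual smoothness/bias trade-off; operational item-analysis programs choose it by cross-validation or a plug-in rule.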
Chang, Yuan-chin Ivan; Lu, Hung-Yi – Psychometrika, 2010
Item calibration is an essential issue in modern item response theory based psychological or educational testing. Due to the popularity of computerized adaptive testing, methods to efficiently calibrate new items have become more important than in the era when paper-and-pencil test administration was the norm. There are many calibration…
Descriptors: Test Items, Educational Testing, Adaptive Testing, Measurement
Bramley, Tom – Educational Research, 2010
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
Descriptors: National Curriculum, Educational Research, Testing, Measurement