Publication Date
In 2025 (0)
Since 2024 (3)
Since 2021 (last 5 years) (4)
Since 2016 (last 10 years) (8)
Since 2006 (last 20 years) (9)
Descriptor
Item Analysis (10)
Psychometrics (10)
Simulation (10)
Test Items (7)
Item Response Theory (5)
Achievement Tests (3)
Error of Measurement (3)
Goodness of Fit (3)
Computer Assisted Testing (2)
Correlation (2)
Factor Analysis (2)
Source
Educational and Psychological Measurement (3)
ETS Research Report Series (1)
Grantee Submission (1)
Journal of Educational Measurement (1)
Psychological Record (1)
Structural Equation Modeling: A Multidisciplinary Journal (1)
Studies in Second Language Acquisition (1)
Author
Albano, Anthony D. (1)
Cai, Liuhan (1)
Cooperman, Allison W. (1)
Cui, Chengyu (1)
D'Urso, E. Damiano (1)
Edwards, Michael C. (1)
Eslinger, Paul J. (1)
Frankel, Lois (1)
Hui, Bronson (1)
Wang, Chun (1)
Xu, Gongjun (1)
Publication Type
Reports - Research (10)
Journal Articles (8)
Education Level
Elementary Education (2)
Elementary Secondary Education (1)
Grade 4 (1)
Intermediate Grades (1)
Location
Florida (1)
Assessments and Surveys
Big Five Inventory (1)
Comprehensive Tests of Basic… (1)
Florida Comprehensive… (1)
Trends in International… (1)
Bronson Hui; Zhiyi Wu – Studies in Second Language Acquisition, 2024
A slowdown or a speedup in response times across experimental conditions can be taken as evidence of online deployment of knowledge. However, response-time difference measures are rarely evaluated for their reliability, and there is no standard practice for estimating it. In this article, we used three open data sets to explore an approach to…
Descriptors: Reliability, Reaction Time, Psychometrics, Criticism
E. Damiano D'Urso; Jesper Tijmstra; Jeroen K. Vermunt; Kim De Roover – Structural Equation Modeling: A Multidisciplinary Journal, 2024
Measurement invariance (MI) is required for validly comparing latent constructs measured by multiple ordinal self-report items. Non-invariances may occur when disregarding (group differences in) an acquiescence response style (ARS; an agreeing tendency regardless of item content). If non-invariance results solely from neglecting ARS, one should…
Descriptors: Error of Measurement, Structural Equation Models, Construct Validity, Measurement Techniques
Cooperman, Allison W.; Weiss, David J.; Wang, Chun – Educational and Psychological Measurement, 2022
Adaptive measurement of change (AMC) is a psychometric method for measuring intra-individual change on one or more latent traits across testing occasions. Three hypothesis tests--a Z test, likelihood ratio test, and score ratio index--have demonstrated desirable statistical properties in this context, including low false positive rates and high…
Descriptors: Error of Measurement, Psychometrics, Hypothesis Testing, Simulation
Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020
With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…
Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics
Chengyu Cui; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Multidimensional item response theory (MIRT) models have generated increasing interest in the psychometrics literature. Efficient approaches for estimating MIRT models with dichotomous responses have been developed, but constructing an equally efficient and robust algorithm for polytomous models has received limited attention. To address this gap,…
Descriptors: Item Response Theory, Accuracy, Simulation, Psychometrics
Albano, Anthony D.; Cai, Liuhan; Lease, Erin M.; McConnell, Scott R. – Journal of Educational Measurement, 2019
Studies have shown that item difficulty can vary significantly based on the context of an item within a test form. In particular, item position may be associated with practice and fatigue effects that influence item parameter estimation. The purpose of this research was to examine the relevance of item position specifically for assessments used in…
Descriptors: Test Items, Computer Assisted Testing, Item Analysis, Difficulty Level
Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2017
This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…
Descriptors: Psychometrics, Test Items, Item Response Theory, Hypothesis Testing
Stanley, Leanne M.; Edwards, Michael C. – Educational and Psychological Measurement, 2016
The purpose of this article is to highlight the distinction between the reliability of test scores and the fit of psychometric measurement models, reminding readers why it is important to consider both when evaluating whether test scores are valid for a proposed interpretation and/or use. It is often the case that an investigator judges both the…
Descriptors: Test Reliability, Goodness of Fit, Scores, Patients
Satish, Usha; Streufert, Siegfried; Eslinger, Paul J. – Psychological Record, 2006
Neuropsychological tests have limited sensitivity in identifying subtle residual cognitive impairments in patients with good medical recovery from head injury and post-concussive syndrome. Detecting and characterizing residual "real life" cognitive difficulties can be problematic for treatment purposes. This study investigated the usefulness of a…
Descriptors: Patients, Control Groups, Head Injuries, Neuropsychology
Yen, Wendy M. – 1979
Three test-analysis models were used to analyze three types of simulated test score data plus the results of eight achievement tests. Chi-square goodness-of-fit statistics were used to evaluate the appropriateness of the models to the four kinds of data. Data were generated to simulate the responses of 1,000 students to 36 pseudo-items by…
Descriptors: Achievement Tests, Correlation, Goodness of Fit, Item Analysis