Publication Date
In 2025 | 2 |
Since 2024 | 8 |
Since 2021 (last 5 years) | 27 |
Since 2016 (last 10 years) | 78 |
Since 2006 (last 20 years) | 205 |
Descriptor
Item Response Theory | 205 |
Scaling | 157 |
Test Items | 79 |
Models | 48 |
Multidimensional Scaling | 48 |
Test Construction | 44 |
Psychometrics | 37 |
Simulation | 37 |
Scores | 35 |
Foreign Countries | 34 |
Item Analysis | 32 |
More ▼ |
Source
Author
Anderson, Daniel | 7 |
Alonzo, Julie | 6 |
Irvin, P. Shawn | 6 |
Park, Bitnara Jasmine | 6 |
Saven, Jessica L. | 6 |
Tindal, Gerald | 6 |
Wind, Stefanie A. | 5 |
DeMars, Christine E. | 4 |
Briggs, Derek C. | 3 |
Cai, Li | 3 |
Carstensen, Claus H. | 3 |
More ▼ |
Publication Type
Education Level
Audience
Location
Germany | 6 |
Florida | 5 |
New York | 4 |
Turkey | 4 |
Arizona | 3 |
Australia | 3 |
Austria | 3 |
California | 3 |
China | 3 |
Colorado | 3 |
North Carolina | 3 |
More ▼ |
Laws, Policies, & Programs
Individuals with Disabilities… | 1 |
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Jörg-Henrik Heine; Moritz Heene – Measurement: Interdisciplinary Research and Perspectives, 2025
This paper critically evaluates the quantification of psychological attributes through metric measurement. Drawing on epistemological considerations by Immanuel Kant, the development of measurement theory in the natural and social sciences is outlined. This includes an examination of Fechner's psychophysical law and the fundamental criticism…
Descriptors: Measurement, Scaling, Psychological Testing, Psychological Characteristics
Stefanie A. Wind; Benjamin Lugu; Yurou Wang – International Journal of Testing, 2025
Mokken Scale Analysis (MSA) is a nonparametric approach that offers exploratory tools for understanding the nature of item responses while emphasizing invariance requirements. MSA is often discussed as it relates to Rasch measurement theory, which also emphasizes invariance, but uses parametric models. Researchers who have compared and combined…
Descriptors: Item Response Theory, Scaling, Surveys, Evaluation Methods
Güler Yavuz Temel – Journal of Educational Measurement, 2024
The purpose of this study was to investigate multidimensional DIF with a simple and nonsimple structure in the context of multidimensional Graded Response Model (MGRM). This study examined and compared the performance of the IRT-LR and Wald test using MML-EM and MHRM estimation approaches with different test factors and test structures in…
Descriptors: Computation, Multidimensional Scaling, Item Response Theory, Models
Hongwen Guo; Matthew S. Johnson; Daniel F. McCaffrey; Lixong Gu – ETS Research Report Series, 2024
The multistage testing (MST) design has been gaining attention and popularity in educational assessments. For testing programs that have small test-taker samples, it is challenging to calibrate new items to replenish the item pool. In the current research, we used the item pools from an operational MST program to illustrate how research studies…
Descriptors: Test Items, Test Construction, Sample Size, Scaling
Strachan, Tyler; Cho, Uk Hyun; Kim, Kyung Yong; Willse, John T.; Chen, Shyh-Huei; Ip, Edward H.; Ackerman, Terry A.; Weeks, Jonathan P. – Journal of Educational Measurement, 2021
In vertical scaling, results of tests from several different grade levels are placed on a common scale. Most vertical scaling methodologies rely heavily on the assumption that the construct being measured is unidimensional. In many testing situations, however, such an assumption could be problematic. For instance, the construct measured at one…
Descriptors: Item Response Theory, Scaling, Tests, Construct Validity
Wind, Stefanie A. – Educational and Psychological Measurement, 2022
Researchers frequently use Mokken scale analysis (MSA), which is a nonparametric approach to item response theory, when they have relatively small samples of examinees. Researchers have provided some guidance regarding the minimum sample size for applications of MSA under various conditions. However, these studies have not focused on item-level…
Descriptors: Nonparametric Statistics, Item Response Theory, Sample Size, Test Items
Scharl, Anna; Zink, Eva – Large-scale Assessments in Education, 2022
Educational large-scale assessments (LSAs) often provide plausible values for the administered competence tests to facilitate the estimation of population effects. This requires the specification of a background model that is appropriate for the specific research question. Because the "German National Educational Panel Study" (NEPS) is…
Descriptors: National Competency Tests, Foreign Countries, Programming Languages, Longitudinal Studies
Annamaria Di Fabio; Andrea Svicher – Journal of Psychoeducational Assessment, 2024
The Eco-Generativity Scale (EGS) is a recently developed 28-item scale derived from a 4-factor higher-order model (ecological generativity, social generativity, environmental identity, and agency/pathways). The aim of this study was to develop a short-scale version of the EGS to facilitate its use with university students (N = 779) who will…
Descriptors: Foreign Countries, College Students, Ecology, Likert Scales
An Analysis of Differential Bundle Functioning in Multidimensional Tests Using the SIBTEST Procedure
Özdogan, Didem; Kelecioglu, Hülya – International Journal of Assessment Tools in Education, 2022
This study aims to analyze the differential bundle functioning in multidimensional tests with a specific purpose to detect this effect through differentiating the location of the item with DIF in the test, the correlation between the dimensions, the sample size, and the ratio of reference to focal group size. The first 10 items of the test that is…
Descriptors: Correlation, Sample Size, Test Items, Item Analysis
Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023
This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…
Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation
Viola Merhof; Caroline M. Böhm; Thorsten Meiser – Educational and Psychological Measurement, 2024
Item response tree (IRTree) models are a flexible framework to control self-reported trait measurements for response styles. To this end, IRTree models decompose the responses to rating items into sub-decisions, which are assumed to be made on the basis of either the trait being measured or a response style, whereby the effects of such person…
Descriptors: Item Response Theory, Test Interpretation, Test Reliability, Test Validity
Ozberk, Eren Halil; Unsal Ozberk, Elif Bengi; Uluc, Sait; Oktem, Ferhunde – International Journal of Assessment Tools in Education, 2021
The Kaufman Brief Intelligence Test--Second Edition (KBIT-2) is designed to measure verbal and nonverbal abilities in a wide range of individuals from 4 years 0 months to 90 years 11 months of age. This study examines both the advantages of using Mokken Scale Analysis (MSA) in intelligence tests and the hierarchical order of the items in the…
Descriptors: Intelligence Tests, Nonparametric Statistics, Test Items, Test Construction
Livingston, Samuel A. – Educational Testing Service, 2020
This booklet is a conceptual introduction to item response theory (IRT), which many large-scale testing programs use for constructing and scoring their tests. Although IRT is essentially mathematical, the approach here is nonmathematical, in order to serve as an introduction on the topic for people who want to understand why IRT is used and what…
Descriptors: Item Response Theory, Scoring, Test Items, Scaling
Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024
In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…
Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)
Lubbe, Dirk; Schuster, Christof – Journal of Educational and Behavioral Statistics, 2020
Extreme response style is the tendency of individuals to prefer the extreme categories of a rating scale irrespective of item content. It has been shown repeatedly that individual response style differences affect the reliability and validity of item responses and should, therefore, be considered carefully. To account for extreme response style…
Descriptors: Response Style (Tests), Rating Scales, Item Response Theory, Models