Showing all 15 results
Chengcheng Li – ProQuest LLC, 2022
Categorical data have become increasingly ubiquitous in the modern big data era. In this dissertation, we propose novel statistical learning and inference methods for large-scale categorical data, focusing on latent variable models and their applications to psychometrics. In psychometric assessments, the subjects' underlying aptitude often cannot be…
Descriptors: Statistical Inference, Data Analysis, Psychometrics, Raw Scores
Peer reviewed
Robitzsch, Alexander; Lüdtke, Oliver – Large-scale Assessments in Education, 2023
One major aim of international large-scale assessments (ILSA) like PISA is to monitor changes in student performance over time. To accomplish this task, a set of common items (i.e., link items) is repeatedly administered in each assessment. Linking methods based on item response theory (IRT) models are used to align the results from the different…
Descriptors: Educational Trends, Trend Analysis, International Assessment, Achievement Tests
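A minimal Python sketch of the common-item linking idea described above, assuming Rasch difficulties estimated separately in two assessment cycles; the mean-mean method and all values are illustrative, not the authors' procedure:

import numpy as np

# Rasch difficulties of the common (link) items, estimated separately
# in two assessment cycles (values are made up for illustration).
b_cycle1 = np.array([-1.2, -0.4, 0.3, 0.9])
b_cycle2 = np.array([-1.0, -0.1, 0.5, 1.3])

# Mean-mean linking: under the Rasch model the two scales can differ
# only by an additive constant, estimated from the link items.
shift = b_cycle1.mean() - b_cycle2.mean()

# Place cycle-2 ability estimates on the cycle-1 scale.
theta_cycle2 = np.array([0.0, 0.7, -0.5])
theta_linked = theta_cycle2 + shift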
Peer reviewed
Willse, John T. – Measurement and Evaluation in Counseling and Development, 2017
This article provides a brief introduction to the Rasch model. Motivation for using Rasch analyses is provided. Important Rasch model concepts and key aspects of result interpretation are introduced, with major points reinforced using a simulation demonstration. Concrete guidelines are provided regarding sample size and the evaluation of items.
Descriptors: Item Response Theory, Test Results, Test Interpretation, Simulation
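The Rasch item response function itself is compact enough to state directly; a minimal sketch with illustrative values:

import numpy as np

def rasch_prob(theta, b):
    # P(correct | theta, b) = exp(theta - b) / (1 + exp(theta - b))
    return 1.0 / (1.0 + np.exp(-(theta - b)))

# A person whose ability sits 1 logit above an item's difficulty
# answers correctly with probability ~0.73.
print(rasch_prob(theta=1.0, b=0.0))  # 0.7310...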
Peer reviewed
France, Stephen L.; Batchelder, William H. – Educational and Psychological Measurement, 2015
Cultural consensus theory (CCT) is a data aggregation technique with many applications in the social and behavioral sciences. We describe the intuition and theory behind a set of CCT models for continuous type data using maximum likelihood inference methodology. We describe how bias parameters can be incorporated into these models. We introduce…
Descriptors: Maximum Likelihood Statistics, Test Items, Difficulty Level, Test Theory
Peer reviewed
Ihme, Jan Marten; Senkbeil, Martin; Goldhammer, Frank; Gerick, Julia – European Educational Research Journal, 2017
The combination of different item formats is found quite often in large-scale assessments, and dimensionality analyses often indicate that tests are multidimensional with respect to task format. In ICILS 2013, three different item types (information-based response tasks, simulation tasks, and authoring tasks) were used to measure computer and…
Descriptors: Foreign Countries, Computer Literacy, Information Literacy, International Assessment
Cai, Li – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2013
Lord and Wingersky's (1984) recursive algorithm for creating summed score based likelihoods and posteriors has a proven track record in unidimensional item response theory (IRT) applications. Extending the recursive algorithm to handle multidimensionality is relatively simple, especially with fixed quadrature because the recursions can be defined…
Descriptors: Mathematics, Scores, Item Response Theory, Computation
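The dichotomous, unidimensional version of the recursion is short; a sketch (the function name is ours, and the fixed-theta probabilities are illustrative):

import numpy as np

def summed_score_likelihoods(p_correct):
    # Lord-Wingersky (1984): process items one at a time, updating the
    # likelihood of every possible summed score at a fixed theta.
    lik = np.array([1.0])  # with zero items, score 0 has probability 1
    for p in p_correct:
        new = np.zeros(len(lik) + 1)
        new[:-1] += lik * (1.0 - p)  # current item answered incorrectly
        new[1:] += lik * p           # current item answered correctly
        lik = new
    return lik

# Three items at one quadrature point; the multidimensional extension
# discussed above repeats this at every (multivariate) quadrature point.
print(summed_score_likelihoods([0.8, 0.6, 0.4]))  # entries sum to 1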
Peer reviewed
Straat, J. Hendrik; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2014
An automated item selection procedure in Mokken scale analysis partitions a set of items into one or more Mokken scales, if the data allow. Two algorithms are available that pursue the same goal of selecting Mokken scales of maximum length: Mokken's original automated item selection procedure (AISP) and a genetic algorithm (GA). Minimum…
Descriptors: Sampling, Test Items, Effect Size, Scaling
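For orientation, the total scalability coefficient H that both algorithms try to keep high can be computed directly for dichotomous items; a sketch (function name ours):

import numpy as np
from itertools import combinations

def coefficient_h(X):
    # X: persons-by-items 0/1 matrix. H is the ratio of summed
    # inter-item covariances to their maximum given the item means.
    p = X.mean(axis=0)
    num = den = 0.0
    for i, j in combinations(range(X.shape[1]), 2):
        num += np.cov(X[:, i], X[:, j], bias=True)[0, 1]
        den += min(p[i], p[j]) - p[i] * p[j]
    return num / den

AISP grows a scale by admitting items only while the item-level coefficients stay above a lower bound c (0.3 is the customary default); the genetic algorithm searches over partitions more globally.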
Peer reviewed
Hartig, Johannes; Frey, Andreas; Nold, Gunter; Klieme, Eckhard – Educational and Psychological Measurement, 2012
The article compares three different methods to estimate effects of task characteristics and to use these estimates for model-based proficiency scaling: prediction of item difficulties from the Rasch model, the linear logistic test model (LLTM), and an LLTM including random item effects (LLTM+e). The methods are applied to empirical data from a…
Descriptors: Item Response Theory, Models, Methods, Computation
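The LLTM restriction can be illustrated with a least-squares shortcut: regress Rasch difficulty estimates on a Q-matrix of task characteristics. This two-step approximation is not the full LLTM (which estimates the basic parameters inside the IRT likelihood), and all values are invented:

import numpy as np

# Q-matrix: rows are items, columns are task characteristics.
Q = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [1, 1]], dtype=float)

# Rasch difficulty estimates for the same four items.
b = np.array([-0.5, 0.6, 0.2, 0.8])

# LLTM restriction: b_i = sum_k q_ik * eta_k.
eta, *_ = np.linalg.lstsq(Q, b, rcond=None)
b_model = Q @ eta  # model-implied difficulties for proficiency scaling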
Kroopnick, Marc Howard – ProQuest LLC, 2010
When Item Response Theory (IRT) is operationally applied for large scale assessments, unidimensionality is typically assumed. This assumption requires that the test measures a single latent trait. Furthermore, when tests are vertically scaled using IRT, the assumption of unidimensionality would require that the battery of tests across grades…
Descriptors: Simulation, Scaling, Standard Setting, Item Response Theory
Peer reviewed
Kane, Michael T.; Mroch, Andrew A.; Suh, Youngsuk; Ripkey, Douglas R. – Measurement: Interdisciplinary Research and Perspectives, 2009
This paper analyzes five linear equating models for the "nonequivalent groups with anchor test" (NEAT) design with internal anchors (i.e., the anchor test is part of the full test). The analysis employs a two-dimensional framework. The first dimension contrasts two general approaches to developing the equating relationship. Under a "parameter…
Descriptors: Scaling, Equated Scores, Methods, Test Items
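As background, every linear equating function maps scores by matching means and standard deviations; the NEAT variants differ in which group's moments enter. A sketch of chained linear equating, one common NEAT method, with invented moments:

import numpy as np

def linear_equate(mu_from, sd_from, mu_to, sd_to):
    # Map scores by matching means and standard deviations.
    return lambda x: mu_to + (sd_to / sd_from) * (np.asarray(x) - mu_from)

# Chain through the internal anchor V: X -> V using the new-form
# group's moments, then V -> Y using the reference group's moments.
x_to_v = linear_equate(mu_from=30.0, sd_from=6.0, mu_to=12.0, sd_to=3.0)
v_to_y = linear_equate(mu_from=11.0, sd_from=2.8, mu_to=32.0, sd_to=5.5)
print(v_to_y(x_to_v(33.0)))  # a score of 33 on X, expressed on Y's scale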
Peer reviewed
Davey, Tim; And Others – Applied Psychological Measurement, 1996
Scales defined by most item response theory (IRT) models are truly invariant with respect to certain linear transformations of parameters. The problem is to find the proper transformation to place calibrations on a common scale. This paper explores issues of extending and adapting unidimensional linking procedures to multidimensional IRT models.…
Descriptors: Equated Scores, Item Response Theory, Models, Scaling
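The unidimensional version of that invariance is easy to verify numerically; a sketch for the 2PL model with illustrative values (in the multidimensional case the paper addresses, the constant A becomes a matrix):

import numpy as np

def p2pl(theta, a, b):
    # Two-parameter logistic item response function.
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Rescale the latent scale to A*theta + B and adjust the item
# parameters accordingly: response probabilities are unchanged.
A, B = 1.5, -0.4
theta, a, b = 0.8, 1.2, 0.3
assert np.isclose(p2pl(theta, a, b), p2pl(A * theta + B, a / A, A * b + B))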
Peer reviewed
Meijer, Rob R. – Applied Psychological Measurement, 1995
A statistic used by R. Meijer (1994) to determine person fit referred to the number of errors from the deterministic Guttman model (L. Guttman, 1950), but it was in fact based on the error count as defined by J. Loevinger (1947, 1948).
Descriptors: Difficulty Level, Models, Responses, Scaling
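Loevinger's pairwise error count, the definition the note says was actually used, is straightforward; a sketch (function name ours):

def guttman_errors(responses):
    # responses: 0/1 vector with items ordered from easiest to hardest.
    # Each pair (easier item wrong, harder item right) counts as one
    # error under Loevinger's definition.
    errors = 0
    for i in range(len(responses)):
        for j in range(i + 1, len(responses)):
            if responses[i] == 0 and responses[j] == 1:
                errors += 1
    return errors

print(guttman_errors([1, 1, 1, 0, 0]))  # 0: perfect Guttman pattern
print(guttman_errors([0, 1, 1, 0, 1]))  # 4: four reversed pairs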
Peer reviewed
Van Onna, Marieke J. H. – Applied Psychological Measurement, 2004
Coefficient "H" is used as an index of scalability in nonparametric item response theory (NIRT). It indicates the degree to which a set of items rank orders examinees. Theoretical sampling distributions, however, have only been derived asymptotically and only under restrictive conditions. Bootstrap methods offer an alternative possibility to…
Descriptors: Sampling, Item Response Theory, Scaling, Comparative Analysis
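A person-level bootstrap of H takes only a few lines given a function that computes H (see the sketch under Straat et al. above); the names and interval below are illustrative:

import numpy as np

rng = np.random.default_rng(1)

def bootstrap_h(X, coefficient_h, n_boot=1000):
    # Resample persons (rows) with replacement and recompute H to
    # approximate its sampling distribution without asymptotics.
    n = X.shape[0]
    return np.array([coefficient_h(X[rng.integers(0, n, n)])
                     for _ in range(n_boot)])

# 95% percentile interval:
# np.percentile(bootstrap_h(X, coefficient_h), [2.5, 97.5])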
Peer reviewed
Snyder, Scott; Sheehan, Robert – Journal of Early Intervention, 1992
This examination of the Rasch scaling model concludes that the model could potentially facilitate objective comparisons of status and change of young children with disabilities at individual and group levels. The paper discusses applications of the model to early childhood assessment in the areas of item banking, test analysis, and subject…
Descriptors: Disabilities, Evaluation Methods, Item Response Theory, Measurement Techniques
Abdel-fattah, Abdel-fattah A. – 1992
A scaling procedure based on item response theory (IRT) is proposed that can also fit non-hierarchical test structures. The binary scores of a test of English were used to calculate the probabilities of answering each item correctly. The probability matrix was factor analyzed, and the difficulty intervals or estimates corresponding to the factors…
Descriptors: Bayesian Statistics, Difficulty Level, English, Estimation (Mathematics)