Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 7 |
Since 2006 (last 20 years) | 15 |
Descriptor
Foreign Countries | 15 |
Hierarchical Linear Modeling | 15 |
Item Response Theory | 15 |
Achievement Tests | 5 |
Computation | 5 |
Measurement | 5 |
Secondary School Students | 5 |
Test Items | 5 |
Correlation | 4 |
Longitudinal Studies | 4 |
Test Bias | 4 |
Author
Adams, Raymond J. | 1 |
Artur Pokropek | 1 |
Bennink, Margot | 1 |
Berezner, Alla | 1 |
Carmen Köhler | 1 |
Croon, Marcel A. | 1 |
Cui, Ying | 1 |
Faber, Janke M. | 1 |
Fox, Jean-Paul | 1 |
Frey, Andreas | 1 |
Ganesan, Asha | 1 |
Publication Type
Journal Articles | 14 |
Reports - Research | 13 |
Dissertations/Theses -… | 1 |
Reports - Descriptive | 1 |
Tests/Questionnaires | 1 |
Location
Germany | 3 |
Netherlands | 2 |
Canada | 1 |
Greece | 1 |
Iran | 1 |
Italy | 1 |
Malaysia | 1 |
Qatar | 1 |
South Korea | 1 |
Taiwan | 1 |
Assessments and Surveys
Program for International… | 3 |
Trends in International… | 2 |
Childrens Manifest Anxiety… | 1 |
Raven Progressive Matrices | 1 |
Students Evaluation of… | 1 |
Carmen Köhler; Lale Khorramdel; Artur Pokropek; Johannes Hartig – Journal of Educational Measurement, 2024
For assessment scales applied to different groups (e.g., students from different states; patients in different countries), multigroup differential item functioning (MG-DIF) needs to be evaluated in order to ensure that respondents with the same trait level but from different groups have equal response probabilities on a particular item. The…
Descriptors: Measures (Individuals), Test Bias, Models, Item Response Theory
Naumann, Alexander; Hartig, Johannes; Hochweber, Jan – Journal of Educational and Behavioral Statistics, 2017
Valid inferences on teaching drawn from students' test scores require that tests are sensitive to the instruction students received in class. Accordingly, measures of the test items' instructional sensitivity provide empirical support for validity claims about inferences on instruction. In the present study, we first introduce the concepts of…
Descriptors: Test Items, Item Response Theory, Instructional Effectiveness, Psychometrics
Kiat, John Emmanuel; Ong, Ai Rene; Ganesan, Asha – Educational Psychology, 2018
Multiple-choice questions (MCQs) play a key role in standardised testing and in-class assessment. Research into the influence of within-item response order on MCQ characteristics has been mixed. While some researchers have shown preferential selection of response options presented earlier in the answer list, others have failed to replicate these…
Descriptors: Undergraduate Students, Multiple Choice Tests, Attention Control, Item Response Theory
Sulis, Isabella; Toland, Michael D. – Journal of Early Adolescence, 2017
Item response theory (IRT) models are the main psychometric approach for the development, evaluation, and refinement of multi-item instruments and scaling of latent traits, whereas multilevel models are the primary statistical method when considering the dependence between person responses when primary units (e.g., students) are nested within…
Descriptors: Hierarchical Linear Modeling, Item Response Theory, Psychometrics, Evaluation Methods
Ravand, Hamdollah – Practical Assessment, Research & Evaluation, 2015
Multilevel models (MLMs) are flexible in that they can be employed to obtain item and person parameters, test for differential item functioning (DIF) and capture both local item and person dependence. Papers on the MLM analysis of item response data have focused mostly on theoretical issues where applications have been add-ons to simulation…
Descriptors: Item Response Theory, Hierarchical Linear Modeling, Educational Testing, Reading Comprehension
Faber, Janke M.; Glas, Cees A. W.; Visscher, Adrie J. – School Effectiveness and School Improvement, 2018
In this study, the relationship between differentiated instruction, as an element of data-based decision making, and student achievement was examined. Classroom observations (n = 144) were used to measure teachers' differentiated instruction practices and to predict the mathematical achievement of 2nd- and 5th-grade students (n = 953). The…
Descriptors: Individualized Instruction, Data, Decision Making, Academic Achievement
Hecht, Martin; Weirich, Sebastian; Siegle, Thilo; Frey, Andreas – Educational and Psychological Measurement, 2015
The selection of an appropriate booklet design is an important element of large-scale assessments of student achievement. Two design properties that are typically optimized are the "balance" with respect to the positions the items are presented and with respect to the mutual occurrence of pairs of items in the same booklet. The purpose…
Descriptors: Measurement, Computation, Test Format, Test Items
Sideridis, Georgios D. – Educational and Psychological Measurement, 2016
The purpose of the present studies was to test the hypothesis that the psychometric characteristics of ability scales may be significantly distorted if one accounts for emotional factors during test taking. Specifically, the present studies evaluate the effects of anxiety and motivation on the item difficulties of the Rasch model. In Study 1, the…
Descriptors: Learning Disabilities, Test Validity, Measures (Individuals), Hierarchical Linear Modeling
Schmidt, Susanne; Zlatkin-Troitschanskaia, Olga; Fox, Jean-Paul – Journal of Educational Measurement, 2016
Longitudinal research in higher education faces several challenges. Appropriate methods of analyzing competence growth of students are needed to deal with those challenges and thereby obtain valid results. In this article, a pretest-posttest-posttest multivariate multilevel IRT model for repeated measures is introduced which is designed to address…
Descriptors: Foreign Countries, Pretests Posttests, Hierarchical Linear Modeling, Item Response Theory
Bennink, Margot; Croon, Marcel A.; Keuning, Jos; Vermunt, Jeroen K. – Journal of Educational and Behavioral Statistics, 2014
In educational measurement, responses of students on items are used not only to measure the ability of students, but also to evaluate and compare the performance of schools. Analysis should ideally account for the multilevel structure of the data, and school-level processes not related to ability, such as working climate and administration…
Descriptors: Academic Ability, Educational Assessment, Educational Testing, Test Bias
Cui, Ying; Mousavi, Amin – International Journal of Testing, 2015
The current study applied the person-fit statistic, l_z, to data from a Canadian provincial achievement test to explore the usefulness of conducting person-fit analysis on large-scale assessments. Item parameter estimates were compared before and after the misfitting student responses, as identified by l_z, were removed. The…
Descriptors: Measurement, Achievement Tests, Comparative Analysis, Test Items
Adams, Raymond J.; Lietz, Petra; Berezner, Alla – Large-scale Assessments in Education, 2013
Background: While rotated test booklets have been employed in large-scale assessments to increase the content coverage of the assessments, rotation has not yet been applied to the context questionnaires administered to respondents. Methods: This paper describes the development of a methodology that uses rotated context questionnaires in…
Descriptors: Questionnaires, Item Response Theory, Foreign Countries, Achievement Tests
Jeon, Minjeong – ProQuest LLC, 2012
Maximum likelihood (ML) estimation of generalized linear mixed models (GLMMs) is technically challenging because of the intractable likelihoods that involve high dimensional integrations over random effects. The problem is magnified when the random effects have a crossed design and thus the data cannot be reduced to small independent clusters. A…
Descriptors: Hierarchical Linear Modeling, Computation, Measurement, Maximum Likelihood Statistics
Huang, Hung-Yu; Wang, Wen-Chung – Educational and Psychological Measurement, 2014
In the social sciences, latent traits often have a hierarchical structure, and data can be sampled from multiple levels. Both hierarchical latent traits and multilevel data can occur simultaneously. In this study, we developed a general class of item response theory models to accommodate both hierarchical latent traits and multilevel data. The…
Descriptors: Item Response Theory, Hierarchical Linear Modeling, Computation, Test Reliability
Yen, Wendy M.; Lall, Venessa F.; Monfils, Lora – ETS Research Report Series, 2012
Alternatives to vertical scales are compared for measuring longitudinal academic growth and for producing school-level growth measures. The alternatives examined were empirical cross-grade regression, ordinary least squares and logistic regression, and multilevel models. The student data used for the comparisons were Arabic Grades 4 to 10 in…
Descriptors: Foreign Countries, Scaling, Item Response Theory, Test Interpretation