Publication Date
In 2025: 1
Since 2024: 1
Since 2021 (last 5 years): 2
Since 2016 (last 10 years): 7
Since 2006 (last 20 years): 16
Descriptor
Difficulty Level: 17
Test Items: 17
Item Response Theory: 8
Foreign Countries: 7
Test Bias: 5
Models: 4
Cognitive Processes: 3
Psychometrics: 3
Test Format: 3
Adults: 2
Attention: 2
Source
International Journal of…: 17
Author
Allalouf, Avi: 1
Arce, Alvaro J.: 1
Aryadoust, Vahid: 1
Baghaei, Purya: 1
Bryant, Damon U.: 1
Buckendahl, Chad W.: 1
Bulut, Okan: 1
Cole, Ki Lynn: 1
Davis-Becker, Susan L.: 1
DeMars, Christine E.: 1
Emons, Wilco H. M.: 1
Publication Type
Journal Articles: 17
Reports - Research: 11
Reports - Evaluative: 6
Education Level
Secondary Education: 4
Elementary Education: 1
Elementary Secondary Education: 1
Grade 3: 1
Grade 4: 1
Grade 5: 1
Grade 6: 1
Grade 7: 1
High Schools: 1
Higher Education: 1
Intermediate Grades: 1
Location
China: 1
Iran: 1
Malaysia: 1
Netherlands: 1
Philippines: 1
Singapore: 1
Slovakia: 1
Turkey: 1
United Kingdom (England): 1
Assessments and Surveys
Big Five Inventory: 1
International English…: 1
National Assessment of…: 1
Program for International…: 1
Test of English for…: 1
Patrik Havan; Michal Kohút; Peter Halama – International Journal of Testing, 2025
Acquiescence is the tendency of participants to shift their responses toward agreement regardless of item content. Lechner et al. (2019) introduced the following mechanisms of acquiescence: social deference and cognitive processing. We added their interaction to a theoretical framework. The sample consists of 557 participants. We found a significant, moderately strong relationship…
Descriptors: Cognitive Processes, Attention, Difficulty Level, Reflection
Huggins-Manley, Anne Corinne; Qiu, Yuxi; Penfield, Randall D. – International Journal of Testing, 2018
Score equity assessment (SEA) refers to an examination of population invariance of equating across two or more subpopulations of test examinees. Previous SEA studies have shown that score equity may be present for examinees scoring at particular test score ranges but absent for examinees scoring at other score ranges. No studies to date have…
Descriptors: Equated Scores, Test Bias, Test Items, Difficulty Level
Roelofs, Erik C.; Emons, Wilco H. M.; Verschoor, Angela J. – International Journal of Testing, 2021
This study reports on an Evidence Centered Design (ECD) project in the Netherlands, involving the theory exam for prospective car drivers. In particular, we illustrate how cognitive load theory, task-analysis, response process models, and explanatory item-response theory can be used to systematically develop and refine task models. Based on a…
Descriptors: Foreign Countries, Psychometrics, Test Items, Evidence Based Practice
FIPC Linking across Multidimensional Test Forms: Effects of Confounding Difficulty within Dimensions
Kim, Sohee; Cole, Ki Lynn; Mwavita, Mwarumba – International Journal of Testing, 2018
This study investigated the effects of linking potentially multidimensional test forms using the fixed item parameter calibration. Forms had equal or unequal total test difficulty with and without confounding difficulty. The mean square errors and bias of estimated item and ability parameters were compared across the various confounding tests. The…
Descriptors: Test Items, Item Response Theory, Test Format, Difficulty Level
Holmes, Stephen D.; Meadows, Michelle; Stockford, Ian; He, Qingping – International Journal of Testing, 2018
The relationship of expected and actual difficulty of items on six mathematics question papers designed for 16-year-olds in England was investigated through paired comparison using experts and through testing with students. A variant of the Rasch model was applied to the comparison data to establish a scale of expected difficulty. In testing, the papers…
Descriptors: Foreign Countries, Secondary School Students, Mathematics Tests, Test Items
Wang, Ting; Li, Min; Thummaphan, Phonraphee; Ruiz-Primo, Maria Araceli – International Journal of Testing, 2017
Contextualized items have been widely used in science testing. Despite the common use of item contexts, the influence of a chosen context on the reliability and validity of score inferences remains unclear. We focused on sequential cues of contextual information, referring to the order of events or descriptions presented in item contexts. We…
Descriptors: Science Tests, Cues, Difficulty Level, Test Items
Solano-Flores, Guillermo; Wang, Chao; Shade, Chelsey – International Journal of Testing, 2016
We examined multimodality (the representation of information in multiple semiotic modes) in the context of international test comparisons. Using Programme for International Student Assessment (PISA) 2009 data, we examined the correlation of the difficulty of science items and the complexity of their illustrations. We observed statistically…
Descriptors: Semiotics, Difficulty Level, Test Items, Science Tests
Baghaei, Purya; Aryadoust, Vahid – International Journal of Testing, 2015
Research shows that test method can exert a significant impact on test takers' performance and thereby contaminate test scores. We argue that common test method can exert the same effect as common stimuli and violate the conditional independence assumption of item response theory models because, in general, subsets of items which have a shared…
Descriptors: Test Format, Item Response Theory, Models, Test Items
Kan, Adnan; Bulut, Okan – International Journal of Testing, 2014
This study investigated whether the linguistic complexity of items leads to gender differential item functioning (DIF) on mathematics assessments. Two forms of a mathematics test were developed. The first form consisted of algebra items based on mathematical expressions, terms, and equations. In the second form, the same items were written as word…
Descriptors: Gender Differences, Test Bias, Difficulty Level, Test Items
Davis-Becker, Susan L.; Buckendahl, Chad W.; Gerrow, Jack – International Journal of Testing, 2011
Throughout the world, cut scores are an important aspect of a high-stakes testing program because they are a key operational component of the interpretation of test scores. One method for setting standards that is prevalent in educational testing programs--the Bookmark method--is intended to be a less cognitively complex alternative to methods…
Descriptors: Standard Setting (Scoring), Cutting Scores, Educational Testing, Licensing Examinations (Professions)
Arce, Alvaro J.; Wang, Ze – International Journal of Testing, 2012
The traditional approach to scale modified-Angoff cut scores transfers the raw cuts to an existing raw-to-scale score conversion table. Under the traditional approach, cut scores and conversion table raw scores are not only seen as interchangeable but also as originating from a common scaling process. In this article, we propose an alternative…
Descriptors: Generalizability Theory, Item Response Theory, Cutting Scores, Scaling
DeMars, Christine E.; Wise, Steven L. – International Journal of Testing, 2010
This investigation examined whether different rates of rapid guessing between groups could lead to detectable levels of differential item functioning (DIF) in situations where the item parameters were the same for both groups. Two simulation studies were designed to explore this possibility. The groups in Study 1 were simulated to reflect…
Descriptors: Guessing (Tests), Test Bias, Motivation, Gender Differences
Svetina, Dubravka; Gorin, Joanna S.; Tatsuoka, Kikumi K. – International Journal of Testing, 2011
As a construct definition, the current study develops a cognitive model describing the knowledge, skills, and abilities measured by critical reading test items on a high-stakes assessment used for selection decisions in the United States. Additionally, in order to establish generalizability of the construct meaning to other similarly structured…
Descriptors: Reading Tests, Reading Comprehension, Critical Reading, Test Items
Lamprianou, Iasonas – International Journal of Testing, 2008
This study investigates the effect of reporting the unadjusted raw scores in a high-stakes language exam when raters differ significantly in severity and self-selected questions differ significantly in difficulty. More sophisticated models, introducing meaningful facets and parameters, are successively used to investigate the characteristics of…
Descriptors: High Stakes Tests, Raw Scores, Item Response Theory, Language Tests
Allalouf, Avi; Rapp, Joel; Stoller, Reuven – International Journal of Testing, 2009
When a test is adapted from a source language (SL) into a target language (TL), the two forms are usually not psychometrically equivalent. If linking between test forms is necessary, those items that have had their psychometric characteristics altered by the translation (differential item functioning [DIF] items) should be eliminated from the…
Descriptors: Test Items, Test Format, Verbal Tests, Psychometrics