Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 4 |
Since 2006 (last 20 years) | 8 |
Descriptor
Comparative Analysis | 9 |
Correlation | 9 |
Foreign Countries | 5 |
Item Response Theory | 4 |
Scores | 4 |
Achievement Tests | 2 |
English (Second Language) | 2 |
Error of Measurement | 2 |
Evaluation Methods | 2 |
Goodness of Fit | 2 |
Language | 2 |
More ▼ |
Source
International Journal of… | 9 |
Author
Publication Type
Journal Articles | 9 |
Reports - Research | 8 |
Reports - Evaluative | 1 |
Education Level
Secondary Education | 2 |
High Schools | 1 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
Program for International… | 2 |
Progress in International… | 1 |
SAT (College Admission Test) | 1 |
What Works Clearinghouse Rating
Xiaowen Liu – International Journal of Testing, 2024
Differential item functioning (DIF) often arises from multiple sources. Within the context of multidimensional item response theory, this study examined DIF items with varying secondary dimensions using the three DIF methods: SIBTEST, Mantel-Haenszel, and logistic regression. The effect of the number of secondary dimensions on DIF detection rates…
Descriptors: Item Analysis, Test Items, Item Response Theory, Correlation
Martin-Raugh, Michelle P.; Anguiano-Carrsaco, Cristina; Jackson, Teresa; Brenneman, Meghan W.; Carney, Lauren; Barnwell, Patrick; Kochert, Jonathan – International Journal of Testing, 2018
Single-response situational judgment tests (SRSJTs) differ from multiple-response SJTs (MRSJTS) in that they present test takers with edited critical incidents and simply ask test takers to read over the action described and evaluate it according to its effectiveness. Research comparing the reliability and validity of SRSJTs and MRSJTs is thus far…
Descriptors: Test Format, Test Reliability, Test Validity, Predictive Validity
Guo, Xiuyan; Lei, Pui-Wa – International Journal of Testing, 2020
Little research has been done on the effects of peer raters' quality characteristics on peer rating qualities. This study aims to address this gap and investigate the effects of key variables related to peer raters' qualities, including content knowledge, previous rating experience, training on rating tasks, and rating motivation. In an experiment…
Descriptors: Peer Evaluation, Error Patterns, Correlation, Knowledge Level
Ercikan, Kadriye; Chen, Michelle Y.; Lyons-Thomas, Juliette; Goodrich, Shawna; Sandilands, Debra; Roth, Wolff-Michael; Simon, Marielle – International Journal of Testing, 2015
The purpose of this research is to examine the comparability of mathematics and science scores for students from English language backgrounds (ELB) and non-English language backgrounds (NELB). We examine the relationship between English reading proficiency and performance on mathematics and science assessments in Australia, Canada, the United…
Descriptors: Scores, Mathematics Tests, Science Tests, Native Speakers
Sinharay, Sandip; Haberman, Shelby J. – International Journal of Testing, 2014
Recently there has been an increasing level of interest in subtest scores, or subscores, for their potential diagnostic value. Haberman (2008) suggested a method to determine if a subscore has added value over the total score. Researchers have often been interested in the performance of subgroups--for example, those based on gender or…
Descriptors: Scores, Achievement Tests, Language Tests, English (Second Language)
Oliveri, Maria Elena; von Davier, Matthias – International Journal of Testing, 2014
In this article, we investigate the creation of comparable score scales across countries in international assessments. We examine potential improvements to current score scale calibration procedures used in international large-scale assessments. Our approach seeks to improve fairness in scoring international large-scale assessments, which often…
Descriptors: Test Bias, Scores, International Programs, Educational Assessment
Asil, Mustafa; Brown, Gavin T. L. – International Journal of Testing, 2016
The use of the Programme for International Student Assessment (PISA) across nations, cultures, and languages has been criticized. The key criticisms point to the linguistic and cultural biases potentially underlying the design of reading comprehension tests, raising doubts about the legitimacy of comparisons across economies. Our research focused…
Descriptors: Comparative Analysis, Reading Achievement, Achievement Tests, Secondary School Students
Makransky, Guido; Glas, Cees A. W. – International Journal of Testing, 2013
Cognitive ability tests are widely used in organizations around the world because they have high predictive validity in selection contexts. Although these tests typically measure several subdomains, testing is usually carried out for a single subdomain at a time. This can be ineffective when the subdomains assessed are highly correlated. This…
Descriptors: Foreign Countries, Cognitive Ability, Adaptive Testing, Feedback (Response)
Cascallar, Alicia S.; Dorans, Neil J. – International Journal of Testing, 2005
This study compares two methods commonly used (concordance and prediction) to establish linkages between scores from tests of similar content given in different languages. Score linkages between the Verbal and Math sections of the SAT I and the corresponding sections of the Spanish-language admissions test, the Prueba de Aptitud Academica (PAA),…
Descriptors: Prediction, Correlation, Scores, Multilingual Materials