Araneda, Sergio; Lee, Dukjae; Lewis, Jennifer; Sireci, Stephen G.; Moon, Jung Aa; Lehman, Blair; Arslan, Burcu; Keehner, Madeleine – Education Sciences, 2022
Students exhibit many behaviors when responding to items on a computer-based test, but only some of these behaviors are relevant to estimating their proficiencies. In this study, we analyzed data from computer-based math achievement tests administered to elementary school students in grades 3 (ages 8-9) and 4 (ages 9-10). We investigated students'…
Descriptors: Student Behavior, Academic Achievement, Computer Assisted Testing, Mathematics Achievement
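
The kind of screen this line of work motivates can be illustrated with a short sketch. The following is a hedged example, not the authors' method: it flags rapid-guessing responses with a fixed response-time threshold; the simulated times and the 5-second cutoff are assumptions.

import numpy as np

rng = np.random.default_rng(0)
response_times = rng.lognormal(mean=3.0, sigma=0.8, size=200)  # simulated seconds per item

THRESHOLD = 5.0  # hypothetical cutoff between rapid guesses and effortful responses

rapid = response_times < THRESHOLD
print(f"{rapid.mean():.1%} of responses flagged as rapid guesses")
# Flagged responses are typically excluded or down-weighted before
# estimating proficiency, since they carry little information about it.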

Robin, Frederic; Sireci, Stephen G.; Hambleton, Ronald K. – International Journal of Testing, 2003
Illustrates how multidimensional scaling (MDS) and differential item functioning (DIF) procedures can be used to evaluate the equivalence of different language versions of an examination. Presents examples of structural differences and DIF across languages. (SLD)
Descriptors: Item Bias, Licensing Examinations (Professions), Multidimensional Scaling, Multilingual Materials
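
A minimal sketch of the MDS side of such an analysis, assuming simulated item responses for two language groups (scikit-learn's MDS on inter-item correlation distances; nothing here reproduces the article's data or exact procedure):

import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(1)

def mds_configuration(item_scores):
    """Scale items into 2-D from inter-item correlation distances."""
    corr = np.corrcoef(item_scores.T)
    dissim = 1.0 - corr                      # correlation -> dissimilarity
    np.fill_diagonal(dissim, 0.0)
    mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
    return mds.fit_transform(dissim)

group_a = rng.normal(size=(500, 10))  # simulated: 500 examinees x 10 items
group_b = rng.normal(size=(400, 10))

config_a = mds_configuration(group_a)
config_b = mds_configuration(group_b)
# Similar item configurations across groups are evidence of structural
# equivalence; items that shift location are candidates for review.
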
Sireci, Stephen G.; Bastari, B. – 1998
In many cross-cultural research studies, assessment instruments are translated or adapted for use in multiple languages. However, it cannot be assumed that different language versions of an assessment are equivalent across languages. A fundamental issue to be addressed is the comparability or equivalence of the construct measured by each language…
Descriptors: Construct Validity, Cross Cultural Studies, Evaluation Methods, Multidimensional Scaling

Sireci, Stephen G.; Harter, James; Yang, Yongwei; Bhola, Dennison – International Journal of Testing, 2003
Evaluated the structural equivalence and differential item functioning of an employee attitude survey from a large international corporation across three languages, eight cultures, and two modes of administration. Results for 40,595 employees show that the structure of the survey data was consistent and that items functioned similarly across all groups. (SLD)
Descriptors: Attitude Measures, Computer Assisted Testing, Cross Cultural Studies, Employees
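
One common structural-equivalence check is sketched below with invented loadings (Tucker's congruence coefficient between per-group factor loadings; the study's own analysis may differ):

import numpy as np

def tucker_phi(loadings_a, loadings_b):
    """Congruence coefficient for two loading vectors of equal length."""
    num = np.sum(loadings_a * loadings_b)
    den = np.sqrt(np.sum(loadings_a**2) * np.sum(loadings_b**2))
    return num / den

# Illustrative loadings for one factor in two language groups.
english = np.array([0.71, 0.65, 0.80, 0.58, 0.69])
spanish = np.array([0.68, 0.70, 0.77, 0.55, 0.72])

phi = tucker_phi(english, spanish)
print(f"Tucker's phi = {phi:.3f}")  # values above ~.95 are usually read as equivalent
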
Sireci, Stephen G.; Harter, James; Yang, Yongwei; Bhola, Dennison – 2000
Assessing people who operate in different languages necessitates the use of multiple language versions of an assessment. However, different language versions of an assessment are not necessarily equivalent. In this paper, the psychometric properties of different language versions on an international employee attitude survey are evaluated. This…
Descriptors: Analysis of Covariance, Attitude Measures, Attitudes, Construct Validity

Sireci, Stephen G. – 1995
Test developers continue to struggle with the technical and logistical problems inherent in assessing achievement across different languages. Many testing programs offer separate language versions of a test to evaluate the achievement of examinees in different language groups. However, comparisons of individuals who took different language…
Descriptors: Bilingualism, Educational Assessment, Equated Scores, Intercultural Communication

Sireci, Stephen G.; Swaminathan, Hariharan – 1996
Procedures for evaluating differential item functioning (DIF) are commonly used to investigate the statistical equivalence of items that are translated from one language to another. However, the methodology developed for detecting DIF is designed to evaluate the functioning of the same items administered to two groups. In evaluating the…
Descriptors: Cross Cultural Studies, Foreign Countries, International Education, Item Bias
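
For readers unfamiliar with the DIF machinery at issue, here is a minimal Mantel-Haenszel screen on simulated two-group data; the data, the matching variable, and the flagging threshold are all illustrative, not the study's analysis.

import numpy as np

rng = np.random.default_rng(2)
n = 1000
group = rng.integers(0, 2, n)            # 0 = reference, 1 = focal
total_score = rng.integers(0, 21, n)     # matching variable (total test score)
p_correct = 1 / (1 + np.exp(-(total_score - 10) / 3))
item = (rng.random(n) < p_correct).astype(int)  # item response, no true DIF

num = den = 0.0
for k in np.unique(total_score):         # stratify on the matching score
    s = total_score == k
    a = np.sum(s & (group == 0) & (item == 1))  # reference correct
    b = np.sum(s & (group == 0) & (item == 0))  # reference incorrect
    c = np.sum(s & (group == 1) & (item == 1))  # focal correct
    d = np.sum(s & (group == 1) & (item == 0))  # focal incorrect
    nk = a + b + c + d
    if nk:
        num += a * d / nk
        den += b * c / nk

alpha_mh = num / den                     # common odds ratio across strata
mh_d_dif = -2.35 * np.log(alpha_mh)      # ETS delta metric; |D| >= 1.5 suggests large DIF
print(f"MH odds ratio = {alpha_mh:.2f}, MH D-DIF = {mh_d_dif:.2f}")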

Sireci, Stephen G.; Berberoglu, Giray – Applied Measurement in Education, 2000
Studied a method for investigating the equivalence of translated-adapted items using bilingual test takers through item response theory. Results from an English-Turkish course evaluation form completed by 688 Turkish students indicate that the methodology is effective in flagging items that function differentially across languages and informing…
Descriptors: Bilingualism, College Students, Evaluation Methods, Higher Education
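
The article uses item response theory; as a simpler, classic alternative for screening translated items, a delta plot can be sketched as follows (invented proportions correct; the flagging threshold is an assumption):

import numpy as np
from scipy.stats import norm

p_english = np.array([0.85, 0.72, 0.64, 0.55, 0.40, 0.78])
p_turkish = np.array([0.80, 0.70, 0.48, 0.52, 0.35, 0.75])

def delta(p):
    """Map proportion correct to the ETS delta scale (mean 13, sd 4)."""
    return 13.0 + 4.0 * norm.ppf(1.0 - p)

dx, dy = delta(p_english), delta(p_turkish)

# Principal-axis (major-axis) line through the point cloud.
sx, sy, r = dx.std(), dy.std(), np.corrcoef(dx, dy)[0, 1]
b = ((sy**2 - sx**2) + np.sqrt((sy**2 - sx**2) ** 2 + 4 * (r * sx * sy) ** 2)) / (2 * r * sx * sy)
a = dy.mean() - b * dx.mean()

dist = np.abs(b * dx - dy + a) / np.sqrt(b**2 + 1)  # perpendicular distances
print("flagged items:", np.where(dist > 1.0)[0])    # cutoff is illustrative only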

Allalouf, Avi; Hambleton, Ronald K.; Sireci, Stephen G. – Journal of Educational Measurement, 1999
Focused on whether differential item functioning (DIF) is related to item type in translated test items, and on the causes of DIF, using data from an Israeli college entrance test in Hebrew and a Russian translation. Results from 24,304 college applicants indicate that 34% of items functioned differently across languages. (SLD)
Descriptors: College Applicants, College Entrance Examinations, Foreign Countries, Hebrew

Sireci, Stephen G.; Gonzalez, Eugenio J. – 2003
International comparative educational studies make use of test instruments that are originally developed in English by international panels of experts but are ultimately administered in the language of instruction of the students. The comparability of the different language versions of these assessments is a critical issue in validating the…
Descriptors: Academic Achievement, Comparative Analysis, Difficulty Level, International Education

Sireci, Stephen G.; Khaliq, Shameem Nyla – 2002
Many students in the United States who are required to take educational tests are not fully proficient in English. To address this problem, a state-mandated testing program created dual language English-Spanish versions of some of their tests. In this study, the psychometric properties of the English and dual language versions of a fourth-grade…
Descriptors: Item Bias, Language Proficiency, Limited English Speaking, Multidimensional Scaling

Chakwera, Elias; Khembo, Dafter; Sireci, Stephen G. – Education Policy Analysis Archives, 2004
In the United States, tests are held to high standards of quality. In developing countries such as Malawi, psychometricians must deal with these same high standards as well as several additional pressures such as widespread cheating, test administration difficulties due to challenging landscapes and poor resources, difficulties in reliably scoring…
Descriptors: Testing Programs, Testing, High Stakes Tests, Measurement

Sireci, Stephen G. – Educational Measurement: Issues and Practice, 1997
Different methodologies for linking tests across languages are reviewed and evaluated, focusing on monolingual item response theory, bilingual group designs, and matched monolingual group designs. These methods, although not without weaknesses, are better at promoting score comparability than methods that rely on translation or expert judgment…
Descriptors: Bilingualism, Comparative Analysis, Cross Cultural Studies, Educational Assessment
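
Among the linking designs reviewed, the simplest to illustrate is mean/sigma linking of IRT item difficulties through a set of anchor items (for example, items calibrated in a bilingual group). The sketch below uses invented estimates and is not the article's specific recommendation.

import numpy as np

# Difficulty estimates for the same anchor items under two calibrations.
b_source = np.array([-1.2, -0.4, 0.1, 0.6, 1.3])
b_target = np.array([-0.9, -0.1, 0.4, 0.9, 1.7])

A = b_target.std() / b_source.std()           # slope of the linear transformation
B = b_target.mean() - A * b_source.mean()     # intercept

linked = A * b_source + B                     # source difficulties on the target scale
print("A =", round(A, 3), "B =", round(B, 3))
print("residuals:", np.round(b_target - linked, 3))
# Large residuals for particular anchor items would suggest those items
# do not function equivalently across the two language versions.
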
Sireci, Stephen G.; Fitzgerald, Cyndy; Xing, Dehui – 1998
Adapting credentialing examinations for international uses involves translating tests for use in multiple languages. This paper explores methods for evaluating construct equivalence and item equivalence across different language versions of a test. These methods were applied to four different language versions (English, French, German, and…
Descriptors: Credentials, Engineers, Factor Analysis, Foreign Countries

Sireci, Stephen G.; Foster, David F.; Robin, Frederic; Olsen, James – 1997
Evaluating the comparability of a test administered in different languages is a difficult, if not impossible, task. Comparisons are problematic because observed differences in test performance between groups who take different language versions of a test could be due to a difference in difficulty between the tests, to cultural differences in test…
Descriptors: Adaptive Testing, Adults, Certification, Comparative Analysis