Publication Date
| In 2026 | 0 |
| Since 2025 | 38 |
| Since 2022 (last 5 years) | 225 |
| Since 2017 (last 10 years) | 570 |
| Since 2007 (last 20 years) | 1377 |
Audience
| Researchers | 110 |
| Practitioners | 107 |
| Teachers | 46 |
| Administrators | 25 |
| Policymakers | 24 |
| Counselors | 12 |
| Parents | 7 |
| Students | 7 |
| Support Staff | 4 |
| Community | 2 |
Location
| California | 61 |
| Canada | 60 |
| United States | 57 |
| Turkey | 47 |
| Australia | 43 |
| Florida | 34 |
| Germany | 26 |
| Texas | 26 |
| China | 25 |
| Netherlands | 25 |
| Iran | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
| Does not meet standards | 1 |
Gierl, Mark J.; Lai, Hollis; Li, Johnson – Educational Research and Evaluation, 2013
The purpose of this study is to evaluate the performance of CATSIB (Computer Adaptive Testing-Simultaneous Item Bias Test) for detecting differential item functioning (DIF) when items in the matching and studied subtest are administered adaptively in the context of a realistic multi-stage adaptive test (MST). MST was simulated using a 4-item…
Descriptors: Adaptive Testing, Test Bias, Computer Assisted Testing, Test Items
Teker, Gulsen Tasdelen; Dogan, Nuri – Educational Sciences: Theory and Practice, 2015
Reliability and differential item functioning (DIF) analyses were conducted on testlets displaying local item dependence in this study. The data set employed in the research was obtained from the answers given by 1,500 students to the 20 items included in six testlets given in English Proficiency Exam by the School of Foreign Languages of a state…
Descriptors: Foreign Countries, Test Items, Test Bias, Item Response Theory
Reddy, Linda A.; Dudek, Christopher M.; Fabiano, Gregory A.; Peters, Stephanie – School Psychology Quarterly, 2015
This article presents information about the construct validity and reliability of a new teacher self-report measure of classroom instructional and behavioral practices (the Classroom Strategies Scales-Teacher Form; CSS-T). The theoretical underpinnings and empirical basis for the instructional and behavioral management scales are presented.…
Descriptors: Measurement Techniques, Construct Validity, Test Validity, Test Reliability
Ravand, Hamdollah – Practical Assessment, Research & Evaluation, 2015
Cognitive diagnostic models (CDM) have been around for more than a decade but their application is far from widespread for mainly two reasons: (1) CDMs are novel, as compared to traditional IRT models. Consequently, many researchers lack familiarity with them and their properties, and (2) Software programs doing CDMs have been expensive and not…
Descriptors: Test Theory, Models, Computer Software, Open Source Technology
Socha, Alan; DeMars, Christine E.; Zilberberg, Anna; Phan, Ha – International Journal of Testing, 2015
The Mantel-Haenszel (MH) procedure is commonly used to detect items that function differentially for groups of examinees from various demographic and linguistic backgrounds--for example, in international assessments. As in some other DIF methods, the total score is used to match examinees on ability. In thin matching, each of the total score…
Descriptors: Test Items, Educational Testing, Evaluation Methods, Ability Grouping
Zou, Shen; Xu, Qian – Language Assessment Quarterly, 2017
Washback and fairness are interrelated in validity research, and thus an investigation into washback inevitably involves fairness. This article reports Phase One of a washback study of "Test for English Majors for Grade Eight" (TEM8). Phase One was a questionnaire survey administered to university program administrators. Two research…
Descriptors: Foreign Countries, Language Tests, English (Second Language), Test Bias
Alhaythami, Hassan; Karpinski, Aryn; Kirschner, Paul; Bolden, Edward – Journal of Interactive Learning Research, 2017
This study examined the psychometric properties of a social-networking site (SNS) activities scale (SNSAS) using Rasch Analysis. Items were also examined with Rasch Principal Components Analysis (PCA) and Differential Item Functioning (DIF) across groups of university students (i.e., males and females from the United States [US] and Europe; N =…
Descriptors: Social Media, Social Networks, Likert Scales, Psychometrics
Lian, Lim Hooi; Yew, Wun Thiam; Meng, Chew Cheng – International Education Studies, 2014
Currently, in order to reform the Malaysian education system, there have been a number of education policy initiatives launched by the Malaysian Ministry of Education (MOE). All these initiatives have encouraged and inculcated teaching and learning for creativity, critical, innovative and higher-order thinking skills rather than conceptual…
Descriptors: Foreign Countries, Educational Policy, Evaluation Methods, Teacher Competencies
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza – Language Testing, 2014
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
Descriptors: Test Bias, Regression (Statistics), Statistical Significance, Language Tests
Langlois, Lyse; Lapointe, Claire; Valois, Pierre; de Leeuw, Astrid – Journal of Educational Administration, 2014
Purpose: This study had five objectives: explain the initial steps that led to the construction of the Ethical Leadership Questionnaire (ELQ); analyze the items and verify the ELQ reliability using item response theory (IRT); examine its factorial structure with a confirmatory factor analysis (CFA) and an exploratory structural equation modeling…
Descriptors: Test Construction, Test Reliability, Test Validity, Questionnaires
Coster, Wendy J.; Kramer, Jessica M.; Tian, Feng; Dooley, Meghan; Liljenquist, Kendra; Kao, Ying-Chia; Ni, Pengsheng – Autism: The International Journal of Research and Practice, 2016
The Pediatric Evaluation of Disability Inventory-Computer Adaptive Test is an alternative method for describing the adaptive function of children and youth with disabilities using a computer-administered assessment. This study evaluated the performance of the Pediatric Evaluation of Disability Inventory-Computer Adaptive Test with a national…
Descriptors: Autism, Pervasive Developmental Disorders, Computer Assisted Testing, Adaptive Testing
Cheng, Maurice M. W.; Oon, Pey-Tee – International Journal of Science Education, 2016
This paper reports the results of a survey of 3006 Year 10-12 students on their understandings of metallic bonding. The instrument was developed based on Chi's ontological categories of scientific concepts and students' understanding of metallic bonding as reported in the literature. The instrument has two parts. Part one probed into students'…
Descriptors: Chemistry, Item Response Theory, Science Instruction, Foreign Countries
Letukas, Lynn – College Board, 2015
The purpose of this document is to identify and dispel rumors that are frequently cited about the SAT. The following is a compilation of nine popular rumors organized into three areas: "Student Demographics," "Test Preparation/Test Prediction," and "Test Utilization."
Descriptors: College Entrance Examinations, Student Characteristics, Test Preparation, Prediction
Raykov, Tenko; Marcoulides, George A.; Lee, Chun-Lung; Chang, Chi – Educational and Psychological Measurement, 2013
This note is concerned with a latent variable modeling approach for the study of differential item functioning in a multigroup setting. A multiple-testing procedure that can be used to evaluate group differences in response probabilities on individual items is discussed. The method is readily employed when the aim is also to locate possible…
Descriptors: Test Bias, Statistical Analysis, Models, Hypothesis Testing
Jin, Ying – ProQuest LLC, 2013
Previous research has demonstrated that DIF methods that do not account for multilevel data structure could result in too frequent rejection of the null hypothesis (i.e., no DIF) when the intraclass correlation coefficient (ρ) of the studied item was the same as ρ of the total score. The current study extended previous research by comparing the…
Descriptors: Test Bias, Models, Correlation, Test Items

