| Publication Date | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 38 |
| Since 2022 (last 5 years) | 225 |
| Since 2017 (last 10 years) | 570 |
| Since 2007 (last 20 years) | 1377 |
| Audience | Records |
| --- | --- |
| Researchers | 110 |
| Practitioners | 107 |
| Teachers | 46 |
| Administrators | 25 |
| Policymakers | 24 |
| Counselors | 12 |
| Parents | 7 |
| Students | 7 |
| Support Staff | 4 |
| Community | 2 |
| Location | Records |
| --- | --- |
| California | 61 |
| Canada | 60 |
| United States | 57 |
| Turkey | 47 |
| Australia | 43 |
| Florida | 34 |
| Germany | 26 |
| Texas | 26 |
| China | 25 |
| Netherlands | 25 |
| Iran | 22 |
| What Works Clearinghouse Rating | Records |
| --- | --- |
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
| Does not meet standards | 1 |
Tay, Louis; Vermunt, Jeroen K.; Wang, Chun – International Journal of Testing, 2013
We evaluate the item response theory with covariates (IRT-C) procedure for assessing differential item functioning (DIF) without preknowledge of anchor items (Tay, Newman, & Vermunt, 2011). This procedure begins with a fully constrained baseline model, and candidate items are tested for uniform and/or nonuniform DIF using the Wald statistic.…
Descriptors: Item Response Theory, Test Bias, Models, Statistical Analysis
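The Wald-test logic behind the IRT-C procedure can be illustrated without the authors' software. The sketch below is not IRT-C itself: it applies the closely related logistic-regression DIF screen (Swaminathan & Rogers, 1990) to simulated data, testing a group main effect (uniform DIF) and a group-by-ability interaction (nonuniform DIF) with single-degree-of-freedom Wald statistics. All data and parameter values are made up for illustration.

```python
# Not the IRT-C implementation from Tay et al. -- a minimal Wald-test
# sketch via logistic-regression DIF screening on simulated data.
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, n)          # 0 = reference, 1 = focal
theta = rng.normal(0.0, 1.0, n)        # matching/ability variable
# Simulate one item with uniform DIF: harder for the focal group.
p = 1.0 / (1.0 + np.exp(-(1.2 * theta - 0.5 - 0.6 * group)))
y = rng.binomial(1, p)

# Columns: const, ability, group (uniform DIF), ability*group (nonuniform DIF).
X = sm.add_constant(np.column_stack([theta, group, theta * group]))
fit = sm.Logit(y, X).fit(disp=0)

# Wald statistic for one coefficient: (beta / SE)^2 ~ chi-square(1).
for label, j in [("uniform DIF", 2), ("nonuniform DIF", 3)]:
    w = (fit.params[j] / fit.bse[j]) ** 2
    print(f"{label}: Wald = {w:.2f}, p = {chi2.sf(w, 1):.4f}")
```

IRT-C proper instead fits a fully constrained baseline model and frees candidate-item parameters one at a time; the Wald comparison works the same way at each step.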
Molenaar, Dylan; Borsboom, Denny – Educational Research and Evaluation, 2013
Measurement invariance is an important prerequisite for the adequate comparison of group differences in test scores. In psychology, measurement invariance is typically investigated by means of linear factor analyses of subtest scores. These subtest scores typically result from summing the item scores. In this paper, we discuss 4 possible problems…
Descriptors: Test Bias, Factor Analysis, Scores, Item Response Theory
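One of the problems the authors raise, that summing item scores can hide item-level non-invariance, is easy to see numerically. The toy construction below is mine, not the authors': opposite intercept shifts in two items cancel in the subtest sum score, so a sum-score comparison looks invariant while the items are not.

```python
# Toy illustration (assumed model, simulated data): item-level intercept
# differences cancel in the subtest sum score.
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

def subtest(n, shift):
    """Two congeneric items; `shift` biases their intercepts in
    opposite directions (the non-invariance)."""
    eta = rng.normal(0.0, 1.0, n)            # common factor
    e = rng.normal(0.0, 1.0, (n, 2))         # unique parts
    return np.column_stack([eta + shift + e[:, 0],
                            eta - shift + e[:, 1]])

ref, foc = subtest(n, 0.0), subtest(n, 0.5)
print("item means (ref):", ref.mean(axis=0).round(2))   # ~[0.00, 0.00]
print("item means (foc):", foc.mean(axis=0).round(2))   # ~[0.50, -0.50]
print("sum-score means:", ref.sum(axis=1).mean().round(2),
      foc.sum(axis=1).mean().round(2))                  # both ~0.00
```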
Chalmers, R. Philip; Counsell, Alyssa; Flora, David B. – Educational and Psychological Measurement, 2016
Differential test functioning, or DTF, occurs when one or more items in a test demonstrate differential item functioning (DIF) and the aggregate of these effects is witnessed at the test level. In many applications, DTF can be more important than DIF when the overall effects of DIF at the test level can be quantified. However, optimal statistical…
Descriptors: Test Bias, Sampling, Test Items, Statistical Analysis
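How item-level DIF aggregates into DTF can be sketched directly from test characteristic curves. The following is not the authors' method, only an illustration with made-up 2PL item parameters: signed DTF is computed as the expected-score difference between the focal and reference test characteristic curves, averaged over a focal ability distribution.

```python
# Illustrative signed-DTF computation from two test characteristic curves.
import numpy as np

def p_2pl(theta, a, b):
    """2PL probability of a correct response for each item."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))

theta = np.linspace(-4, 4, 201)
a_ref = np.array([1.0, 1.2, 0.8, 1.5])
b_ref = np.array([-0.5, 0.0, 0.5, 1.0])
a_foc, b_foc = a_ref.copy(), b_ref.copy()
b_foc[1] += 0.4   # one item with uniform DIF against the focal group

tcc_ref = p_2pl(theta, a_ref, b_ref).sum(axis=1)   # expected score, reference
tcc_foc = p_2pl(theta, a_foc, b_foc).sum(axis=1)   # expected score, focal

# Signed DTF: TCC difference averaged over a standard-normal focal density.
w = np.exp(-theta**2 / 2.0)
w /= w.sum()
sdtf = ((tcc_foc - tcc_ref) * w).sum()
print(f"signed DTF = {sdtf:.3f} expected-score points")
```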
Oliveri, María Elena; von Davier, Alina A. – International Journal of Testing, 2016
In this study, we propose that the unique needs and characteristics of linguistic minorities should be considered throughout the test development process. Unlike most measurement invariance investigations in the assessment of linguistic minorities, which typically are conducted after test administration, we propose strategies that focus on the…
Descriptors: Psychometrics, Linguistics, Test Construction, Testing
Steedle, Jeffrey; LaSalle, Amy – Partnership for Assessment of Readiness for College and Careers, 2016
Partnership for Assessment of Readiness for College and Careers (PARCC) Operational Study 4 Component 3 was designed to compare performance on PARCC mathematics field-test items for grade 3 taken with and without a drawing tool. For the 2016 testing window, five field-test items were selected and their directions were edited to allow students to…
Descriptors: Grade 3, Mathematics Tests, Test Items, Freehand Drawing
Zumbo, Bruno D.; Liu, Yan; Wu, Amery D.; Shear, Benjamin R.; Olvera Astivia, Oscar L.; Ark, Tavinder K. – Language Assessment Quarterly, 2015
Methods for detecting differential item functioning (DIF) and item bias are typically used in the process of item analysis when developing new measures; adapting existing measures for different populations, languages, or cultures; or more generally validating test score inferences. In 2007 in "Language Assessment Quarterly," Zumbo…
Descriptors: Test Bias, Test Items, Holistic Approach, Models
Arikan, Serkan; van de Vijver, Fons J. R.; Yagmur, Kutlay – Educational Assessment, Evaluation and Accountability, 2017
Lower reading and mathematics performance of Turkish immigrant students as compared to mainstream European students could reflect differential learning outcomes, differential socioeconomic backgrounds of the groups, differential mainstream language proficiency, and/or test bias. Using PISA reading and mathematics scores of these groups, we…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018
Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…
Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education
Smarter Balanced Assessment Consortium, 2018
The Smarter Balanced Assessment Consortium (Smarter Balanced) strives to provide every student with a positive and productive assessment experience, generating results that are a fair and accurate estimate of each student's achievement. Further, Smarter Balanced is building on a framework of accessibility for all students, including English…
Descriptors: Student Evaluation, Evaluation Methods, English Language Learners, Disabilities
Huang, Xiaoting; Wilson, Mark; Wang, Lei – Educational Psychology, 2016
In recent years, large-scale international assessments have been increasingly used to evaluate and compare the quality of education across regions and countries. However, measurement variance between different versions of these assessments often poses threats to the validity of such cross-cultural comparisons. In this study, we investigated the…
Descriptors: Test Bias, International Assessment, Science Tests, Test Validity
Çokluk, Ömay; Gül, Emrah; Dogan-Gül, Çilem – Educational Sciences: Theory and Practice, 2016
The study aims to examine whether differential item functioning is displayed in three test forms whose items are ordered randomly or sequentially (easy-to-hard and hard-to-easy), based on Classical Test Theory (CTT) and Item Response Theory (IRT) methods and taking item difficulty levels into account. In the correlational research, the…
Descriptors: Test Bias, Test Items, Difficulty Level, Test Theory
Rutkowski, Leslie; Rutkowski, David; Zhou, Yan – International Journal of Testing, 2016
Using an empirically-based simulation study, we show that typically used methods of choosing an item calibration sample have significant impacts on achievement bias and system rankings. We examine whether recent PISA accommodations, especially for lower performing participants, can mitigate some of this bias. Our findings indicate that standard…
Descriptors: Simulation, International Programs, Adolescents, Student Evaluation
Kim, Dong-gook; Helms, Marilyn M. – Journal of Education for Business, 2016
Assurance of learning (AoL) processes for continuous improvement and accreditation require business schools to assess program goals. Findings from the process can lead to changes in course design or curriculum. Often AoL assignments are embedded into existing courses and assessed at regular intervals. Faculty members may evaluate an assignment in…
Descriptors: Course Evaluation, Business Administration Education, Business Schools, College Faculty
Dwyer, Andrew C. – Journal of Educational Measurement, 2016
This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common-item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common-item equating methodology to standard setting ratings to account for…
Descriptors: Cutting Scores, Equivalency Tests, Test Format, Academic Standards
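The third approach, rescaling the standard, amounts to running a common-item linear equating and passing the cut score through the resulting transformation. A minimal sketch, assuming made-up anchor-item score distributions and mean-sigma equating (one of several possible linear methods):

```python
# Minimal mean-sigma common-item equating sketch; all data are simulated.
import numpy as np

rng = np.random.default_rng(3)
anchor_old = rng.normal(10.0, 2.0, 60)   # anchor scores, old-form sample
anchor_new = rng.normal(9.2, 2.2, 60)    # anchor scores, new-form sample

# Linear transformation y = A*x + B that puts new-form scores on the
# old form's scale, estimated from the common (anchor) items.
A = anchor_old.std(ddof=1) / anchor_new.std(ddof=1)
B = anchor_old.mean() - A * anchor_new.mean()

cut_new = 12.0                           # standard set on the new form
print("rescaled standard on old-form scale:", round(A * cut_new + B, 2))
```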
Tijdens, Kea; Steinmetz, Stephanie – International Journal of Social Research Methodology, 2016
Whereas the sample composition biases of web surveys have been discussed extensively for developed countries, studies for developing countries are scarce. This article helps to fill that gap by comparing similar non-probability-based web surveys (WEB) and probability-based face-to-face (F2F) surveys both to each other and to the labor force. An…
Descriptors: Online Surveys, Probability, Surveys, Test Bias
