Dimitrov, Dimiter M.; Atanasov, Dimitar V. – Educational and Psychological Measurement, 2022
This study offers an approach to testing for differential item functioning (DIF) in a recently developed measurement framework, referred to as "D"-scoring method (DSM). Under the proposed approach, called "P-Z" method of testing for DIF, the item response functions of two groups (reference and focal) are compared by…
Descriptors: Test Bias, Methods, Test Items, Scoring
Randall, Jennifer – Educational Assessment, 2023
In a justice-oriented antiracist assessment process, attention to the disruption of white supremacy must occur at every stage--from construct articulation to score reporting. An important step in the assessment development process is the item review stage often referred to as Bias/Fairness and Sensitivity Review. I argue that typical approaches to…
Descriptors: Social Justice, Racism, Test Bias, Test Items
James D. Weese; Ronna C. Turner; Allison Ames; Xinya Liang; Brandon Crawford – Journal of Experimental Education, 2024
In this study a standardized effect size was created for use with the SIBTEST procedure. Using this standardized effect size, a single set of heuristics was developed that are appropriate for data fitting different item response models (e.g., 2-parameter logistic, 3-parameter logistic). The standardized effect size rescales the raw beta-uni value…
Descriptors: Test Bias, Test Items, Item Response Theory, Effect Size
Gregory J. Crowther; Benjamin L. Wiggins – Journal of Microbiology & Biology Education, 2024
Students in STEM know well the stress, challenge, and effort that accompany college exams. As a widely recognizable feature of the STEM classroom experience, high-stakes assessments serve as crucial cultural gateways in shaping both preparation and motivation for careers. In this essay, we identify and discuss issues of power around STEM exams to…
Descriptors: STEM Education, High Stakes Tests, Test Bias, Power Structure
Christin Rickman – ProQuest LLC, 2024
This dissertation examines the landmark case Larry P. v. Riles and its impact on addressing the disproportionality and overrepresentation of Black and/or African American students in special education within California. Despite the court's ruling, which prohibited the use of IQ tests for Black students for special education placement due to…
Descriptors: Special Education, African American Students, Racial Discrimination, Alternative Assessment
Finch, W. Holmes – Educational and Psychological Measurement, 2023
Psychometricians have devoted much research and attention to categorical item responses, leading to the development and widespread use of item response theory for the estimation of model parameters and identification of items that do not perform in the same way for examinees from different population subgroups (e.g., differential item functioning…
Descriptors: Test Bias, Item Response Theory, Computation, Methods
Chalmers, R. Philip – Journal of Educational Measurement, 2023
Several marginal effect size (ES) statistics suitable for quantifying the magnitude of differential item functioning (DIF) have been proposed in the area of item response theory; for instance, the Differential Functioning of Items and Tests (DFIT) statistics, signed and unsigned item difference in the sample statistics (SIDS, UIDS, NSIDS, and…
Descriptors: Test Bias, Item Response Theory, Definitions, Monte Carlo Methods
Veronica Y. Kang; Sunyoung Kim; Emily V. Gregori; Daniel M. Maggin; Jason C. Chow; Hongyang Zhao – Journal of Speech, Language, and Hearing Research, 2025
Purpose: Early language intervention is essential for children with indicators of language delay. Enhanced milieu teaching (EMT) is a naturalistic intervention that supports the language development of children with emerging language. We conducted a systematic review and meta-analysis of all qualifying single-case and group design studies that…
Descriptors: Literature Reviews, Meta Analysis, Early Intervention, Response to Intervention
Martijn Schoenmakers; Jesper Tijmstra; Jeroen Vermunt; Maria Bolsinova – Educational and Psychological Measurement, 2024
Extreme response style (ERS), the tendency of participants to select extreme item categories regardless of the item content, has frequently been found to decrease the validity of Likert-type questionnaire results. For this reason, various item response theory (IRT) models have been proposed to model ERS and correct for it. Comparisons of these…
Descriptors: Item Response Theory, Response Style (Tests), Models, Likert Scales
Menold, Natalja – Field Methods, 2023
While numerical bipolar rating scales may evoke positivity bias, little is known about the corresponding bias in verbal bipolar rating scales. The choice of verbalization of the middle category may lead to response bias, particularly if it is not in line with the scale polarity. Unipolar and bipolar seven-category rating scales in which the…
Descriptors: Rating Scales, Test Bias, Verbal Tests, Responses
Li, Hongli; Hunter, Charles Vincent; Bialo, Jacquelyn Anne – Language Assessment Quarterly, 2022
The purpose of this study is to review the status of differential item functioning (DIF) research in language testing, particularly as it relates to the investigation of sources (or causes) of DIF, which is a defining characteristic of the third generation DIF. This review included 110 DIF studies of language tests dated from 1985 to 2019. We…
Descriptors: Test Bias, Language Tests, Statistical Analysis, Evaluation Research
Butucescu, Andreea; Iliescu, Dragoș – Educational Studies, 2022
The current study examines the perceived fairness of an educational assessment process, considering the influence of positive and negative affect. The first objective was to determine if a person's evaluation of fairness fluctuates depending on the incidental affect (pre-evaluation affect). The second objective was to study the connection between…
Descriptors: Emotional Response, Test Bias, Testing, Evaluation
Hung-Yu Huang – Educational and Psychological Measurement, 2025
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…
Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability
Alexander Robitzsch; Oliver Lüdtke – Measurement: Interdisciplinary Research and Perspectives, 2024
Educational large-scale assessment (LSA) studies like the Programme for International Student Assessment (PISA) provide important information about trends in the performance of educational indicators in cognitive domains. The change in the country means in a cognitive domain like reading between two successive assessments is an example of a trend…
Descriptors: Secondary School Students, Foreign Countries, International Assessment, Achievement Tests
Andrew D. Ho – Journal of Educational and Behavioral Statistics, 2024
I review opportunities and threats that widely accessible Artificial Intelligence (AI)-powered services present for educational statistics and measurement. Algorithmic and computational advances continue to improve approaches to item generation, scale maintenance, test security, test scoring, and score reporting. Predictable misuses of AI for…
Descriptors: Artificial Intelligence, Measurement, Educational Assessment, Technology Uses in Education