Publication Date
| Date Range | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 34 |
| Since 2022 (last 5 years) | 221 |
| Since 2017 (last 10 years) | 566 |
| Since 2007 (last 20 years) | 1373 |
Audience
| Audience | Records |
| --- | --- |
| Researchers | 110 |
| Practitioners | 107 |
| Teachers | 46 |
| Administrators | 25 |
| Policymakers | 24 |
| Counselors | 12 |
| Parents | 7 |
| Students | 7 |
| Support Staff | 4 |
| Community | 2 |
Location
| Location | Records |
| --- | --- |
| California | 61 |
| Canada | 60 |
| United States | 57 |
| Turkey | 47 |
| Australia | 43 |
| Florida | 34 |
| Germany | 26 |
| Texas | 26 |
| China | 25 |
| Netherlands | 25 |
| Iran | 22 |
What Works Clearinghouse Rating
| Rating | Records |
| --- | --- |
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
| Does Not Meet Standards | 1 |
Menold, Natalja – Field Methods, 2023
While numerical bipolar rating scales may evoke positivity bias, little is known about the corresponding bias in verbal bipolar rating scales. The choice of verbalization of the middle category may lead to response bias, particularly if it is not in line with the scale polarity. Unipolar and bipolar seven-category rating scales in which the…
Descriptors: Rating Scales, Test Bias, Verbal Tests, Responses
Li, Hongli; Hunter, Charles Vincent; Bialo, Jacquelyn Anne – Language Assessment Quarterly, 2022
The purpose of this study is to review the status of differential item functioning (DIF) research in language testing, particularly as it relates to the investigation of sources (or causes) of DIF, a defining characteristic of third-generation DIF. This review included 110 DIF studies of language tests dated from 1985 to 2019. We…
Descriptors: Test Bias, Language Tests, Statistical Analysis, Evaluation Research
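Several entries in this list concern differential item functioning. As an illustrative sketch only (not a procedure from any reviewed study; the function name and simulated data below are assumptions for the example), a minimal Mantel-Haenszel DIF check for one dichotomous item can be written in Python:

```python
import numpy as np

def mantel_haenszel_dif(item, total, group):
    """Mantel-Haenszel DIF statistic for one dichotomous item.

    item  : 0/1 responses to the studied item
    total : matching variable (e.g., total test score)
    group : 0 = reference group, 1 = focal group
    Returns the MH common odds ratio and the ETS delta-MH value.
    """
    item, total, group = map(np.asarray, (item, total, group))
    num = den = 0.0
    for s in np.unique(total):                      # stratify on the matching score
        m = total == s
        a = np.sum(m & (group == 0) & (item == 1))  # reference, correct
        b = np.sum(m & (group == 0) & (item == 0))  # reference, incorrect
        c = np.sum(m & (group == 1) & (item == 1))  # focal, correct
        d = np.sum(m & (group == 1) & (item == 0))  # focal, incorrect
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n
        den += b * c / n
    alpha = num / den                 # MH common odds ratio (assumes den > 0)
    delta = -2.35 * np.log(alpha)     # ETS delta metric; negative favors reference
    return alpha, delta
```

An odds ratio near 1 (delta near 0) suggests no DIF after matching on ability; production analyses would add the usual continuity correction and a chi-square significance test.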
Butucescu, Andreea; Iliescu, Dragoș – Educational Studies, 2022
The current study examines the perceived fairness of an educational assessment process, considering the influence of positive and negative affect. The first objective was to determine if a person's evaluation of fairness fluctuates depending on the incidental affect (pre-evaluation affect). The second objective was studying the connection between…
Descriptors: Emotional Response, Test Bias, Testing, Evaluation
Hung-Yu Huang – Educational and Psychological Measurement, 2025
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…
Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability
Novina Sabila Zahra; Hillman Wirawan – Measurement: Interdisciplinary Research and Perspectives, 2025
Technology development has triggered digital transformation in various organizations, influencing work processes, communication, and innovation. Digital leadership plays a crucial role in directing and managing this transformation. This research aims to develop a new measurement tool for assessing digital leadership using the Rasch Model for…
Descriptors: Leadership, Measures (Individuals), Test Validity, Item Response Theory
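Several of the entries above rely on the Rasch model. As a minimal illustration (not the instrument developed in any listed study), the dichotomous Rasch item response function can be sketched as:

```python
import math

def rasch_prob(theta, b):
    """P(X = 1) under the dichotomous Rasch model: the success
    probability depends only on the gap between person ability
    theta and item difficulty b, both on the logit scale."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))
```

When ability equals item difficulty the model gives a 50% chance of success, and the probability rises monotonically as ability exceeds difficulty; Rasch scale construction fits these difficulties and abilities jointly from observed responses.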
Akif Avcu – International Journal of Psychology and Educational Studies, 2025
This review explores the significant contributions of Rasch modeling in enhancing classroom assessment practices, particularly in measuring student attitudes. Classroom assessment has evolved from standardized testing to integrative practices that emphasize both academic and affective dimensions of student development. Accurate attitude…
Descriptors: Item Response Theory, Student Attitudes, Student Evaluation, Attitude Measures
Alexander Robitzsch; Oliver Lüdtke – Measurement: Interdisciplinary Research and Perspectives, 2024
Educational large-scale assessment (LSA) studies like the Program for International Student Assessment (PISA) provide important information about trends in the performance of educational indicators in cognitive domains. The change in country means in a cognitive domain like reading between two successive assessments is an example of a trend…
Descriptors: Secondary School Students, Foreign Countries, International Assessment, Achievement Tests
Andrew D. Ho – Journal of Educational and Behavioral Statistics, 2024
I review opportunities and threats that widely accessible Artificial Intelligence (AI)-powered services present for educational statistics and measurement. Algorithmic and computational advances continue to improve approaches to item generation, scale maintenance, test security, test scoring, and score reporting. Predictable misuses of AI for…
Descriptors: Artificial Intelligence, Measurement, Educational Assessment, Technology Uses in Education
Amirhossein Rasooli; Christopher DeLuca – Applied Measurement in Education, 2024
Inspired by recent 21st-century social and educational movements toward equity, diversity, and inclusion for disadvantaged groups, educational researchers have sought to conceptualize fairness in classroom assessment contexts. These efforts have produced promising theoretical foundations and empirical investigations to examine fairness…
Descriptors: Test Bias, Student Evaluation, Social Justice, Equal Education
Weese, James D.; Turner, Ronna C.; Liang, Xinya; Ames, Allison; Crawford, Brandon – Educational and Psychological Measurement, 2023
A study was conducted to implement the use of a standardized effect size and corresponding classification guidelines for polytomous data with the POLYSIBTEST procedure and compare those guidelines with prior recommendations. Two simulation studies were included. The first identifies new unstandardized test heuristics for classifying moderate and…
Descriptors: Effect Size, Classification, Guidelines, Statistical Analysis
Ebru Dogruöz; Hülya Kelecioglu – International Journal of Assessment Tools in Education, 2024
In this research, multistage adaptive tests (MST) were compared according to sample size, panel pattern and module length for top-down and bottom-up test assembly methods. Within the scope of the research, data from PISA 2015 were used and simulation studies were conducted according to the parameters estimated from these data. Analysis results for…
Descriptors: Adaptive Testing, Test Construction, Foreign Countries, Achievement Tests
ETS Research Institute, 2024
ETS experts are exploring and defining the standards for responsible AI use in assessments. A comprehensive framework and principles will be unveiled in the coming months. In the meantime, this document outlines the critical areas these standards will encompass, including the principles of: (1) Fairness and bias mitigation; (2) Privacy and…
Descriptors: Artificial Intelligence, Computer Assisted Testing, Educational Testing, Ethics
Corinne Huggins-Manley; Anthony W. Raborn; Peggy K. Jones; Ted Myers – Journal of Educational Measurement, 2024
The purpose of this study is to develop a nonparametric DIF method that (a) compares focal groups directly to the composite group that will be used to develop the reported test score scale, and (b) allows practitioners to explore for DIF related to focal groups stemming from multicategorical variables that constitute a small proportion of the…
Descriptors: Nonparametric Statistics, Test Bias, Scores, Statistical Significance
Jacklin H. Stonewall; Michael C. Dorneich; Jane Rongerude – Assessment & Evaluation in Higher Education, 2024
Peer assessment training was motivated, developed and evaluated to address fairness in higher education group learning. Team-centric pedagogies, such as team-based learning have been shown to improve engagement and learning outcomes. For many instructors using teams, peer assessments are integral for monitoring team performance and ensuring…
Descriptors: Peer Evaluation, Training, Student Attitudes, Program Effectiveness
Paula Elosua – Language Assessment Quarterly, 2024
In sociolinguistic contexts where standardized languages coexist with regional dialects, the study of differential item functioning is a valuable tool for examining certain linguistic uses or varieties as threats to score validity. From an ecological perspective, this paper describes three stages in the study of differential item functioning…
Descriptors: Reading Tests, Reading Comprehension, Scores, Test Validity
