Showing 1 to 15 of 3,707 results
Peer reviewed
Haeju Lee; Kyung Yong Kim – Journal of Educational Measurement, 2025
When no prior information about differential item functioning (DIF) is available for the items in a test, either a rank-based or an iterative purification procedure may be preferred. Rank-based purification selects anchor items based on a preliminary DIF test. For a preliminary DIF test, likelihood ratio test (LRT) based approaches (e.g.,…
Descriptors: Test Items, Equated Scores, Test Bias, Accuracy
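The rank-based purification described in this abstract can be illustrated with a stand-in procedure: screen every item with a likelihood-ratio DIF test, rank items by the statistic, and take the least-flagged items as anchors. A minimal Python sketch, assuming a logistic-regression LRT (not the IRT-based LRT the article evaluates) and an arbitrary anchor count k:

import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def _logit_nll(beta, X, y):
    # Negative log-likelihood of a logistic regression.
    z = X @ beta
    return np.sum(np.logaddexp(0.0, z)) - y @ z

def _min_nll(X, y):
    # Fit by direct minimization; return the minimized negative log-likelihood.
    return minimize(_logit_nll, np.zeros(X.shape[1]), args=(X, y), method="BFGS").fun

def lrt_dif_statistics(responses, group):
    # responses: (n_persons, n_items) 0/1 matrix; group: (n_persons,) 0/1 indicator.
    total = responses.sum(axis=1)
    stats = []
    for j in range(responses.shape[1]):
        y = responses[:, j]
        rest = (total - y).astype(float)  # rest score: item j excluded from the matching variable
        base = np.column_stack([np.ones_like(rest), rest])
        full = np.column_stack([base, group, rest * group])  # uniform + nonuniform DIF terms
        lrt = max(2.0 * (_min_nll(base, y) - _min_nll(full, y)), 0.0)  # guard optimizer noise
        stats.append((j, lrt, chi2.sf(lrt, df=2)))
    return stats

def rank_based_anchors(responses, group, k=4):
    # Anchors = the k items with the smallest preliminary DIF statistic.
    ranked = sorted(lrt_dif_statistics(responses, group), key=lambda t: t[1])
    return [j for j, _, _ in ranked[:k]]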
Peer reviewed
Sara E. Witmer; Nathalie L. Marinho – Educational Assessment, Evaluation and Accountability, 2025
When large-scale assessment programs are developed and administered in a particular language, students from other native language backgrounds may face considerable barriers to appropriate measurement of the targeted knowledge and skills. Empirical work is needed to determine whether one of the most commonly applied accommodations to address…
Descriptors: Testing Accommodations, English Learners, National Competency Tests, Time
Peer reviewed
PDF on ERIC
Jiban Khadka; Dirgha Raj Joshi; Krishna Prasad Adhikari; Bishnu Khanal – Journal of Educators Online, 2025
This study explores the impact of the fairness of semester-end e-assessment in terms of policy provision, monitoring, and authenticity. A cross-sectional online survey design was employed with 346 students at Nepal Open University (NOU). The results were analyzed using t-tests, analysis of variance, and structural equation modeling.…
Descriptors: Foreign Countries, College Students, Open Universities, Computer Assisted Testing
Zeyuan Jing – ProQuest LLC, 2023
This dissertation presents a comprehensive review of the evolution of DIF analysis within educational measurement from the 1980s to the present. The review elucidates the concept of DIF, particularly emphasizing the crucial role of grouping in exhibiting DIF. The dissertation then introduces an innovative modification to the newly developed…
Descriptors: Item Response Theory, Algorithms, Measurement, Test Bias
Peer reviewed
Sanford R. Student – Grantee Submission, 2025
Vertical scales are intended to establish a common metric for scores on test forms targeting different levels of development in a specified domain. They are often constructed using common item, nonequivalent group designs that implicitly rely on the linking items being effectively free from differential item functioning (DIF) or the DIF being…
Descriptors: Scaling, Factor Analysis, Test Bias, Test Items
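For context on the common-item nonequivalent groups design this abstract refers to, a standard mean-sigma linking (a textbook form, not specific to this article) rescales Form X onto Form Y's metric using the difficulty estimates b of the linking items shared by both forms:

A = \frac{s(b_Y)}{s(b_X)}, \qquad B = \bar{b}_Y - A\,\bar{b}_X, \qquad b_X^* = A\,b_X + B, \qquad \theta_X^* = A\,\theta_X + B

where s(\cdot) and the bar denote the standard deviation and mean of the linking items' difficulties on each form. DIF in a linking item shifts its difficulty on one form but not the other, distorting A and B and hence the entire vertical scale.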
Peer reviewed
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2024
Rapid guessing (RG) is a form of non-effortful responding that is characterized by short response latencies. This construct-irrelevant behavior has been shown in previous research to bias inferences concerning measurement properties and scores. To mitigate these deleterious effects, a number of response time threshold scoring procedures have been…
Descriptors: Reaction Time, Scores, Item Response Theory, Guessing (Tests)
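Response-time threshold scoring, as described in this abstract, flags responses faster than an item-level latency threshold as rapid guesses and excludes or rescores them. A minimal Python sketch of one simple variant that scores proportion correct over effortful responses only; the 10%-of-median-latency threshold rule and the function names are assumptions for illustration:

import numpy as np

def rapid_guess_flags(rt, thresholds):
    # rt: (n_persons, n_items) response times in seconds; thresholds: (n_items,).
    # True wherever a response is faster than its item's threshold.
    return rt < thresholds

def effort_moderated_score(scores, rt, frac=0.10):
    # Threshold each item at frac * its median response time (illustrative rule),
    # treat flagged responses as missing, and score over effortful responses only.
    thresholds = frac * np.median(rt, axis=0)
    effortful = ~rapid_guess_flags(rt, thresholds)
    n_effortful = effortful.sum(axis=1)
    with np.errstate(invalid="ignore", divide="ignore"):
        prop_correct = (scores * effortful).sum(axis=1) / n_effortful
    # Persons with no effortful responses get NaN rather than a misleading zero.
    return np.where(n_effortful > 0, prop_correct, np.nan)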
Peer reviewed
Leala Holcomb; Wyatte C. Hall; Stephanie J. Gardiner-Walsh; Jessica Scott – Journal of Deaf Studies and Deaf Education, 2025
This study critically examines the biases and methodological shortcomings in studies comparing deaf and hearing populations, demonstrating their implications for both the reliability and ethics of research in deaf education. Upon reviewing the 20 most-cited deaf-hearing comparison studies, we identified recurring fallacies such as the presumption…
Descriptors: Literature Reviews, Deafness, Social Bias, Test Bias
Peer reviewed
Ikkyu Choi; Jiyun Zu – Language Testing, 2025
Today's language models can produce syntactically accurate and semantically coherent texts. This capability presents new opportunities for generating content for language assessments, which have traditionally required intensive expert resources. However, these models are also known to generate biased texts, leading to representational harms.…
Descriptors: Artificial Intelligence, Language Tests, Test Bias, Test Construction
Peer reviewed
Agustín Barroilhet; Mónica Silva; Kurt F. Geisinger – Higher Education Policy, 2025
Merit-based procedures should be constantly reevaluated in light of changing circumstances to remain both valid and fair, two interrelated concepts. Inducing reevaluation, however, is difficult. These procedures are controlled by legitimate authorities, are rule- and contract-bound, and can become quickly entrenched. This resistance to change calls for…
Descriptors: Foreign Countries, Student Evaluation, Student Rights, Justice
Peer reviewed
Stefanie A. Wind; Yuan Ge – Measurement: Interdisciplinary Research and Perspectives, 2024
Mixed-format assessments made up of multiple-choice (MC) items and constructed-response (CR) items that are scored using rater judgments involve unique psychometric considerations. When these item types are combined to estimate examinee achievement, information about the psychometric quality of each component can depend on that of the other. For…
Descriptors: Interrater Reliability, Test Bias, Multiple Choice Tests, Responses
Peer reviewed
William C. M. Belzak; Daniel J. Bauer – Journal of Educational and Behavioral Statistics, 2024
Testing for differential item functioning (DIF) has undergone rapid statistical developments recently. Moderated nonlinear factor analysis (MNLFA) allows for simultaneous testing of DIF among multiple categorical and continuous covariates (e.g., sex, age, ethnicity), and regularization has shown promising results for identifying DIF among…
Descriptors: Test Bias, Algorithms, Factor Analysis, Error of Measurement
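In generic terms, the regularization approach this abstract references adds a lasso penalty to the MNLFA log-likelihood so that item-covariate DIF effects are shrunk exactly to zero unless the data support them. A schematic of the penalized estimator (notation assumed here for illustration, not taken from the article):

\hat{\Theta} = \arg\min_{\Theta}\; \Bigl[ -\ell(\Theta) + \lambda \sum_{j}\sum_{k} \bigl( |\beta_{jk}| + |\delta_{jk}| \bigr) \Bigr]

where \beta_{jk} and \delta_{jk} are the effects of covariate k on item j's intercept and loading (uniform and nonuniform DIF, respectively), and the tuning parameter \lambda controls how aggressively DIF effects are zeroed out.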
Peer reviewed
Sooyong Lee; Suhwa Han; Seung W. Choi – Journal of Educational Measurement, 2024
Research has shown that multiple-indicator multiple-cause (MIMIC) models can result in inflated Type I error rates in detecting differential item functioning (DIF) when the assumption of equal latent variance is violated. This study explains how the violation of the equal variance assumption adversely impacts the detection of nonuniform DIF and…
Descriptors: Factor Analysis, Bayesian Statistics, Test Bias, Item Response Theory
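For context, a MIMIC model for DIF detection of the kind studied here regresses both the latent variable and the item responses on the grouping covariate z; a generic formulation (notation assumed for illustration):

\eta_i = \gamma z_i + \zeta_i, \quad \zeta_i \sim N(0, \psi); \qquad y_{ij}^* = \lambda_j \eta_i + \beta_j z_i + \varepsilon_{ij}

A nonzero direct effect \beta_j signals uniform DIF. The standard specification fixes the residual latent variance \psi to be equal across groups; when the groups' latent variances actually differ, that constraint is violated, which is the source of the inflated Type I error rates the abstract describes.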
Peer reviewed
PDF on ERIC
Farida Agus Setiawati; Tria Widyastuti; Kartika Nur Fathiyah; Tiara Shafa Nabila – European Journal of Psychology and Educational Research, 2024
Respondents to questionnaires sometimes answer items according to social norms rather than their own characteristics. High social desirability (SD) in non-cognitive measurement can cause item bias. Several methods are used to reduce item bias, including allowing respondents to remain anonymous rather than writing their names,…
Descriptors: Social Desirability, Test Bias, Self Concept, Undergraduate Students
Peer reviewed
William C. M. Belzak – Educational Measurement: Issues and Practice, 2023
Test developers and psychometricians have historically examined measurement bias and differential item functioning (DIF) across a single categorical variable (e.g., gender), independently of other variables (e.g., race, age). This is problematic when more complex forms of measurement bias may adversely affect test responses and, ultimately,…
Descriptors: Test Bias, High Stakes Tests, Artificial Intelligence, Test Items
Peer reviewed
Jung Yeon Park; Sean Joo; Zikun Li; Hyejin Yoon – Educational Measurement: Issues and Practice, 2025
This study examines potential assessment bias based on students' primary language status in PISA 2018. Specifically, multilingual (MLs) and nonmultilingual (non-MLs) students in the United States are compared with regard to their response time as well as scored responses across three cognitive domains (reading, mathematics, and science).…
Descriptors: Achievement Tests, Secondary School Students, International Assessment, Test Bias