Ramsay, James; Wiberg, Marie; Li, Juan – Journal of Educational and Behavioral Statistics, 2020
Ramsay and Wiberg used a new version of item response theory that represents test performance over nonnegative closed intervals such as [0, 100] or [0, n] and demonstrated that optimal scoring of binary test data yielded substantial improvements in point-wise root-mean-squared error and bias over number right or sum scoring. We extend these…
Descriptors: Scoring, Weighted Scores, Item Response Theory, Intervals
Liou, Gloria; Bonner, Cavan V.; Tay, Louis – International Journal of Testing, 2022
With the advent of big data and advances in technology, psychological assessments have become increasingly sophisticated and complex. Nevertheless, traditional psychometric issues concerning the validity, reliability, and measurement bias of such assessments remain fundamental in determining whether score inferences of human attributes are…
Descriptors: Psychometrics, Computer Assisted Testing, Adaptive Testing, Data
Sachin Nedungadi; Corina E. Brown; Sue Hyeon Paek – Journal of Chemical Education, 2022
The Fundamental Concepts for Organic Reaction Mechanisms Inventory (FC-ORMI) is a concept inventory with most items in a two-tier design in which an answer tier is followed by a reasoning tier. Statistical results provided strong evidence for the validity and reliability of the data obtained using the FC-ORMI. In this study, differential item…
Descriptors: Test Bias, Test Validity, Test Reliability, Gender Differences
Ahmad Suryadi; Sahal Fawaiz; Eka Kurniati; Ahmad Swandi – Journal of Pedagogical Research, 2024
The waning interest of students in science became a global concern. The purpose of this research was to translate, adapt, and validate the My Attitude toward Science [MATS] questionnaire instrument, which was used to measure students' attitudes toward science in the Indonesian context. We also investigated the items that contributed to gender and…
Descriptors: Foreign Countries, Science Education, Achievement Tests, Secondary School Students
Christopher D. Wilson; Kevin C. Haudek; Jonathan F. Osborne; Zoë E. Buck Bracey; Tina Cheuk; Brian M. Donovan; Molly A. M. Stuhlsatz; Marisol M. Santiago; Xiaoming Zhai – Journal of Research in Science Teaching, 2024
Argumentation is fundamental to science education, both as a prominent feature of scientific reasoning and as an effective mode of learning--a perspective reflected in contemporary frameworks and standards. The successful implementation of argumentation in school science, however, requires a paradigm shift in science assessment from the…
Descriptors: Middle School Students, Competence, Science Process Skills, Persuasive Discourse
J. A. Bialo; H. Li – Educational Assessment, 2024
This study evaluated differential item functioning (DIF) in achievement motivation items before and after using anchoring vignettes as a statistical tool to account for group differences in response styles across gender and ethnicity. We applied the nonparametric scoring of the vignettes to motivation items from the 2015 Programme for…
Descriptors: Test Bias, Student Motivation, Achievement Tests, Secondary School Students
Kartianom Kartianom; Heri Retnawati; Kana Hidayati – Journal of Pedagogical Research, 2024
Conducting a fair test is important for educational research. Unfair assessments can lead to gender disparities in academic achievement, ultimately resulting in disparities in opportunities, wages, and career choice. Differential Item Function [DIF] analysis is presented to provide evidence of whether the test is truly fair, where it does not harm…
Descriptors: Foreign Countries, Test Bias, Item Response Theory, Test Theory
Esteban Guevara Hidalgo – International Journal for Educational Integrity, 2025
The COVID-19 pandemic had a profound impact on education, forcing many teachers and students who were not used to online education to adapt to an unanticipated reality by improvising new teaching and learning methods. Within the realm of virtual education, the evaluation methods underwent a transformation, with some assessments shifting towards…
Descriptors: Foreign Countries, Higher Education, COVID-19, Pandemics
Valentine, Nyoli; Durning, Steven; Shanahan, Ernst Michael; Schuwirth, Lambert – Advances in Health Sciences Education, 2021
Human judgement is widely used in workplace-based assessment despite criticism that it does not meet standards of objectivity. There is an ongoing push within the literature to better embrace subjective human judgement in assessment not as a 'problem' to be corrected psychometrically but as legitimate perceptions of performance. Taking a step back…
Descriptors: Justice, Literature Reviews, Evaluation Methods, Test Bias
Vo, Thao T.; French, Brian F. – Educational Measurement: Issues and Practice, 2021
The use and interpretation of educational and psychological test scores are paramount to individual outcomes and opportunities. Methods for detecting differential item functioning (DIF) are imperative for item analysis when developing and revising assessments, particularly as it pertains to fairness across populations, languages, and cultures. We…
Descriptors: Risk Assessment, Needs Assessment, Test Bias, Youth
Wessels, Marleen D.; Paap, Muirne C. S.; Van der Putten, Annette A. J. – Journal of Intellectual & Developmental Disability, 2021
Background: Research about the psychometric properties of the Behavioural Appraisal Scales (BAS) in people with profound intellectual and multiple disabilities (PIMD) is limited. This study evaluates invariance in factor structure, item bias and convergent validity of the BAS. Methods: Data on the BAS from two studies (n = 25; n = 52) were…
Descriptors: Test Validity, Ability Identification, Severe Intellectual Disability, Multiple Disabilities
Child, Simon; Ellis, Paul – SAGE Publications Ltd (UK), 2021
How do teachers develop their understanding of the foundation principles of assessment, stay up to date with the latest classroom approaches and have the confidence to evaluate and question the effectiveness of new methods? This professional resource for teachers supports them to understand the what, why and how of assessment. It provides key…
Descriptors: Assessment Literacy, Student Evaluation, Evaluation Methods, Self Efficacy
Brennan, Robert L.; Kim, Stella Y.; Lee, Won-Chan – Educational and Psychological Measurement, 2022
This article extends multivariate generalizability theory (MGT) to tests with different random-effects designs for each level of a fixed facet. There are numerous situations in which the design of a test and the resulting data structure are not definable by a single design. One example is mixed-format tests that are composed of multiple-choice and…
Descriptors: Multivariate Analysis, Generalizability Theory, Multiple Choice Tests, Test Construction
Barger, Brian; Graybill, Emily; Roach, Andrew; Lane, Kathleen – Assessment for Effective Intervention, 2022
This study used item response theory (IRT) methods to investigate group differences in responses to the 12-item Student Risk Screening Scale--Internalizing and Externalizing (SRSS-IE12) in a sample of 3,837 U.S. elementary school students. Using factor analysis and graded response models from IRT methods, we examined the factor structure and the…
Descriptors: Test Bias, Item Response Theory, Screening Tests, Elementary School Students
Robitzsch, Alexander; Lüdtke, Oliver – Journal of Educational and Behavioral Statistics, 2022
One of the primary goals of international large-scale assessments in education is the comparison of country means in student achievement. This article introduces a framework for discussing differential item functioning (DIF) for such mean comparisons. We compare three different linking methods: concurrent scaling based on full invariance,…
Descriptors: Test Bias, International Assessment, Scaling, Comparative Analysis