Publication Date
In 2025 | 0 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 6 |
Since 2016 (last 10 years) | 10 |
Descriptor
Test Bias | 10 |
Testing Problems | 10 |
Test Items | 4 |
Foreign Countries | 3 |
High Stakes Tests | 3 |
Algebra | 2 |
Artificial Intelligence | 2 |
Elementary Secondary Education | 2 |
Equated Scores | 2 |
Ethics | 2 |
Individualized Instruction | 2 |
Source
Educational Measurement:… | 2 |
Applied Measurement in… | 1 |
Assessment in Education:… | 1 |
Grantee Submission | 1 |
Interchange: A Quarterly… | 1 |
Journal of Experimental… | 1 |
Language Testing | 1 |
Measurement:… | 1 |
Online Submission | 1 |
Author
Anne Corinne Huggins-Manley | 2 |
Daniel Katz | 2 |
Walter Leite | 2 |
Allison Ames | 1 |
Angela Johnson | 1 |
Baird, Jo-Anne | 1 |
Brandon Crawford | 1 |
El Masri, Yasmine H. | 1 |
Elizabeth Barker | 1 |
Graesser, Art | 1 |
James D. Weese | 1 |
Publication Type
Journal Articles | 9 |
Reports - Research | 6 |
Reports - Evaluative | 3 |
Reports - Descriptive | 1 |
Education Level
Elementary Secondary Education | 3 |
Higher Education | 1 |
Postsecondary Education | 1 |
Secondary Education | 1 |
Assessments and Surveys
Program for International… | 1 |
James D. Weese; Ronna C. Turner; Allison Ames; Xinya Liang; Brandon Crawford – Journal of Experimental Education, 2024
In this study a standardized effect size was created for use with the SIBTEST procedure. Using this standardized effect size, a single set of heuristics was developed that are appropriate for data fitting different item response models (e.g., 2-parameter logistic, 3-parameter logistic). The standardized effect size rescales the raw beta-uni value…
Descriptors: Test Bias, Test Items, Item Response Theory, Effect Size
Kim, Sooyeon; Walker, Michael E. – Educational Measurement: Issues and Practice, 2022
Test equating requires collecting data to link the scores from different forms of a test. Problems arise when equating samples are not equivalent and the test forms to be linked share no common items by which to measure or adjust for the group nonequivalence. Using data from five operational test forms, we created five pairs of research forms for…
Descriptors: Ability, Tests, Equated Scores, Testing Problems
Angela Johnson; Elizabeth Barker; Marcos Viveros Cespedes – Educational Measurement: Issues and Practice, 2024
Educators and researchers strive to build policies and practices on data and evidence, especially on academic achievement scores. When assessment scores are inaccurate for specific student populations or when scores are inappropriately used, even data-driven decisions will be misinformed. To maximize the impact of the research-practice-policy…
Descriptors: Equal Education, Inclusion, Evaluation Methods, Error of Measurement
Salmani Nodoushan, Mohammad Ali – Online Submission, 2021
This paper follows a line of logical argumentation to claim that what Samuel Messick conceptualized about construct validation has probably been misunderstood by some educational policy makers, practicing educators, and classroom teachers. It argues that, while Messick's unified theory of test validation aimed at (a) warning educational…
Descriptors: Construct Validity, Test Theory, Test Use, Affordances
Daniel Katz; Anne Corinne Huggins-Manley; Walter Leite – Grantee Submission, 2022
According to the Standards for Educational and Psychological Testing (2014), one aspect of test fairness concerns examinees having comparable opportunities to learn prior to taking tests. Meanwhile, many researchers are developing platforms enhanced by artificial intelligence (AI) that can personalize curriculum to individual student needs. This…
Descriptors: High Stakes Tests, Test Bias, Testing Problems, Prior Learning
Daniel Katz; Anne Corinne Huggins-Manley; Walter Leite – Applied Measurement in Education, 2022
According to the "Standards for Educational and Psychological Testing" (2014), one aspect of test fairness concerns examinees having comparable opportunities to learn prior to taking tests. Meanwhile, many researchers are developing platforms enhanced by artificial intelligence (AI) that can personalize curriculum to individual student…
Descriptors: High Stakes Tests, Test Bias, Testing Problems, Prior Learning
Luo, Yong; Liang, Xinya – Measurement: Interdisciplinary Research and Perspectives, 2019
Current methods that simultaneously model differential testlet functioning (DTLF) and differential item functioning (DIF) constrain the variances of latent ability and testlet effects to be equal between the focal and the reference groups. Such a constraint can be stringent and unrealistic with real data. In this study, we propose a multigroup…
Descriptors: Test Items, Item Response Theory, Test Bias, Models
Zhao, Cecilia Guanfang; Liu, Carina Jiayu – Language Testing, 2019
Celpe-Bras is the exam for the certification of proficiency in Portuguese as a foreign language. It is the only Portuguese proficiency test recognized by the Brazilian government (Ministério da Educação, 2013). Given the recent growth of interest and also its unique design as a large-scale proficiency test, this article provides a general…
Descriptors: Portuguese, Second Language Learning, Language Proficiency, Language Tests
Safari, Parvin – Interchange: A Quarterly Review of Education, 2016
Recently, there has been a shift from traditional language testing approaches, which focus on psychometric properties, towards critical language testing (CLT), with its social-practice nature. CLT treats tests not as neutral devices but as instruments of power and control which are related to authorities' policy agendas to shape individuals' and…
Descriptors: Foreign Countries, Language Tests, Educational Practices, High Stakes Tests
El Masri, Yasmine H.; Baird, Jo-Anne; Graesser, Art – Assessment in Education: Principles, Policy & Practice, 2016
We investigate the extent to which language versions (English, French and Arabic) of the same science test are comparable in terms of item difficulty and demands. We argue that language is an inextricable part of the scientific literacy construct, be it intended or not by the examiner. This argument has considerable implications on methodologies…
Descriptors: International Assessment, Difficulty Level, Test Items, Language Variation