Publication Date
In 2025 | 0 |
Since 2024 | 3 |
Since 2021 (last 5 years) | 6 |
Since 2016 (last 10 years) | 17 |
Since 2006 (last 20 years) | 40 |
Descriptor
Test Bias | 43 |
Statistical Analysis | 21 |
Test Items | 17 |
Scores | 14 |
Item Response Theory | 12 |
Comparative Analysis | 11 |
Correlation | 10 |
Test Validity | 10 |
College Entrance Examinations | 9 |
Test Reliability | 9 |
Psychometrics | 8 |
Source
ETS Research Report Series | 43 |
Author
Dorans, Neil J. | 8 |
Liu, Ou Lydia | 3 |
Sinharay, Sandip | 3 |
Fu, Jianbin | 2 |
Guo, Hongwen | 2 |
Kim, Sooyeon | 2 |
Middleton, Kyndra | 2 |
Olivera-Aguilar, Margarita | 2 |
Puhan, Gautam | 2 |
Steinberg, Jonathan | 2 |
Wang, Zhen | 2 |
Publication Type
Journal Articles | 43 |
Reports - Research | 40 |
Numerical/Quantitative Data | 2 |
Reports - Evaluative | 2 |
Tests/Questionnaires | 2 |
Reports - Descriptive | 1 |
Education Level
Higher Education | 11 |
Postsecondary Education | 10 |
Secondary Education | 6 |
Elementary Education | 5 |
Grade 8 | 4 |
High Schools | 4 |
Intermediate Grades | 3 |
Junior High Schools | 3 |
Middle Schools | 3 |
Grade 4 | 2 |
Grade 7 | 2 |
Location
United States | 2 |
China | 1 |
Delaware | 1 |
Illinois | 1 |
Maryland | 1 |
North Carolina | 1 |
Ohio | 1 |
Oregon | 1 |
Pennsylvania | 1 |
Washington | 1 |
Laws, Policies, & Programs
Rehabilitation Act 1973… | 1 |
Michael E. Walker; Margarita Olivera-Aguilar; Blair Lehman; Cara Laitusis; Danielle Guzman-Orth; Melissa Gholson – ETS Research Report Series, 2023
Recent criticisms of large-scale summative assessments have claimed that the assessments are biased against historically excluded groups because of the assessments' lack of cultural representation. Accompanying these criticisms is a call for more culturally responsive assessments--assessments that take into account the background characteristics…
Descriptors: Culturally Relevant Education, Measurement, Summative Evaluation, Student Evaluation
Kim, Sooyeon; Walker, Michael E. – ETS Research Report Series, 2021
Equating the scores from different forms of a test requires collecting data that link the forms. Problems arise when the test forms to be linked are given to groups that are not equivalent and the forms share no common items by which to measure or adjust for this group nonequivalence. We compared three approaches to adjusting for group…
Descriptors: Equated Scores, Weighted Scores, Sampling, Multiple Choice Tests
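The linking problem sketched in the abstract above can be made concrete with a short example. The code below only illustrates the generic idea of reweighting one group on a background covariate before an equipercentile link; it is not one of the three approaches the report compares, and all data, variable names, and the covariate are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data: two test forms given to nonequivalent groups,
# with no common items linking them (all values are made up).
x_scores = rng.binomial(60, 0.62, 2000)      # Form X group, 60-item form
y_scores = rng.binomial(60, 0.55, 2000)      # Form Y group, somewhat less able
x_strata = rng.integers(0, 3, 2000)          # background covariate, Form X group
y_strata = rng.integers(0, 3, 2000)          # same covariate, Form Y group

# Post-stratification weights: make the Form Y group's covariate mix
# match the Form X group's before linking the score distributions.
x_prop = np.bincount(x_strata, minlength=3) / len(x_strata)
y_prop = np.bincount(y_strata, minlength=3) / len(y_strata)
w_y = (x_prop / y_prop)[y_strata]

def weighted_rank(scores, weights, value):
    """Proportion of weight at or below a given score."""
    return weights[scores <= value].sum() / weights.sum()

def equate_y_to_x(y_value):
    """Map a Form Y score to the Form X score with the closest
    weighted percentile rank (a crude equipercentile link)."""
    p = weighted_rank(y_scores, w_y, y_value)
    x_ranks = np.array([weighted_rank(x_scores, np.ones(len(x_scores)), s)
                        for s in np.arange(61)])
    return int(np.argmin(np.abs(x_ranks - p)))

print(equate_y_to_x(35))   # Form X equivalent of a Form Y score of 35
```

Operational equating would use smoothed distributions and more defensible weighting variables; the point here is only that the group nonequivalence must be adjusted for before the two score distributions are matched.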
Jonathan Schmidgall; Yan Huo; Jaime Cid; Youhua Wei – ETS Research Report Series, 2024
The principle of fairness in testing traditionally involves an assertion about the absence of bias, or that measurement should be impartial (i.e., not provide an unfair advantage or disadvantage), across groups of test takers. In more general-purposes language testing, a test taker's background knowledge is not typically considered relevant to the…
Descriptors: Testing, Language Tests, Test Bias, English for Special Purposes
Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2019
We derive formulas for the differential item functioning (DIF) measures that two routinely used DIF statistics are designed to estimate. The DIF measures that match on observed scores are compared to DIF measures based on an unobserved ability (theta or true score) for items that are described by either the one-parameter logistic (1PL) or…
Descriptors: Scores, Test Bias, Statistical Analysis, Item Response Theory
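For readers unfamiliar with the model named in the abstract, the 1PL case can be stated briefly; the notation below is assumed rather than taken from the report. Under the 1PL, the item response function for group g and a theta-based DIF measure for an item are:

```latex
P_g(\theta) = \frac{1}{1 + \exp\{-(\theta - b_g)\}}, \qquad
\Delta b = b_F - b_R
```

Observed-score statistics such as MH D-DIF condition on total score as a proxy for the unobserved \theta, which is the contrast between matching variables that the abstract describes.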
Steven Holtzman; Jonathan Steinberg; Jonathan Weeks; Christopher Robertson; Jessica Findley; David Klieger – ETS Research Report Series, 2024
At a time when institutions of higher education are exploring alternatives to traditional admissions testing, institutions are also seeking to better support students and prepare them for academic success. Under such an engaged model, one may seek to measure not just the accumulated knowledge and skills that students would bring to a new academic…
Descriptors: Law Schools, College Applicants, Legal Education (Professions), College Entrance Examinations
Kane, Michael T.; Mroch, Andrew A. – ETS Research Report Series, 2020
Ordinary least squares (OLS) regression and orthogonal regression (OR) address different questions and make different assumptions about errors. The OLS regression of Y on X yields predictions of a dependent variable (Y) contingent on an independent variable (X) and minimizes the sum of squared errors of prediction. It assumes that the independent…
Descriptors: Regression (Statistics), Least Squares Statistics, Test Bias, Error of Measurement
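The distinction the abstract draws can be seen in a few lines of code. The sketch below is illustrative only (the simulated data and names are assumptions, and it does not reproduce the report's analyses): it fits the OLS line of Y on X and an orthogonal (total least squares) line to the same data and prints the two slopes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two error-prone measures of the same underlying trait (made-up data).
true_score = rng.normal(0, 1, 500)
x = true_score + rng.normal(0, 0.5, 500)
y = 0.8 * true_score + rng.normal(0, 0.5, 500)

# OLS of Y on X: minimizes vertical errors of prediction in Y,
# treating X as fixed and error-free.
x_c, y_c = x - x.mean(), y - y.mean()
ols_slope = (x_c @ y_c) / (x_c @ x_c)

# Orthogonal regression: minimizes perpendicular distances to the line,
# obtained from the first principal component of the centered data.
cov = np.cov(x, y)
eigvals, eigvecs = np.linalg.eigh(cov)
v = eigvecs[:, np.argmax(eigvals)]
or_slope = v[1] / v[0]

print(f"OLS slope:        {ols_slope:.3f}")
print(f"Orthogonal slope: {or_slope:.3f}")   # steeper, since X carries error too
```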
Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2019
The Mantel-Haenszel delta difference (MH D-DIF) and the standardized proportion difference (STD P-DIF) are two observed-score methods that have been used to assess differential item functioning (DIF) at Educational Testing Service since the early 1990s. Latent variable approaches to assessing measurement invariance at the item level have been…
Descriptors: Test Bias, Educational Testing, Statistical Analysis, Item Response Theory
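As a concrete reference point, the two observed-score statistics named in the abstract can be written down in a few lines. This is a minimal sketch, not ETS's operational implementation; the function and variable names are assumptions, `item` is a 0/1 item-score vector, `total` is the matching total score, and `group` codes reference (1) versus focal (0) membership.

```python
import numpy as np

def mh_d_dif(item, total, group):
    """Mantel-Haenszel delta DIF: -2.35 * ln(common odds ratio),
    pooled over matched total-score strata."""
    num = den = 0.0
    for s in np.unique(total):
        m = total == s
        r, f = m & (group == 1), m & (group == 0)
        n = m.sum()
        if r.sum() == 0 or f.sum() == 0:
            continue
        a = item[r].sum()        # reference correct
        b = r.sum() - a          # reference incorrect
        c = item[f].sum()        # focal correct
        d = f.sum() - c          # focal incorrect
        num += a * d / n
        den += b * c / n
    return -2.35 * np.log(num / den)

def std_p_dif(item, total, group):
    """Standardized P-DIF: focal-group-weighted mean difference in
    proportion correct (focal minus reference) across matched scores."""
    diffs, weights = [], []
    for s in np.unique(total):
        m = total == s
        r, f = m & (group == 1), m & (group == 0)
        if r.sum() == 0 or f.sum() == 0:
            continue
        diffs.append(item[f].mean() - item[r].mean())
        weights.append(f.sum())
    return np.average(diffs, weights=weights)
```

Operational use typically adds standard errors, smoothing, and flagging rules, all of which are omitted here.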
Schmidgall, Jonathan; Cid, Jaime; Carter Grissom, Elizabeth; Li, Lucy – ETS Research Report Series, 2021
The redesigned "TOEIC Bridge"® tests were designed to evaluate test takers' English listening, reading, speaking, and writing skills in the context of everyday adult life. In this paper, we summarize the initial validity argument that supports the use of test scores for the purpose of selection, placement, and evaluation of a test…
Descriptors: Language Tests, Second Language Learning, English (Second Language), Language Proficiency
Patrick Kyllonen; Amit Sevak; Teresa Ober; Ikkyu Choi; Jesse Sparks; Daniel Fishtein – ETS Research Report Series, 2024
Assessment refers to a broad array of approaches for measuring or evaluating a person's (or group of persons') skills, behaviors, dispositions, or other attributes. Assessments range from standardized tests used in admissions, employee selection, licensure examinations, and domestic and international large-scale assessments of cognitive and…
Descriptors: Assessment Literacy, Testing, Test Bias, Test Construction
Kim, Sooyeon; Robin, Frederic – ETS Research Report Series, 2017
In this study, we examined the potential impact of item misfit on the reported scores of an admission test from the subpopulation invariance perspective. The target population of the test consisted of 3 major subgroups from different geographic regions. We used the logistic regression function to estimate item parameters of the operational items…
Descriptors: Scores, Test Items, Test Bias, International Assessment
Deng, Weiling; Monfils, Lora – ETS Research Report Series, 2017
Using simulated data, this study examined the impact of different levels of stringency of the valid case inclusion criterion on item response theory (IRT)-based true score equating over 5 years in the context of K-12 assessment when growth in student achievement is expected. Findings indicate that the use of the most stringent inclusion criterion…
Descriptors: Item Response Theory, Equated Scores, True Scores, Educational Assessment
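For context on the equating method named in the abstract, the sketch below shows IRT true-score equating under a 2PL model once item parameters for two forms are on a common scale. The item parameters are made up for illustration, and the report's simulation design is not reproduced.

```python
import numpy as np
from scipy.optimize import brentq

# Made-up 2PL item parameters for two forms, assumed to be on a common scale.
a_x = np.array([1.0, 1.2, 0.8, 1.5]); b_x = np.array([-0.5, 0.0, 0.5, 1.0])
a_y = np.array([0.9, 1.1, 1.3, 0.7]); b_y = np.array([-0.8, 0.2, 0.6, 1.2])

def true_score(theta, a, b):
    """Expected number-correct score (test characteristic curve) at ability theta."""
    return np.sum(1.0 / (1.0 + np.exp(-a * (theta - b))))

def equate_true_score(y, a_y, b_y, a_x, b_x):
    """Map a Form Y true score to its Form X equivalent: invert Form Y's
    test characteristic curve for theta, then evaluate Form X's curve there."""
    theta = brentq(lambda t: true_score(t, a_y, b_y) - y, -6.0, 6.0)
    return true_score(theta, a_x, b_x)

for y in (1.0, 2.0, 3.0):
    print(y, "->", round(equate_true_score(y, a_y, b_y, a_x, b_x), 2))
```

The inclusion criterion the study varies determines which examinee records enter the calibration that produces such item parameters in the first place, which is how it can affect the resulting equating.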
Fu, Jianbin – ETS Research Report Series, 2016
The multidimensional item response theory (MIRT) models with covariates proposed by Haberman and implemented in the "mirt" program provide a flexible way to analyze data based on item response theory. In this report, we discuss applications of the MIRT models with covariates to longitudinal test data to measure skill differences at the…
Descriptors: Item Response Theory, Longitudinal Studies, Test Bias, Goodness of Fit
Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017
Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…
Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment
Liu, Ou Lydia; Mao, Liyang; Zhao, Tingting; Yang, Yi; Xu, Jun; Wang, Zhen – ETS Research Report Series, 2016
Chinese higher education is experiencing rapid development and growth. With tremendous resources invested in higher education, policy makers have requested more direct evidence of student learning. However, assessment tools that can be used to measure college-level learning are scarce in China. To mitigate this situation, we translated the…
Descriptors: Foreign Countries, Higher Education, Critical Thinking, College Students
Chubbuck, Kay; Curley, W. Edward; King, Teresa C. – ETS Research Report Series, 2016
This study gathered quantitative and qualitative evidence concerning gender differences in performance by using critical reading material on the "SAT"® test with sports and science content. The fundamental research questions guiding the study were: If sports and science are to be included in a skills test, what kinds of material are…
Descriptors: College Entrance Examinations, Gender Differences, Critical Reading, Reading Tests