Publication Date
| In 2026 | 0 |
| Since 2025 | 38 |
| Since 2022 (last 5 years) | 225 |
| Since 2017 (last 10 years) | 570 |
| Since 2007 (last 20 years) | 1377 |
Audience
| Researchers | 110 |
| Practitioners | 107 |
| Teachers | 46 |
| Administrators | 25 |
| Policymakers | 24 |
| Counselors | 12 |
| Parents | 7 |
| Students | 7 |
| Support Staff | 4 |
| Community | 2 |
Location
| California | 61 |
| Canada | 60 |
| United States | 57 |
| Turkey | 47 |
| Australia | 43 |
| Florida | 34 |
| Germany | 26 |
| Texas | 26 |
| China | 25 |
| Netherlands | 25 |
| Iran | 22 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
| Does not meet standards | 1 |
Arendasy, Martin E.; Sommer, Markus – Intelligence, 2012
There is a heated debate on whether observed gender differences in some figural matrices in adults can be attributed to gender differences in inductive reasoning/Gf or differential item functioning and/or test bias. Based on previous studies we hypothesized that three specific item design features moderate the effect size of the gender…
Descriptors: Test Items, Item Response Theory, Males, Test Bias
Liu, Qian – ProQuest LLC, 2011
For this dissertation, four item purification procedures were implemented onto the generalized linear mixed model for differential item functioning (DIF) analysis, and the performance of these item purification procedures was investigated through a series of simulations. Among the four procedures, forward and generalized linear mixed model (GLMM)…
Descriptors: Test Bias, Test Items, Statistical Analysis, Models
Park, Sangwook – ProQuest LLC, 2011
Many studies have been conducted to evaluate the performance of DIF detection methods, when two groups have different ability distributions. Such studies typically have demonstrated factors that are associated with inflation of Type I error rates in DIF detection, such as mean ability differences. However, no study has examined how the direction…
Descriptors: Test Bias, Regression (Statistics), Sample Size, Simulation
Elosua, Paula – Psicologica: International Journal of Methodology and Experimental Psychology, 2011
Assessing measurement equivalence in the framework of the common factor linear models (CFL) is known as factorial invariance. This methodology is used to evaluate the equivalence among the parameters of a measurement model among different groups. However, when dichotomous, Likert, or ordered responses are used, one of the assumptions of the CFL is…
Descriptors: Measurement, Models, Data, Factor Analysis
DeMars, Christine E.; Lau, Abigail – Educational and Psychological Measurement, 2011
There is a long history of differential item functioning (DIF) detection methods for known, manifest grouping variables, such as sex or ethnicity. But if the experiences or cognitive processes leading to DIF are not perfectly correlated with the manifest groups, it would be more informative to uncover the latent groups underlying DIF. The use of…
Descriptors: Test Bias, Accuracy, Item Response Theory, Models
Woods, Carol M. – Applied Psychological Measurement, 2011
Differential item functioning (DIF) occurs when an item on a test, questionnaire, or interview has different measurement properties for one group of people versus another. One way to test items with ordinal response scales for DIF is likelihood ratio (LR) testing using item response theory (IRT), or IRT-LR-DIF. Despite the various advantages of…
Descriptors: Test Bias, Test Items, Item Response Theory, Nonparametric Statistics
Attali, Yigal – Educational and Psychological Measurement, 2011
Contrary to previous research on sequential ratings of student performance, this study found that professional essay raters of a large-scale standardized testing program produced ratings that were drawn toward previous ratings, creating an assimilation effect. Longer intervals between the two adjacent ratings and higher degree of agreement with…
Descriptors: Essay Tests, Standardized Tests, Sequential Approach, Test Bias
Alweis, Richard L.; Fitzpatrick, Caroline; Donato, Anthony A. – Journal of Education and Training Studies, 2015
Introduction: The Multiple Mini-Interview (MMI) format appears to mitigate individual rater biases. However, the format itself may introduce structural systematic bias, favoring extroverted personality types. This study aimed to gain a better understanding of these biases from the perspective of the interviewer. Methods: A sample of MMI…
Descriptors: Interviews, Interrater Reliability, Qualitative Research, Semi Structured Interviews
Blömeke, Sigrid; Suhl, Ute; Döhrmann, Martina – International Journal of Science and Mathematics Education, 2013
The "Teacher Education and Development Study in Mathematics" assessed the knowledge of primary and lower-secondary teachers at the end of their training. The large-scale assessment represented the common denominator of what constitutes mathematics content knowledge and mathematics pedagogical content knowledge in the 16 participating…
Descriptors: Elementary School Teachers, Secondary School Teachers, Preservice Teachers, Mathematics Education
Albano, Anthony D. – Journal of Educational Measurement, 2013
In many testing programs it is assumed that the context or position in which an item is administered does not have a differential effect on examinee responses to the item. Violations of this assumption may bias item response theory estimates of item and person parameters. This study examines the potentially biasing effects of item position. A…
Descriptors: Test Items, Item Response Theory, Test Format, Questioning Techniques
Coniam, David; Falvey, Peter – Language Testing, 2013
The "Language Proficiency Assessment for Teachers of English" (LPATE) is a test of standards of English language ability for Hong Kong primary and secondary school teachers of English. The impetus for the creation of the LPATE arose, in 1996, because of concerns in business and education communities over falling English language…
Descriptors: English Teachers, Elementary School Teachers, Secondary School Teachers, Language Tests
Gandy, Sandra E. – Reading & Writing Quarterly, 2013
With the increasing amount of testing taking place in classrooms, teachers may question how appropriate those assessments are for the growing numbers of English language learners (ELLs) in the United States. One of the assessment options for classroom teachers is the informal reading inventory (IRI), which is the most frequently used assessment…
Descriptors: Informal Reading Inventories, English Language Learners, Student Evaluation, Standardized Tests
Hinds, Drew Samuel Wayne – ProQuest LLC, 2013
Alternative high schools serve some of the most vulnerable students and their programs present a significant challenge to evaluate. Determining the impact of an alternative high school that serves mostly at-risk students presented a significant research problem. Few studies exist that dig deeper into the characteristics and strategies of…
Descriptors: Nontraditional Education, High School Students, Program Effectiveness, At Risk Students
Partnership for Assessment of Readiness for College and Careers, 2017
The Partnership for Assessment of Readiness for College and Careers (PARCC) is a state-led consortium designed to create next-generation assessments that, compared to traditional K-12 assessments, more accurately measure student progress toward college and career readiness. The PARCC assessments are aligned to the Common Core State Standards…
Descriptors: College Readiness, Career Readiness, Common Core State Standards, Language Arts
Plassmann, Sibylle; Zeidler, Beate – Language Learning in Higher Education, 2014
Language testing means taking decisions: about the test taker's results, but also about the test construct and the measures taken in order to ensure quality. This article takes the German test "telc Deutsch C1 Hochschule" as an example to illustrate this decision-making process in an academic context. The test is used for university…
Descriptors: Language Tests, Test Wiseness, Test Construction, Decision Making

