Attali, Yigal; Saldivia, Luis; Jackson, Carol; Schuppan, Fred; Wanamaker, Wilbur – ETS Research Report Series, 2014
Previous investigations of the ability of content experts and test developers to estimate item difficulty have, for the most part, produced disappointing results. These investigations were based on a noncomparative method of independently rating the difficulty of items. In this article, we argue that, by eliciting comparative judgments of…
Descriptors: Test Items, Difficulty Level, Comparative Analysis, College Entrance Examinations
Dorans, Neil J. – Educational Measurement: Issues and Practice, 2012
Views on testing--its purpose and uses and how its data are analyzed--are related to one's perspective on test takers. Test takers can be viewed as learners, examinees, or contestants. I briefly discuss the perspective of test takers as learners. I maintain that much of psychometrics views test takers as examinees. I discuss test takers as a…
Descriptors: Testing, Test Theory, Item Response Theory, Test Reliability
College Board, 2010
This is the College Board's response to a research article by Drs. Maria Veronica Santelices and Mark Wilson in the Harvard Educational Review, entitled "Unfair Treatment? The Case of Freedle, the SAT, and the Standardization Approach to Differential Item Functioning" (see EJ930622).
Descriptors: Test Bias, College Entrance Examinations, Standardized Tests, Test Items
Wise, Steven L.; Pastor, Dena A.; Kong, Xiaojing J. – Applied Measurement in Education, 2009
Previous research has shown that rapid-guessing behavior can degrade the validity of test scores from low-stakes proficiency tests. This study examined, using hierarchical generalized linear modeling, examinee and item characteristics for predicting rapid-guessing behavior. Several item characteristics were found significant; items with more text…
Descriptors: Guessing (Tests), Achievement Tests, Correlation, Test Items
Adedoyin, O. O. – Educational Research and Reviews, 2010
This quantitative study attempted to detect gender-biased test items in the mathematics portion of the Botswana Junior Certificate Examination. To detect gender-biased items, a random sample of 4,000 students' responses to mathematics Paper 1 of the Botswana Junior Certificate examination was drawn from the 36,000 students who sat for…
Descriptors: Test Items, Foreign Countries, Statistical Analysis, Gender Bias
Gierl, Mark J.; Cui, Ying; Zhou, Jiawen – Journal of Educational Measurement, 2009
The attribute hierarchy method (AHM) is a psychometric procedure for classifying examinees' test item responses into a set of structured attribute patterns associated with different components from a cognitive model of task performance. Results from an AHM analysis yield information on examinees' cognitive strengths and weaknesses. Hence, the AHM…
Descriptors: Test Items, True Scores, Psychometrics, Algebra
Dorans, Neil J.; Lawrence, Ida M. – 1988
A procedure for checking the score equivalence of nearly identical editions of a test is described. The procedure employs the standard error of equating (SEE) and utilizes graphical representation of score conversion deviation from the identity function in standard error units. Two illustrations of the procedure involving Scholastic Aptitude Test…
Descriptors: Equated Scores, Error of Measurement, Test Construction, Test Format
Lawrence, Ida M. – 1995
This study examined to what extent, if any, estimates of reliability for a multiple choice test are affected by the presence of large item sets where each set shares common reading material. The purpose of this research was to assess the effect of local item dependence on estimates of reliability for verbal portions of seven forms of the old and…
Descriptors: Estimation (Mathematics), High Schools, Multiple Choice Tests, Reading Tests

Wainer, Howard; And Others – Journal of Educational Measurement, 1991
A testlet is an integrated group of test items presented as a unit. The concept of testlet differential item functioning (testlet DIF) is defined, and a statistical method is presented to detect testlet DIF. Data from a testlet-based experimental version of the Scholastic Aptitude Test illustrate the methodology. (SLD)
Descriptors: College Entrance Examinations, Definitions, Graphs, Item Bias
Pomplun, Mark; And Others – 1992
This study evaluated the use of bivariate matching as a solution to the problem of studying differential item functioning (DIF) with formula scored tests. Using Scholastic Aptitude Test verbal data with large samples, both male/female and black/white group comparisons were investigated. Mantel-Haenszel (MH) delta-(D) DIF values and DIF category…
Descriptors: Blacks, Criteria, Females, Item Bias
Wainer, Howard; And Others – 1991
It is sometimes sensible to think of the fundamental unit of test construction as being larger than an individual item. This unit, dubbed the testlet, must pass muster in the same way that items do. One criterion of a good item is the absence of differential item functioning (DIF). The item must function in the same way as all important…
Descriptors: Definitions, Identification, Item Bias, Item Response Theory

Green, Bert F.; And Others – Journal of Educational Measurement, 1989
A method of analyzing test item responses is advocated to examine differential item functioning through distractor choices of those answering an item incorrectly. The analysis uses log-linear models of a three-way contingency table, and is illustrated in an analysis of the verbal portion of the Scholastic Aptitude Test. (TJH)
Descriptors: College Entrance Examinations, Distractors (Tests), Evaluation Methods, High School Students
Freedle, Roy; Kostin, Irene – 1992
This study examines the predictability of Graduate Record Examinations (GRE) reading item difficulty (equated delta) for the three major reading item types: main idea, inference, and explicit statement items. Each item type is analyzed separately, using 110 GRE reading passages and their associated 244 reading items; selective analyses of 285…
Descriptors: College Entrance Examinations, Correlation, Difficulty Level, Higher Education
Eignor, Daniel R. – 1993
Procedures used to establish the comparability of scores derived from the College Board Admissions Testing Program (ATP) computer adaptive Scholastic Aptitude Test (SAT) prototype and the paper-and-pencil SAT are described in this report. Both the prototype, which is made up of Verbal and Mathematics computer adaptive tests (CATs), and a form of…
Descriptors: Adaptive Testing, College Entrance Examinations, Comparative Analysis, Computer Assisted Testing
Allen, Nancy L.; Wainer, Howard – 1989
The accuracy of procedures that are used to compare the performance of different groups of examinees on test items obviously depends on the correct classification of members in each examinee group. The significance of this dependence is determined by the sensitivity of the statistical procedure and the proportion of examinees who are unidentified…
Descriptors: Blacks, Comparative Analysis, Ethnicity, Identification