Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2007
In an Angoff standard setting procedure, judges estimate the probability that a hypothetical, randomly selected minimally competent candidate will answer each item in the test correctly. In many cases, these item performance estimates are made twice, with information shared with the panelists between estimates. Especially for long tests, this…
Descriptors: Test Items, Probability, Item Analysis, Standard Setting (Scoring)

Plake, Barbara S.; Hoover, H. D. – 1978
A method of investigating possible bias in test items is proposed that uses analysis of variance for item data based on groups selected to have identical test score distributions. The item data used are arcsin transformations of item difficulties. The methodological procedure has the following advantages: (1) The arcsin…
Descriptors: Achievement Tests, Analysis of Variance, Difficulty Level, Item Analysis

Plake, Barbara S.; Hoover, H. D. – Journal of Experimental Education, 1979
A follow-up technique is needed to identify items contributing to items-by-groups interaction when using an ANOVA procedure to examine a test for biased items. The method described includes distribution theory for assessing level of significance and is sensitive to items at all difficulty levels. (Author/GSK)
Descriptors: Analysis of Variance, Goodness of Fit, Item Analysis, Statistical Bias

Plake, Barbara S.; Huntley, Renee M. – Educational and Psychological Measurement, 1984
Two studies examined the effect of making the correct answer of a multiple-choice test item grammatically consistent with the item. American College Testing Assessment experimental items were constructed to investigate grammatical compliance for plural-singular and vowel-consonant agreement. Results suggest…
Descriptors: Grammar, Higher Education, Item Analysis, Multiple Choice Tests

Plake, Barbara S.; And Others – 1989
The accuracy of standards obtained from judgmental methods is dependent on the quality of the judgments made by experts throughout the standard setting process. One important dimension of the quality of these judgments is the consistency of the judges' perceptions with item performance of minimally competent candidates. Several interrelated…
Descriptors: Cutting Scores, Evaluation Methods, Evaluative Thinking, Evaluators

Plake, Barbara S.; And Others – 1978
Three levels of the Iowa Tests of Basic Skills were studied to disclose the possible existence of sex bias in mathematics test items. Two mathematics tests (mathematical concepts and mathematics problem solving) and two comparison verbal tests (vocabulary and reading) were selected for analysis at three levels--grades 3, 6, and 8. Samples of 480…
Descriptors: Achievement Tests, Content Analysis, Elementary Education, Elementary School Mathematics

Plake, Barbara S. – Educational and Psychological Measurement, 1980
Analysis of variance and subjective rating by curriculum specialists were used to identify biased items on the Iowa Tests of Basic Skills. Results show little agreement between statistical and subjective methods. Test developers should statistically support a reviewer's selection of biased items. (Author/CP)
Descriptors: Achievement Tests, Analysis of Variance, Elementary Education, Evaluation Methods

Plake, Barbara S.; And Others – Journal of Educational Measurement, 1994
The comparability of Angoff-based item ratings on a general education test battery made by judges from within-content and across-content domains was studied. Results with 26 college faculty judges indicate that, at least for some tests, item ratings might be essentially equivalent regardless of judge's content specialty. (SLD)
Descriptors: College Faculty, Comparative Analysis, General Education, Higher Education

Plake, Barbara S.; Hoover, H. D. – 1977
To determine if equal raw scores have the same "meaning" for students tested "in" and "out-of-level," four "out-of-level" grade groups who took the Vocabulary, Reading, and Mathematics Concepts subtests of the Iowa Tests of Basic Skills typically administered to fifth graders were compared to "in…
Descriptors: Achievement Tests, Analysis of Variance, Elementary Education, Grade 4

Plake, Barbara S.; Hoover, H. D. – Journal of Educational Measurement, 1979
An experiment investigated the extent to which the results of out-of-level testing may be biased because a child given an out-of-level test may have had a significantly different curriculum than children given in-level tests. Item analysis data suggested this was unlikely. (CTM)
Descriptors: Achievement Tests, Elementary Education, Elementary School Curriculum, Grade Equivalent Scores

Plake, Barbara S.; Melican, Gerald J. – Educational and Psychological Measurement, 1989
The impact of overall test length and difficulty on expert judgments of item performance using the Nedelsky method was studied. Five university-level instructors predicting the performance of minimally competent candidates on a mathematics examination were fairly consistent in their assessments regardless of the length or difficulty of the test.…
Descriptors: Difficulty Level, Estimation (Mathematics), Evaluators, Higher Education

Plake, Barbara S.; Wise, Steven L. – 1986
One question regarding the utility of adaptive testing is the effect of individualized item arrangements on examinee test scores. The purpose of this study was to analyze the item difficulty choices by examinees as a function of previous item performance. The examination was a 25-item test of basic algebra skills given to 36 students in an…
Descriptors: Adaptive Testing, Algebra, College Students, Computer Assisted Testing