Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 0
Since 2006 (last 20 years): 5
Source
Educational and Psychological…: 8
Journal of Educational…: 5
Journal of Experimental…: 3
Educational Assessment: 2
Applied Measurement in…: 1
College Board: 1
Online Submission: 1
Psychology of Women Quarterly: 1
Publication Type
Reports - Research: 29
Journal Articles: 20
Speeches/Meeting Papers: 14
Reports - Evaluative: 8
Non-Print Media: 1
Reference Materials - General: 1
Reports - Descriptive: 1
Education Level
High Schools: 2
Higher Education: 2
Postsecondary Education: 2
Elementary Education: 1
Grade 8: 1
Secondary Education: 1
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
ACT Assessment | 3 |
Iowa Tests of Basic Skills | 3 |
State Trait Anxiety Inventory | 3 |
Mathematics Anxiety Rating… | 2 |
Advanced Placement… | 1 |
Kaliski, Pamela K.; Wind, Stefanie A.; Engelhard, George, Jr.; Morgan, Deanna L.; Plake, Barbara S.; Reshetar, Rosemary A. – Educational and Psychological Measurement, 2013
The many-faceted Rasch (MFR) model has been used to evaluate the quality of ratings on constructed response assessments; however, it can also be used to evaluate the quality of judgments from panel-based standard setting procedures. The current study illustrates the use of the MFR model for examining the quality of ratings obtained from a standard…
Descriptors: Item Response Theory, Models, Standard Setting (Scoring), Science Tests
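For context on the model named in the entry above: one common rating-scale formulation of the many-faceted Rasch model (a general textbook form, not necessarily the exact parameterization used by Kaliski et al.) writes the log-odds of adjacent rating categories as an additive function of the facets:

\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = \theta_n - \delta_i - \lambda_j - \tau_k

where \theta_n is the parameter for element n of the object-of-measurement facet, \delta_i is item difficulty, \lambda_j is rater (or panelist) severity, and \tau_k is the threshold between rating categories k-1 and k. In applications like the one above, the facet estimates and their fit statistics are typically what is examined to judge rating quality.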
Plake, Barbara S.; Huff, Kristen; Reshetar, Rosemary – College Board, 2009
[Slides] presented at the Annual Meeting of the National Council on Measurement in Education (NCME) in San Diego, CA, in April 2009. This presentation discusses a methodology for directly connecting evidence-centered assessment design (ECD) to score interpretation and use through the development of achievement level descriptors.
Descriptors: Achievement, Classification, Evidence, Test Construction
Chang, Shu-Ren; Plake, Barbara S.; Kramer, Gene A.; Lien, Shu-Mei – Educational and Psychological Measurement, 2011
This study examined the amount of time that different ability-level examinees spend on questions they answer correctly or incorrectly across different pretest item blocks presented on a fixed-length, time-restricted computerized adaptive test (CAT). Results indicate that different ability-level examinees require different amounts of time to…
Descriptors: Evidence, Test Items, Reaction Time, Adaptive Testing
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2007
In an Angoff standard setting procedure, judges estimate the probability that a hypothetical randomly selected minimally competent candidate will answer correctly each item in the test. In many cases, these item performance estimates are made twice, with information shared with the panelists between estimates. Especially for long tests, this…
Descriptors: Test Items, Probability, Item Analysis, Standard Setting (Scoring)
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2005
In an Angoff standard-setting procedure, judges estimate the probability that a hypothetical randomly selected minimally competent candidate will answer correctly each item constituting the test. In many cases, these item performance estimates are made twice, with information shared with the judges between estimates. Especially for long tests,…
Descriptors: Test Items, Probability, Standard Setting (Scoring)

Plake, Barbara S.; Impara, James C.; Irwin, Patrick M. – Journal of Educational Measurement, 2000
Examined intra- and inter-rater consistency of item performance estimates from an Angoff standard setting over 2 years, with 29 panelists one year and 30 the next. Results provide evidence that item performance estimates were consistent within and across panels within and across years. Factors that might have influenced this high degree of…
Descriptors: Evaluators, Prediction, Reliability, Standard Setting

Plake, Barbara S. – Journal of Experimental Education, 1980
Three item orderings and two levels of knowledge of ordering were used to study differences in test results, students' perceptions of the test's fairness and difficulty, and students' estimation of test performance. No significant order effect was found. (Author/GK)
Descriptors: Difficulty Level, Higher Education, Scores, Test Format

Plake, Barbara S.; Impara, James C. – Educational Assessment, 2001
Examined the reliability and accuracy of item performance estimates from an Angoff standard setting application with 29 panelists in one year and 30 in the next. Results provide evidence that item performance estimates were both reasonable and reliable. Discusses factors that might have influenced the results. (SLD)
Descriptors: Estimation (Mathematics), Evaluators, Performance Factors, Reliability
Plake, Barbara S.; Giraud, Gerald – 1998
In the traditional Angoff Standard Setting Method, experts are instructed to predict the probability that a randomly selected, hypothetical minimally competent candidate will be able to answer each multiple-choice question in the test correctly. These item performance estimates are averaged across panelists and aggregated to determine the minimum…
Descriptors: Estimation (Mathematics), Evaluators, Performance Factors, Standard Setting (Scoring)
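To make the aggregation step described in the entry above concrete, here is a minimal sketch in Python (hypothetical data and function name, not code from the cited study): item probability estimates are averaged across panelists, and the item averages are summed to give a minimum passing score on the raw-score scale.

# Minimal sketch of the traditional Angoff aggregation (hypothetical data).
# ratings[p][i] = panelist p's estimated probability that a minimally
# competent candidate answers item i correctly.
def angoff_cut_score(ratings):
    n_panelists = len(ratings)
    n_items = len(ratings[0])
    item_means = [sum(r[i] for r in ratings) / n_panelists for i in range(n_items)]
    # Summing the panel-mean item probabilities gives the expected raw score
    # of the borderline candidate, used as the minimum passing score.
    return sum(item_means)

ratings = [
    [0.70, 0.55, 0.80, 0.60],  # panelist 1
    [0.65, 0.50, 0.75, 0.70],  # panelist 2
    [0.75, 0.60, 0.85, 0.65],  # panelist 3
]
print(round(angoff_cut_score(ratings), 2))  # 2.7 on a 4-item test

Averaging each panelist's item-total instead and then averaging across panelists yields the same value, so the two common ways of collapsing the panelist-by-item table agree.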
Ferdous, Abdullah A.; Plake, Barbara S.; Chang, Shu-Ren – Educational Assessment, 2007
The purpose of this study was to examine the effect of pretest items on response time in an operational, fixed-length, time-limited computerized adaptive test (CAT). These pretest items are embedded within the CAT, but unlike the operational items, are not tailored to the examinee's ability level. If examinees with higher ability levels need less…
Descriptors: Pretests Posttests, Reaction Time, Computer Assisted Testing, Test Items
De Ayala, R. J.; Plake, Barbara S.; Impara, James C.; Kozmicky, Michelle – 2000
This study investigated the effect on examinees' ability estimate under item response theory (IRT) when they are presented an item, have ample time to answer the item, but decide not to respond to the item. Simulation data were modeled on an empirical data set of 25,546 examinees that was calibrated using the 3-parameter logistic model. The study…
Descriptors: Ability, Estimation (Mathematics), Item Response Theory, Maximum Likelihood Statistics
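As background for the abstract above, the three-parameter logistic (3PL) item response function mentioned in the calibration has a standard form; the sketch below (a standard textbook formulation with the usual scaling constant, not taken from the study) shows how a correct-response probability is computed from an ability value and item parameters.

import math

def p_correct_3pl(theta, a, b, c, D=1.7):
    # 3PL item response function: probability that an examinee with ability
    # theta answers correctly an item with discrimination a, difficulty b,
    # and pseudo-guessing (lower-asymptote) parameter c.
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

# Example: an average-difficulty item (b = 0) with a guessing floor of 0.2.
print(round(p_correct_3pl(theta=1.0, a=1.2, b=0.0, c=0.2), 3))  # about 0.908

Whether an omitted item is excluded from the likelihood or treated as incorrect changes which probability terms enter the maximum likelihood estimate, which is why the handling of non-responses matters for the ability estimates examined in the study above.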
Plake, Barbara S.; Hoover, H. D. – 1978
A method of investigating for possible bias in test items is proposed that uses analysis of variance for item data based on groups that have been selected to have identical test score distributions. The item data used are arcsin transformations of item difficulties. The methodological procedure has the following advantages: (1) The arcsin…
Descriptors: Achievement Tests, Analysis of Variance, Difficulty Level, Item Analysis
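The arcsine transformation of item difficulties mentioned above is a standard variance-stabilizing transform for proportions; a minimal sketch of a common variant follows (the exact form used in the paper may differ in detail).

import math

def arcsine_transform(p):
    # Variance-stabilizing transform of an item difficulty expressed as a
    # proportion correct (0 <= p <= 1); after transformation the variance is
    # roughly constant across difficulty levels, which suits ANOVA on item data.
    return 2.0 * math.asin(math.sqrt(p))

# Example: the same item's transformed difficulty in two comparison groups.
print(round(arcsine_transform(0.65), 3), round(arcsine_transform(0.72), 3))

Group differences in the transformed values can then be examined with the items-by-groups analysis of variance design the entry describes.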

Plake, Barbara S.; And Others – Educational and Psychological Measurement, 1983
The purpose of this study was to investigate further the effect of differential item performance by males and females on tests which have different item arrangements. The study allows for a more accurate evaluation of whether differential sensitivity to reinforcement strategies is a factor in performance discrepancies for males and females.…
Descriptors: Feedback, Higher Education, Performance Factors, Quantitative Tests
Irwin, Patrick M.; Plake, Barbara S.; Impara, James C. – 2000
Judgmental standard setting methods, such as the W. H. Angoff (1971) method, use item performance estimates as the basis for determining the minimum passing score (MPS). Therefore, the accuracy of these item performance estimates is crucial to the validity of the resulting MPS. Recent researchers (L. A. Shepard, 1994; J. Impara, 1997) have called…
Descriptors: Estimation (Mathematics), Judges, Licensing Examinations (Professions), Performance Factors

Plake, Barbara S.; Hoover, H. D. – Journal of Experimental Education, 1979
A follow-up technique is needed to identify items contributing to items-by-groups interaction when using an ANOVA procedure to examine a test for biased items. The method described includes distribution theory for assessing level of significance and is sensitive to items at all difficulty levels. (Author/GSK)
Descriptors: Analysis of Variance, Goodness of Fit, Item Analysis, Statistical Bias