Morgan, Deanna L. – National Center for Postsecondary Research, 2010
Cut scores are used in a variety of circumstances to aid in decision making through the establishment of a clear cut line between adjacent categories. Community colleges regularly use cut scores on placement tests to decide the appropriate course for each beginning student: the first college-level course or a developmental course, depending on…
Descriptors: Standard Setting (Scoring), Cutting Scores, Psychometrics, Best Practices
Wheadon, Christopher; Beguin, Anton – Assessment in Education: Principles, Policy & Practice, 2010
Tiering is a multi-stage test design whereby teachers allocate students to a particular difficulty level (tier) of a test. This approach to the challenge of delivering assessments to students with a heterogeneous ability distribution is normal practice in UK public examinations at the age of 16. This study uses Item Response Theory number-correct…
Descriptors: Difficulty Level, Item Response Theory, Achievement Tests, Standard Setting (Scoring)
Clauser, Brian E.; Harik, Polina; Margolis, Melissa J.; McManus, I. C.; Mollon, Jennifer; Chis, Liliana; Williams, Simon – Applied Measurement in Education, 2009
Numerous studies have compared the Angoff standard-setting procedure to other standard-setting methods, but relatively few studies have evaluated the procedure based on internal criteria. This study uses a generalizability theory framework to evaluate the stability of the estimated cut score. To provide a measure of internal consistency, this…
Descriptors: Generalizability Theory, Group Discussion, Standard Setting (Scoring), Scoring
Novakovic, Nadezda – International Journal of Training Research, 2008
The Angoff method is a widely used procedure for setting pass scores in vocational examinations, in which the awarders estimate the performance of minimally competent candidates (MCCs) on each test item. Within the context of some UK vocational examinations, the procedure consists of two stages: after making the first round of estimates, awarders…
Descriptors: Standard Setting (Scoring), Discussion, Statistical Data, Occupational Tests
Florez, Ida Rose – Civil Rights Project / Proyecto Derechos Civiles, 2010
The Arizona English Language Learners Assessment (AZELLA) is used by the Arizona Department of Education to determine which children should receive English support services. AZELLA results are used to determine if children are either proficient in English or have English language skills in one of four pre-proficient categories (pre-emergent,…
Descriptors: Validity, Second Language Learning, Cutting Scores, Kindergarten
Judd, Wallace – Practical Assessment, Research & Evaluation, 2009
Over the past twenty years in performance testing a specific item type with distinguishing characteristics has arisen time and time again. It's been invented independently by dozens of test development teams. And yet this item type is not recognized in the research literature. This article is an invitation to investigate the item type, evaluate…
Descriptors: Test Items, Test Format, Evaluation, Item Analysis
Dorans, Neil J.; Liang, Longjuan; Puhan, Gautam – Educational Testing Service, 2010
Scores are the most visible and widely used products of a testing program. The choice of score scale has implications for test specifications, equating, and test reliability and validity, as well as for test interpretation. At the same time, the score scale should be viewed as infrastructure likely to require repair at some point. In this report…
Descriptors: Testing Programs, Standard Setting (Scoring), Test Interpretation, Certification
Klenowski, Val; Wyatt-Smith, Claire – Australian Educational Researcher, 2010
While externally moderated standards-based assessment has been practised in Queensland senior schooling for more than three decades, there has been no such practice in the middle years. With the introduction of standards at state and national levels in these years, teacher judgement as developed in moderation practices is now vital. This paper…
Descriptors: Student Evaluation, Educational Change, Foreign Countries, Standard Setting (Scoring)
Egan, Karla L.; Ferrara, Steve; Schneider, M. Christina; Barton, Karen E. – Peabody Journal of Education, 2009
Alternate assessments of modified academic achievement standards (AA-MAS) must be designed, developed, implemented, and validated following the same rigorous principles and procedures used for other assessments. However, the uniqueness and unfamiliarity of the target population for these assessments requires innovative thinking, especially in…
Descriptors: Standard Setting (Scoring), Design Requirements, Academic Accommodations (Disabilities), Testing Accommodations
MacCann, Robert G. – Educational and Psychological Measurement, 2008
It is shown that the Angoff and bookmarking cut scores are examples of true score equating that in the real world must be applied to observed scores. In the context of defining minimal competency, the percentage "failed" by such methods is a function of the length of the measuring instrument. It is argued that this length is largely…
Descriptors: True Scores, Cutting Scores, Minimum Competencies, Scores
Bechger, Timo M.; Kuijper, Henk; Maris, Gunter – Language Assessment Quarterly, 2009
This article reports on two related studies carried out to link the State examination of Dutch as a second language to the Common European Framework of Reference for languages (CEFR). In the first study, key persons from institutions for higher education were asked to determine the minimally required language level of beginning students. In the…
Descriptors: Second Language Learning, Standard Setting (Scoring), Indo European Languages, Guidelines
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2008
Even when the scoring of an examination is based on item response theory (IRT), standard-setting methods seldom use this information directly when determining the minimum passing score (MPS) for an examination from an Angoff-based standard-setting study. Often, when IRT scoring is used, the MPS value for a test is converted to an IRT-based theta…
Descriptors: Standard Setting (Scoring), Scoring, Cutting Scores, Item Response Theory
Fowell, S. L.; Fewtrell, R.; McLaughlin, P. J. – Advances in Health Sciences Education, 2008
Absolute standard setting procedures are recommended for assessment in medical education. Absolute, test-centred standard setting procedures were introduced for written assessments in the Liverpool MBChB in 2001. The modified Angoff and Ebel methods have been used for short answer question-based and extended matching question-based papers,…
Descriptors: Medical Education, Standard Setting (Scoring), Judges, Interrater Reliability
Bowden, Stephen C.; Weiss, Lawrence G.; Holdnack, James A.; Bardenhagen, Fiona J.; Cook, Mark J. – Assessment, 2008
A psychological measurement model provides an explicit definition of (a) the theoretical and (b) the numerical relationships between observed scores and the latent variables that underlie the observed scores. Examination of the metric invariance of a measurement model involves testing the hypothesis that all components of the model relating…
Descriptors: Measurement Techniques, Foreign Countries, Cognitive Ability, Scores
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2005
In an Angoff standard-setting procedure, judges estimate the probability that a hypothetical randomly selected minimally competent candidate will answer correctly each item constituting the test. In many cases, these item performance estimates are made twice, with information shared with the judges between estimates. Especially for long tests,…
Descriptors: Test Items, Probability, Standard Setting (Scoring)

