Stewart, Gail; Strachan, Andrea – TESL Canada Journal, 2022
Since its implementation in 2004, the Canadian English Language Benchmark Assessment for Nurses (CELBAN) has been accepted as evidence of language ability for licensure of internationally educated nurses (IENs) in Canada. This article focuses on the complexities of sustaining an occupation-specific assessment over time. The authors reference the…
Descriptors: Language Tests, English for Special Purposes, Benchmarking, Nurses
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2018
The choice of anchor tests is crucial in applications of the nonequivalent groups with anchor test design of equating. Sinharay and Holland (2006, 2007) suggested "miditests," which are anchor tests that are content-representative and have the same mean item difficulty as the total test but have a smaller spread of item difficulties.…
Descriptors: Test Content, Difficulty Level, Test Items, Test Construction
College Board, 2023
Over the past several years, content experts, psychometricians, and researchers have been hard at work developing, refining, and studying the digital SAT. The work is grounded in foundational best practices and advances in measurement and assessment design, with fairness for students informing all of the work done. This paper shares learnings from…
Descriptors: College Entrance Examinations, Psychometrics, Computer Assisted Testing, Best Practices
Cabrera, Nolan L.; Cabrera, George A. – Educational Horizons, 2011
Just like all the high-stakes tests that determine students' futures nowadays, The Chorizo Test is a standardized test rooted in the culture of the test makers. It was originally created to be used with students in teacher training programs to sensitize them to the pitfalls inherent in standardized pencil-and-paper tests, such as linguistic bias…
Descriptors: Test Use, Standardized Tests, Social Sciences, High Stakes Tests
National Council on Measurement in Education, 2012
Testing and data integrity on statewide assessments is defined as the establishment of a comprehensive set of policies and procedures for: (1) the proper preparation of students; (2) the management and administration of the test(s) that will lead to accurate and appropriate reporting of assessment results; and (3) maintaining the security of…
Descriptors: State Programs, Integrity, Testing, Test Preparation
Herman, Joan L.; Osmundson, Ellen; Dietel, Ronald – Assessment and Accountability Comprehensive Center, 2010
This report describes the purposes of benchmark assessments and provides recommendations for selecting and using benchmark assessments--addressing validity, alignment, reliability, fairness and bias and accessibility, instructional sensitivity, utility, and reporting issues. We also present recommendations on building capacity to support schools'…
Descriptors: Multiple Choice Tests, Test Items, Benchmarking, Educational Assessment

Cole, Nancy S.; Zieky, Michael J. – Journal of Educational Measurement, 2001
Proposes additional ways for people in the measurement profession to think about the fairness of assessments and about the fairness of the uses of assessments. Suggests that measurement professionals must pay more attention to reducing group differences at the design stage of test development, to providing all examinees an opportunity to…
Descriptors: Educational Testing, Equal Education, Groups, Test Bias

Harnisch, Delwyn L. – Journal of Educational Measurement, 1983
The Student-Problem (S-P) methodology is described using an example of 24 students on a test of 44 items. Information based on the students' test score and the modified caution index is put to diagnostic use. A modification of the S-P methodology is applied to domain-referenced testing. (Author/CM)
Descriptors: Academic Achievement, Educational Practices, Item Analysis, Responses

Henderson, Metta Lou – American Journal of Pharmaceutical Education, 1984
The uses, advantages and disadvantages, preparation, and scoring of essay tests and oral tests are outlined and discussed, and sample questions of each type oriented to pharmaceutical instruction are provided. (MSE)
Descriptors: Essay Tests, Higher Education, Pharmaceutical Education, Scoring
Merz, William R. – 1980
Several methods of assessing test item bias are described, and the concept of fair use of tests is examined. A test item is biased if individuals of equal ability have different probabilities of answering the item correctly. The following seven general procedures used to examine test items for bias are summarized and discussed: (1) analysis of…
Descriptors: Comparative Analysis, Evaluation Methods, Factor Analysis, Mathematical Models

Dolinsky, Donna; Reid, Vincent E. – American Journal of Pharmaceutical Education, 1984
Cognitive learning and cognitive measures are defined and various types of objective measures of cognitive learning are discussed and compared, including short answer test items, true-false items, multiple choice items, matching items, and written simulations. (MSE)
Descriptors: Cognitive Tests, Comparative Analysis, Higher Education, Measurement Techniques
Alaska State Dept. of Education, Juneau. – 1999
This booklet is an explanation of what the Alaska High School Graduation Qualifying Examination means to Alaskans and how it fits into a larger school accountability reform initiative. The high school class of 2002 is the first group of students who will need to pass the High School Graduation Qualifying Examination to receive a high school…
Descriptors: Accountability, Achievement Tests, Educational Change, Exit Examinations
Margolis, Neal – Performance and Instruction, 1989
Explores various issues related to empirical validation of computer documentation. Topics discussed include developing validation test items; selecting test users; costs; benefits of testing for the writer; the role of the writer and editors; conducting the validation test; and writing revision specifications. (LRW)
Descriptors: Authors, Computer Software, Costs, Editors

Snow, Catherine E.; And Others – Journal of Research in Childhood Education, 1995
Reports on a battery of oral language and early literacy tests, called the SHELL. Describes the tests, presents tasks and scoring system, and provides information about performance by participants in the Home-School Study of Language and Literacy Development. Descriptive, correlational, and predictive analyses based on SHELL-K (kindergarten) and…
Descriptors: Early Childhood Education, Emergent Literacy, Grade 1, Kindergarten
Dodds, Jeffrey – 1999
Basic precepts for test development are described and explained as they are presented in measurement textbooks commonly used in the fields of education and psychology. The five building blocks discussed as the foundation of well-constructed tests are: (1) specification of purpose; (2) standard conditions; (3) consistency; (4) validity; and (5)…
Descriptors: Difficulty Level, Educational Research, Grading, Higher Education