Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 2 |
Since 2006 (last 20 years) | 8 |
Descriptor
Source
Author
Publication Type
Reports - Descriptive | 30 |
Journal Articles | 20 |
Opinion Papers | 6 |
Speeches/Meeting Papers | 2 |
Collected Works - Serials | 1 |
Dissertations/Theses -… | 1 |
Reference Materials -… | 1 |
Reports - Evaluative | 1 |
Tests/Questionnaires | 1 |
Education Level
Elementary Secondary Education | 2 |
Higher Education | 2 |
Audience
Practitioners | 2 |
Counselors | 1 |
Researchers | 1 |
Teachers | 1 |
Location
United Kingdom | 2 |
California | 1 |
Vermont | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 2 |
Assessments and Surveys
Iowa Tests of Basic Skills | 1 |
National Assessment of… | 1 |
What Works Clearinghouse Rating
Zumbo, Bruno D.; Hubley, Anita M. – Assessment in Education: Principles, Policy & Practice, 2016
Ultimately, measures in research, testing, assessment and evaluation are used, or have implications, for ranking, intervention, feedback, decision-making or policy purposes. Explicit recognition of this fact brings the often-ignored and sometimes maligned concept of consequences to the fore. Given that measures have personal and social…
Descriptors: Testing Programs, Testing Problems, Measurement Techniques, Student Evaluation
Ferrara, Steve – Educational Measurement: Issues and Practice, 2017
Test security is not an end in itself; it is important because we want to be able to make valid interpretations from test scores. In this article, I propose a framework for comprehensive test security systems: prevention, detection, investigation, and resolution. The article discusses threats to test security, roles and responsibilities, rigorous…
Descriptors: Testing Programs, Educational Practices, Educational Policy, Program Improvement
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Popham, W. James – Phi Delta Kappan, 2014
The tests we use to evaluate student achievement may well be sound measures of what students know, but they are faulty indicators at best of how well they have been taught. A remedy to this this situation of judging teachers by the performance of their students on high-stakes tests may be in hand already. We should look to the methods successfully…
Descriptors: High Stakes Tests, Academic Achievement, Teacher Evaluation, Evaluation Methods
Geisinger, Kurt F. – International Journal of Testing, 2012
This article sets the stage for the description of a variety of approaches to test reviewing worldwide. It describes the importance of test reviewing as a protection of the public and of society and also the benefits of this activity for test users, who must choose measures to use in particular situations with particular clients at a particular…
Descriptors: Test Reviews, Evaluation Methods, Evaluation Criteria, Global Approach
Tanner, John R. – School Administrator, 2011
State test scores administered for accountability purposes are regularly used to adjust instruction in nuanced ways. This is no accident--No Child Left Behind demanded that students' scores be returned quickly to teachers in order that this might be the case, and the idea of data-driven decision making continues as one way the promise of education…
Descriptors: Federal Legislation, Standardized Tests, Educational Change, Decision Making
Wright, Robert E. – College Student Journal, 2010
The use of standardized tests for outcome assessment has grown dramatically in recent years. Two driving factors have been the No Child Left Behind legislation, and the increase in outcome assessment measures by accrediting agencies such as AACSB, the international accrediting body for business schools. Despite the growth in usage, little effort…
Descriptors: College Outcomes Assessment, Educational Testing, Standardized Tests, Accreditation (Institutions)

Burton, Richard F. – Assessment & Evaluation in Higher Education, 2001
Item-discrimination indices are numbers calculated from test data that are used in assessing the effectiveness of individual test questions. This article asserts that the indices are so unreliable as to suggest that countless good questions may have been discarded over the years. It considers how the indices, and hence overall test reliability,…
Descriptors: Guessing (Tests), Item Analysis, Test Reliability, Testing Problems

Lichtenstein, Robert – Exceptional Children, 1982
The need for reliable, valid, and economical preschool screening measures to identify psychoeducational problems as mandated by P.L. 94-142 (the Education for All Handicapped Child Act) led to the development of the Minneapolis Preschool Screening Instrument (MPSI). (SW)
Descriptors: Disabilities, Disability Identification, Early Identification, Preschool Education

Brown, Jonathan R. – Language, Speech, and Hearing Services in Schools, 1989
The importance of using the standard error of measurement (SEm) in determining reliability in test scores is emphasized. The SEm is compared to the hypothetical true score for standardized tests, and procedures for calculation of the SEm are explained. (JDD)
Descriptors: Elementary Secondary Education, Error of Measurement, Scores, Standardized Tests
Cooper, Terence H. – Journal of Agronomic Education (JAE), 1988
Describes a study used to determine differences in exam reliability, difficulty, and student evaluations. Indicates that when a fourth option was added to the three-option items, the exams became more difficult. Includes methods, results discussion, and tables on student characteristics, whole test analyses, and selected items. (RT)
Descriptors: Agronomy, College Science, Error of Measurement, Evaluation Methods
Phillips, Gary W., Ed. – 1996
Recently, there has been a significant expansion in the use of performance assessment in large scale testing programs. Although there has been significant support from curriculum and policy stakeholders, the technical feasibility of large scale performance assessments has remained a question. This report is intended to contribute to the debate by…
Descriptors: Comparative Analysis, Generalizability Theory, Performance Based Assessment, Psychometrics
Terry, Julie – 1996
Year after year, students, teachers, administrators, politicians, and parents are faced with the dilemma of reading assessment. Sheila Valencia (1990) feels that reading assessment has become a hot topic because the outcome tends to differ from school to school. Evaluations should be authentic and trustworthy. Roger Farr (1992) has noted that the…
Descriptors: Academic Achievement, Elementary Secondary Education, Evaluation Methods, Performance Based Assessment

Moy, Raymond H. – System, 1984
Discusses the problems associated with "grading on a curve," the approach often used for standard setting on language proficiency tests. Proposes four main steps presented in the setting of a non-arbitrary cut-score. These steps not only establish a proficiency standard checked by external criteria, but also check to see that the test covers the…
Descriptors: Cloze Procedure, Correlation, Dictation, English (Second Language)

Holmes, Susan E. – Evaluation and the Health Professions, 1986
A specific application of test equating is described, namely that of credentialing examination programs in the health professions. Considered are: (1) the role of test equating in the credentialing process; and (2) the issues that must be considered when implementing test equating in a credentialing examination program. (Author/LMO)
Descriptors: Certification, Credentials, Data Collection, Equated Scores
Previous Page | Next Page ยป
Pages: 1 | 2