Sinharay, Sandip; Haberman, Shelby J.; Wainer, Howard – Educational and Psychological Measurement, 2011
There are several techniques that increase the precision of subscores by borrowing information from other parts of the test. These techniques have been criticized on validity grounds in several of the recent publications. In this note, the authors question the argument used in these publications and suggest both inherent limits to the validity…
Descriptors: Scores, Methods, Validity, Reliability
Berk, Ronald A. – Journal of Faculty Development, 2016
Recently, student outcomes have bubbled to the top of debates about how to evaluate teaching in community and liberal arts colleges, universities, and professional schools, but even more international attention has been riveted on how outcomes are being used to evaluate teachers and administrators K-12 (Harris, 2012; Rowen & Raudenbush, 2016;…
Descriptors: Value Added Models, Academic Achievement, Outcomes of Education, Teacher Evaluation
Qi, Sen; Mitchell, Ross E. – Journal of Deaf Studies and Deaf Education, 2012
The first large-scale, nationwide academic achievement testing program using Stanford Achievement Test (Stanford) for deaf and hard-of-hearing children in the United States started in 1969. Over the past three decades, the Stanford has served as a benchmark in the field of deaf education for assessing student academic achievement. However, the…
Descriptors: Testing Programs, Educational Testing, Deafness, Academic Achievement
Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J. – Multivariate Behavioral Research, 2010
Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…
Descriptors: Educational Testing, Scores, Reports, Psychometrics
Scherrer, Jimmy – NASSP Bulletin, 2011
The use of value-added modeling (VAM) in school accountability is expanding. However, trying to decide how to embrace VAM can be rather nettlesome. Some experts claim it is "too unreliable," causes "more harm than good," and has "a big margin for error," while other experts assert VAM is "imperfect, but…
Descriptors: Teacher Effectiveness, Accountability, Inferences, Validity
Cizek, Gregory J. – Theory Into Practice, 2009
Reliability and validity are two characteristics that must be considered whenever information about student achievement is collected. However, those characteristics--and the methods for evaluating them--differ in large-scale testing and classroom testing contexts. This article presents the distinctions between reliability and validity in the two…
Descriptors: Academic Achievement, Validity, Measures (Individuals), Reliability
Sinharay, Sandip; Haberman, Shelby J. – Measurement: Interdisciplinary Research and Perspectives, 2009
In this commentary, the authors discuss some of the issues regarding the use of diagnostic classification models that practitioners should keep in mind. In the authors' experience, these issues are not as well known as they should be. The authors then provide recommendations on diagnostic scoring.
Descriptors: Scoring, Reliability, Validity, Classification
Stobart, Gordon – Educational Research, 2009
Background: Validity is a central concern in any assessment, though this has often not been made explicit in the UK assessment context. This article applies current validity theorising, largely derived from American formulations, to national curriculum assessments in England. Purpose: The aim is to consider validity arguments in relation to the…
Descriptors: National Curriculum, Foreign Countries, Elementary Secondary Education, Educational Policy
Gray, B. Thomas – 1997
Validity is a critically important issue with far-reaching implications for testing. The history of conceptualizations of validity over the past 50 years is reviewed, and 3 important areas of controversy are examined. First, the question of whether the three traditionally recognized types of validity should be integrated as a unitary entity of…
Descriptors: Educational Testing, Evaluation Methods, Reliability, Scores
Haberman, Shelby J. – ETS Research Report Series, 2008
In educational testing, subscores may be provided based on a portion of the items from a larger test. One consideration in evaluation of such subscores is their ability to predict a criterion score. Two limitations on prediction exist. The first, which is well known, is that the coefficient of determination for linear prediction of the criterion…
Descriptors: Scores, Validity, Educational Testing, Correlation
Popham, W. James – Educational Research, 2009
Against a shifting set of assessment preferences in the US regarding whether educational assessment should continue to be a states' rights game or become a federally dominated undertaking, the publication of five first-rate analyses about England's national curriculum assessment (NCA) is particularly propitious. Taken together, these five papers…
Descriptors: National Curriculum, States Powers, Educational Assessment, Foreign Countries
Denham, Thomas J. – 2002
This paper describes the Myers-Briggs Type Indicator (MBTI), developed by I. Myers and K. Briggs (1940s) to assess personality type. Based on Jungian theory, the MBTI has become a tool for identifying the 16 different patterns of action into which every person fits. The 16 personality types are based on patterns of: (1) extraversion-introversion;…
Descriptors: Educational Testing, Personality Assessment, Personality Measures, Personality Traits
American Educational Research Association, Washington, DC. – 1999
The standards outlined in this book have been developed to provide criteria for the evaluation of tests, testing practices, and the effects of test use. The "Standards" provides a frame of reference to ensure that relevant issues are addressed. The first part of the book, "Test Construction, Evaluation, and Documentation,"…
Descriptors: Educational Testing, Evaluation Methods, Psychological Testing, Reliability
Green, Sylvia; Oates, Tim – Educational Research, 2009
Background: In this article we address some of the challenges posed by the development of national assessment systems and discuss the need for high quality information on trends in attainment; support for school improvement processes and ways in which learning should be enhanced through valid assessment. Purpose: Key elements are explored,…
Descriptors: Educational Objectives, National Standards, Educational Quality, Educational Change
Mills, Janet – Bulletin of the Council for Research in Music Education, 1987 (peer reviewed)
Questions the extent to which assessment of solo musical performance can be made under the General Certificate of Secondary Education exam in England and Wales. Discusses performances as criterion. Reports on an experiment which attempted to assess a student's overall music performance. Offers a model which can be used to better measure solo music…
Descriptors: Educational Research, Educational Testing, Foreign Countries, Interrater Reliability
