Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 2 |
Since 2006 (last 20 years) | 16 |
Descriptor
Test Interpretation | 112 |
Test Theory | 112 |
Test Validity | 34 |
Test Construction | 33 |
Test Reliability | 32 |
Scores | 26 |
Testing Problems | 24 |
Criterion Referenced Tests | 21 |
Measurement Techniques | 17 |
Psychometrics | 17 |
Standardized Tests | 17 |
More ▼ |
Source
Author
Powell, J. C. | 3 |
Tatsuoka, Kikumi K. | 3 |
Cliff, Norman | 2 |
Crocker, Linda | 2 |
Haladyna, Tom | 2 |
Hubley, Anita M. | 2 |
Lord, Frederic M. | 2 |
Lyon, Mark A. | 2 |
Mislevy, Robert J. | 2 |
Price, Gary G. | 2 |
Santee, Phillip | 2 |
More ▼ |
Publication Type
Education Level
Elementary Secondary Education | 6 |
Higher Education | 2 |
Postsecondary Education | 1 |
Location
United Kingdom | 4 |
Australia | 3 |
Canada | 3 |
United Kingdom (Great Britain) | 3 |
United States | 3 |
United Kingdom (England) | 2 |
New York | 1 |
New York (New York) | 1 |
Taiwan | 1 |
USSR | 1 |
United Kingdom (Wales) | 1 |
More ▼ |
Laws, Policies, & Programs
Elementary and Secondary… | 1 |
Individuals with Disabilities… | 1 |
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Sophie Litschwartz – Society for Research on Educational Effectiveness, 2021
Background/Context: Pass/fail standardized exams frequently selectively rescore failing exams and retest failing examinees. This practice distorts the test score distribution and can confuse those who do analysis on these distributions. In 2011, the Wall Street Journal showed large discontinuities in the New York City Regent test score…
Descriptors: Standardized Tests, Pass Fail Grading, Scoring Rubrics, Scoring Formulas
Gafni, Naomi – Assessment in Education: Principles, Policy & Practice, 2016
Naomi Gafni, director of Research and Development, National Institute for Testing and Evaluation, Jerusalem, Israel, has devoted a substantial part of her career to the development of admissions tests and other educational tests and to the investigation of their validity. As such she is keenly aware of the complexities involved in this process.…
Descriptors: Test Validity, Test Interpretation, Test Use, Test Construction
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2014
Brennan (Brennan, R. L., 2012) noted that users of test scores often want (indeed, demand) that subscores be reported, along with total test scores, for diagnostic purposes. Haberman (Haberman, S. J., 2008) suggested a method based on classical test theory (CTT) to determine if subscores have added value over the total score. According to this…
Descriptors: Scores, Test Theory, Test Interpretation
Powell, J. C. – International Association for Development of the Information Society, 2013
This reflection paper challenges current test scoring practices on the grounds that most wrong-answer selections are thoughtful not random, presenting research supporting this proposition. An alternative test scoring system is presented, described and its outcomes discussed. This new scoring system increases the number of variables considered,…
Descriptors: Test Theory, Test Interpretation, Scoring, Multiple Choice Tests
Fan, Xitao; Sun, Shaojing – Journal of Early Adolescence, 2014
In adolescence research, the treatment of measurement reliability is often fragmented, and it is not always clear how different reliability coefficients are related. We show that generalizability theory (G-theory) is a comprehensive framework of measurement reliability, encompassing all other reliability methods (e.g., Pearson "r,"…
Descriptors: Generalizability Theory, Measurement, Reliability, Correlation
Dorans, Neil J. – Educational Measurement: Issues and Practice, 2012
Views on testing--its purpose and uses and how its data are analyzed--are related to one's perspective on test takers. Test takers can be viewed as learners, examinees, or contestants. I briefly discuss the perspective of test takers as learners. I maintain that much of psychometrics views test takers as examinees. I discuss test takers as a…
Descriptors: Testing, Test Theory, Item Response Theory, Test Reliability
Yorke, Mantz; Orr, Susan; Blair, Bernadette – Studies in Higher Education, 2014
There has long been the suspicion amongst staff in Art & Design that the ratings given to their subject disciplines in the UK's National Student Survey are adversely affected by a combination of circumstances--a "perfect storm". The "perfect storm" proposition is tested by comparing ratings for Art & Design with those…
Descriptors: Student Surveys, National Surveys, Art Education, Design
Hubley, Anita M.; Zumbo, Bruno D. – Social Indicators Research, 2011
The vast majority of measures have, at their core, a purpose of personal and social change. If test developers and users want measures to have personal and social consequences and impact, then it is critical to consider the consequences and side effects of measurement in the validation process itself. The consequential basis of test interpretation…
Descriptors: Construct Validity, Social Change, Measurement, Test Interpretation
Cresswell, Mike – Measurement: Interdisciplinary Research and Perspectives, 2010
Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance."…
Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis
Elosua, Paula; Iliescu, Dragos – International Journal of Testing, 2012
Psychometric practice does not always converge with the advances of psychometric theory. In order to investigate this gap, the authors focus on the 10 most used psychological tests in Europe, as identified by recent surveys. The article analyzes test manuals published in 6 different European countries for these 10 most used tests. A total of 32…
Descriptors: Psychological Testing, Personality Measures, Error of Measurement, Foreign Countries
Newton, Paul E. – Measurement: Interdisciplinary Research and Perspectives, 2010
This article presents the author's rejoinder to thinking about linking from issue 8(1). Particularly within the more embracing linking frameworks, e.g., Holland & Dorans (2006) and Holland (2007), there appears to be a major disjunction between (1) classification discourse: the supposed basis for classification, that is, the underlying theory…
Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis
Chen, Yi-Hsin; Gorin, Joanna S.; Thompson, Marilyn S.; Tatsuoka, Kikumi K. – International Journal of Testing, 2008
As with any test administered across linguistically and culturally diverse groups, evidence suggesting the equivalence of score meaning across countries is needed for valid comparisons. The current study examines the cross-cultural equivalence of score interpretations from the Trends in International Mathematics and Science Study (TIMSS)-1999 from…
Descriptors: Construct Validity, Mathematics Tests, Foreign Countries, Equated Scores
Baird, Jo-Anne – Measurement: Interdisciplinary Research and Perspectives, 2010
Newton's article (2010) makes three main contributions to the literature. First, it is transatlantic, bringing together literatures that have been dealing with similar problems, using sometimes different methods and certainly with distinctive educational, cultural perspectives. He points out that neither of these literatures has all of the…
Descriptors: Foreign Countries, Predictive Validity, Standards, Ethics
von Davier, Alina A. – Measurement: Interdisciplinary Research and Perspectives, 2010
The article "Thinking About Linking" by Newton (2010) presents a novel philosophical perspective on the way that educational assessments should be linked. Newton starts by describing the linking framework as it was characterized in various publications and identifies a cross-cultural dimension in the definitions and uses of test…
Descriptors: Foreign Countries, Educational Assessment, Student Evaluation, Evaluation Criteria
Woolley, Kristin K. – 1996
The theory of score validity has undergone several revisions within the measurement community. The current consensus among professionals is a rejection of the trinitarian doctrine (J. P. Guion, 1980) of score validity and the recognition of a unified view that includes social consequences of test interpretation and use. While some aspects of the…
Descriptors: Models, Scores, Standards, Test Interpretation