Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 1 |
Since 2006 (last 20 years) | 12 |
Descriptor
Test Theory | 93 |
Testing Problems | 93 |
Test Construction | 33 |
Test Reliability | 25 |
Test Validity | 25 |
Test Interpretation | 24 |
Test Items | 24 |
Measurement Techniques | 21 |
Test Use | 21 |
Educational Testing | 18 |
Criterion Referenced Tests | 16 |
Audience
Researchers | 10 |
Practitioners | 5 |
Teachers | 2 |
Counselors | 1 |
Students | 1 |
Location
United Kingdom | 4 |
United Kingdom (England) | 3 |
United States | 3 |
Canada | 2 |
Netherlands | 2 |
United Kingdom (Wales) | 2 |
Australia | 1 |
Israel | 1 |
Sweden | 1 |
Texas | 1 |
United Kingdom (Northern… | 1 |
Laws, Policies, & Programs
Individuals with Disabilities… | 1 |
No Child Left Behind Act 2001 | 1 |
Salmani Nodoushan, Mohammad Ali – Online Submission, 2021
This paper follows a line of logical argumentation to claim that what Samuel Messick conceptualized about construct validation has probably been misunderstood by some educational policy makers, practicing educators, and classroom teachers. It argues that, while Messick's unified theory of test validation aimed at (a) warning educational…
Descriptors: Construct Validity, Test Theory, Test Use, Affordances
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Kettler, Ryan J. – Review of Research in Education, 2015
This chapter introduces theory that undergirds the role of testing adaptations in assessment, provides examples of item modifications and testing accommodations, reviews research relevant to each, and introduces a new paradigm that incorporates opportunity to learn (OTL), academic enablers, testing adaptations, and inferences that can be made from…
Descriptors: Meta Analysis, Literature Reviews, Testing, Testing Accommodations
Saladin, Shawn P.; Reid, Christine; Shiels, John – Rehabilitation Research, Policy, and Education, 2011
The Commission on Rehabilitation Counselor Certification (CRCC) has taken a proactive stance on perceived test inequities of the Certified Rehabilitation Counselor (CRC) exam as it relates to people who are prelingually deaf and hard of hearing. This article describes the process developed and implemented by the CRCC to help maximize test equity…
Descriptors: Test Items, Rehabilitation Counseling, Counselor Certification, Deafness
van Rijn, P. W.; Beguin, A. A.; Verstralen, H. H. F. M. – Assessment in Education: Principles, Policy & Practice, 2012
While measurement precision is relatively easy to establish for single tests and assessments, it is much more difficult to determine for decision making with multiple tests on different subjects. This latter is the situation in the system of final examinations for secondary education in the Netherlands and is used as an example in this paper. This…
Descriptors: Secondary Education, Tests, Foreign Countries, Decision Making
Cresswell, Mike – Measurement: Interdisciplinary Research and Perspectives, 2010
Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance."…
Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis
Solano-Flores, Guillermo; Backhoff, Eduardo; Contreras-Nino, Luis Angel – International Journal of Testing, 2009
In this article, we present a theory of test translation whose intent is to provide the conceptual foundation for effective, systematic work in the process of test translation and test translation review. According to the theory, translation error is multidimensional; it is not simply the consequence of defective translation but an inevitable fact…
Descriptors: Test Items, Investigations, Semantics, Translation
Newton, Paul E. – Measurement: Interdisciplinary Research and Perspectives, 2010
This article presents the author's rejoinder to thinking about linking from issue 8(1). Particularly within the more embracing linking frameworks, e.g., Holland & Dorans (2006) and Holland (2007), there appears to be a major disjunction between (1) classification discourse: the supposed basis for classification, that is, the underlying theory…
Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis
Baird, Jo-Anne – Measurement: Interdisciplinary Research and Perspectives, 2010
Newton's article (2010) makes three main contributions to the literature. First, it is transatlantic, bringing together literatures that have been dealing with similar problems, using sometimes different methods and certainly with distinctive educational, cultural perspectives. He points out that neither of these literatures has all of the…
Descriptors: Foreign Countries, Predictive Validity, Standards, Ethics
von Davier, Alina A. – Measurement: Interdisciplinary Research and Perspectives, 2010
The article "Thinking About Linking" by Newton (2010) presents a novel philosophical perspective on the way that educational assessments should be linked. Newton starts by describing the linking framework as it was characterized in various publications and identifies a cross-cultural dimension in the definitions and uses of test…
Descriptors: Foreign Countries, Educational Assessment, Student Evaluation, Evaluation Criteria
van der Linden, Wim J.; Breithaupt, Krista; Chuah, Siang Chee; Zhang, Yanwei – Journal of Educational Measurement, 2007
A potential undesirable effect of multistage testing is differential speededness, which happens if some of the test takers run out of time because they receive subtests with items that are more time intensive than others. This article shows how a probabilistic response-time model can be used for estimating differences in time intensities and speed…
Descriptors: Adaptive Testing, Evaluation Methods, Test Items, Reaction Time
Engelhard, George, Jr. – 1988
The purpose of this essay is to describe the principles of educational measurement proposed by B. Wood during the 1920s in his dissertation, written under the direction of E. L. Thorndike, and later published as "Measurement in Higher Education" (1923). These principles were selected because they illustrate one of the earliest and most complete…
Descriptors: Educational History, Educational Testing, Test Theory, Testing Problems
Helms, LuAnn Sherbeck – 1999
This paper discusses the fact that reliability is about scores and not tests and how reliability limits effect sizes. The paper also explores the classical reliability coefficients of stability, equivalence, and internal consistency. Stability is concerned with how stable test scores will be over time, while equivalence addresses the relationship…
Descriptors: Effect Size, Meta Analysis, Reliability, Scores

Secolsky, Charles – Journal of Educational Measurement, 1987
For measuring the face validity of a test, Nevo suggested that test takers and nonprofessional users rate items on a five-point scale. This article questions the ability of those raters and the credibility of the aggregated judgment as evidence of the validity of the test. (JAZ)
Descriptors: Content Validity, Measurement Techniques, Rating Scales, Test Items

Yarnold, Paul R. – Educational and Psychological Measurement, 1984
Unreliable profiles impose the difficulty that ordinal and interval relations among the individual's scores become uncertain or unstable. A profile reliability coefficient is derived to estimate the relative expected extent of this ordinal and interval "inversion" for any profile of K measures. (Author/DWH)
Descriptors: Error of Measurement, Mathematical Models, Profiles, Test Reliability