Fendler, Lynn – Ethics and Education, 2016
In educational research that calls itself empirical, the relationship between validity and reliability is that of trade-off: the stronger the bases for validity, the weaker the bases for reliability (and vice versa). Validity and reliability are widely regarded as basic criteria for evaluating research; however, there are ethical implications of…
Descriptors: Educational Research, Ethics, Test Validity, Test Reliability

Lin, Miao-Hsiang; Hsiung, Chao A. – Psychometrika, 1992
Four bootstrap methods are identified for constructing confidence intervals for the binomial-error model. The extent to which similar results are obtained and the theoretical foundation of each method and its relevance and ranges of modeling the true score uncertainty are discussed. (SLD)
Descriptors: Bayesian Statistics, Computer Simulation, Equations (Mathematics), Estimation (Mathematics)

Eiting, Mindert H. – Applied Psychological Measurement, 1991
A method is proposed for sequential evaluation of reliability of psychometric instruments. Sample size is unfixed; a test statistic is computed after each person is sampled and a decision is made in each stage of the sampling process. Results from a series of Monte-Carlo experiments establish the method's efficiency. (SLD)
Descriptors: Computer Simulation, Equations (Mathematics), Estimation (Mathematics), Mathematical Models

Meijer, Rob R.; And Others – 1994
Three methods for the estimation of the reliability of single dichotomous items are discussed. All methods are based on the assumptions of nondecreasing and nonintersecting item response functions and the Mokken model of double monotonicity. Based on analytical and Monte Carlo studies, it is concluded that one method is superior to the other two…
Descriptors: Estimation (Mathematics), Foreign Countries, Item Response Theory, Monte Carlo Methods

Skaggs, Gary; Bourque, Mary Lyn – 1998
Political and legislative pressures have posed a number of measurement issues and challenges to the development of sound, valid voluntary national tests (VNTs). This paper focuses on what appear to be the most difficult technical issues related to the VNT proposed by President Clinton in 1997. Technical issues refer to psychometric issues, as…
Descriptors: Academic Achievement, Achievement Tests, Classification, Difficulty Level

Kane, Thomas J.; Staiger, Douglas O. – Brookings Papers on Education Policy, 2002
By the spring of 2000, forty states had begun using student test scores to rate school performance. Twenty states have gone a step further and are attaching explicit monetary rewards or sanctions to a school's test performance. In this paper, the authors focus on accountability programs in which states measure the effectiveness of individual…
Descriptors: Elementary Schools, Accountability, Scores, Risk

Albanese, Mark A. – 1985
This study reexamines results reported by Angoff and Schrader regarding formula directions and rights directions for standardized tests. In that study, it was concluded that the two scoring directions were essentially equivalent. In this study, methodological concerns are discussed and additional data analyses undertaken. Among various…
Descriptors: College Entrance Examinations, Data Interpretation, Fatigue (Biology), Guessing (Tests)

Shavelson, Richard J.; And Others – 1993
In this paper, performance assessments are cast within a sampling framework. A performance assessment score is viewed as a sample of student performance drawn from a complex universe defined by a combination of all possible tasks, occasions, raters, and measurement methods. Using generalizability theory, the authors present evidence bearing on the…
Descriptors: Academic Achievement, Educational Assessment, Error of Measurement, Evaluators