Schumacker, Randall E. – 1998
In comparing measurement theories, it is evident that the awareness of the concept of measurement error during the time of Galileo has led to the formulation of observed scores comprising a true score and error (classical theory), universe score and various random error components (generalizability theory), or individual latent ability and error…
Descriptors: Comparative Analysis, Computer Software, Error of Measurement, Generalizability Theory
Haberman, Shelby J. – ETS Research Report Series, 2005
In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean-squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…
Descriptors: Scores, Test Items, Error of Measurement, Computation
Leonard, Tom; Novick, Melvin R. – 1985
This proposal attempts to follow in Allan Birnbaum's tradition by using Bayesian ideas to show that his mental test model possesses even broader applicability than previously realized. Birnbaum's two significant contributions to the theories of statistics and educational testing are: (1) the proof that the sufficiency and conditionality principles…
Descriptors: Bayesian Statistics, Cognitive Measurement, Estimation (Mathematics), Latent Trait Theory
Reckase, Mark D.; McKinley, Robert L. – 1984
The purpose of this paper is to present a generalization of the concept of item difficulty to test items that measure more than one dimension. Three common definitions of item difficulty were considered: the proportion of correct responses for a group of individuals; the probability of a correct response to an item for a specific person; and the…
Descriptors: Difficulty Level, Item Analysis, Latent Trait Theory, Mathematical Models
Cliff, Norman – 1984
In almost all applications of measurement there is some sort of response by a human subject. Almost always, the response scale is ordinal, but almost always it is treated as if it were an interval measure. Methods for treating data ordinally are currently being developed in three areas: ordinal analysis for questionnaire responses, ordinal…
Descriptors: Multiple Regression Analysis, Questionnaires, Research Problems, Scores
Downing, Steven M.; Mehrens, William A. – 1978
Four criterion-referenced reliability coefficients were compared to the Kuder-Richardson estimates and to each other. The Kuder-Richardson formulas 20 and 21, the Livingston, the Subkoviak and two Huynh coefficients were computed for a random sample of 33 criterion-referenced tests. The Subkoviak coefficient yielded the highest mean value;…
Descriptors: Career Development, Comparative Analysis, Criterion Referenced Tests, Factor Analysis
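For readers unfamiliar with the Kuder-Richardson estimates named in the abstract above, the following is a minimal sketch of formulas 20 and 21 for dichotomously scored (0/1) items. It is not taken from the paper: the function names `kr20` and `kr21`, the use of the population variance of total scores, and the toy response matrix are assumptions made for illustration.

```python
from statistics import mean, pvariance


def kr20(responses):
    """KR-20 for rows of 0/1 item scores (one row per examinee).

    Assumption: total-score variance is the population variance.
    """
    k = len(responses[0])                      # number of items
    totals = [sum(row) for row in responses]   # total score per examinee
    var_total = pvariance(totals)
    # Proportion answering each item correctly.
    p = [mean(row[j] for row in responses) for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def kr21(responses):
    """KR-21: like KR-20, but treats all items as equally difficult."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    m = mean(totals)
    var_total = pvariance(totals)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * var_total))


# Hypothetical data: 5 examinees, 4 items.
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(toy), 3), round(kr21(toy), 3))
```

Because KR-21 replaces the item-level term with one based on the mean total score, it cannot exceed KR-20 on the same data; the two agree only when every item has the same difficulty.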

Huynh, Huynh – 1979
A general framework for making mastery/nonmastery decisions based on multivariate test data is described in this study. Overall, mastery is granted (or denied) if the posterior expected loss associated with such action is smaller than the one incurred by the denial (or grant) of mastery. An explicit form for the cutting contour which separates…
Descriptors: Bayesian Statistics, Cutting Scores, Error of Measurement, Mastery Tests
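The decision rule stated in the abstract above can be sketched in a much simpler form than the multivariate framework the study describes. The sketch below assumes a single posterior probability of mastery is already available and that only wrong decisions carry a loss; the function name `mastery_decision` and the two loss parameters are hypothetical, not the author's notation.

```python
def mastery_decision(p_mastery, loss_false_grant=1.0, loss_false_deny=1.0):
    """Grant mastery only if doing so has the smaller posterior expected loss.

    p_mastery: assumed posterior probability that the examinee is a master.
    loss_false_grant: loss for granting mastery to a nonmaster (assumed).
    loss_false_deny: loss for denying mastery to a master (assumed).
    """
    # Granting is wrong only when the examinee is a nonmaster.
    expected_loss_grant = (1 - p_mastery) * loss_false_grant
    # Denying is wrong only when the examinee is a master.
    expected_loss_deny = p_mastery * loss_false_deny
    return "grant" if expected_loss_grant < expected_loss_deny else "deny"


print(mastery_decision(0.7))                        # equal losses
print(mastery_decision(0.7, loss_false_grant=3.0))  # false grants cost more
```

With equal losses the rule reduces to granting mastery when the posterior probability exceeds one half; raising the cost of a false grant moves that threshold upward.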
Thorndike, Robert L. – 1980
In an invitational address to the Victorian Institute of Educational Research, the author discussed Bayesian theory and its relationship to the design and construction of tailored or adaptive tests. Bayesian thinking involves recognizing the role of prior probabilities and using these probabilities in combination with new data to arrive at future…
Descriptors: Adaptive Testing, Bayesian Statistics, Computer Assisted Testing, Error of Measurement

Zimmerman, Donald W. – Educational and Psychological Measurement, 1976
Using the concepts of conditional probability, conditional expectation, and conditional independence, the main results of the classical test theory model can be derived in a very few steps with minimal assumptions. The present effort explores the possibility that present classical test theories can be further condensed. (Author/RC)
Descriptors: Career Development, Correlation, Mathematical Models, Measurement

Hattie, John; Rogers, H. Jane – Journal of Educational Psychology, 1986
This article demonstrates that the usual first-order factor model is inappropriate for analyzing the factor structure of creativity and intelligence tests. An alternative model that allows for the estimation of unique covariance between the fluency and originality scores is proposed. (Author/JAZ)
Descriptors: Achievement Tests, Creativity Tests, Factor Analysis, Goodness of Fit

Yen, Wendy M. – Journal of Educational Measurement, 1986
Two methods of constructing equal-interval scales for educational achievement are discussed: Thurstone's absolute scaling method and Item Response Theory. Alternative criteria for choosing a scale are contrasted. It is argued that clearer criteria are needed for judging the appropriateness and usefulness of alternative scaling procedures.…
Descriptors: Achievement Tests, Latent Trait Theory, Mathematical Models, Scaling

Stevens, Joseph J.; Aleamoni, Lawrence M. – Educational and Psychological Measurement, 1986
Prior standardization of scores when an aggregate score is formed has been criticized. This article presents a demonstration of the effects of differential weighting of aggregate components that clarifies the need for prior standardization. The role of standardization in statistics and the use of aggregate scores in research are discussed.…
Descriptors: Correlation, Error of Measurement, Factor Analysis, Raw Scores
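The point about prior standardization in the abstract above can be illustrated with a short sketch. It is not the article's demonstration: the helper names `standardize` and `aggregate`, the equal weighting, and the quiz and exam scores are assumptions chosen to show how a component with a larger spread dominates an unstandardized sum.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw scores to z-scores (mean 0, population SD 1)."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]


def aggregate(components, standardized=True):
    """Sum component scores per examinee, optionally standardizing first."""
    if standardized:
        components = [standardize(c) for c in components]
    return [sum(values) for values in zip(*components)]


# Hypothetical scores for 5 examinees on two components.
quiz = [1, 2, 3, 4, 5]        # small spread
exam = [50, 10, 30, 20, 40]   # large spread

print(aggregate([quiz, exam], standardized=False))
print([round(z, 2) for z in aggregate([quiz, exam])])
```

In the raw sum the ordering of examinees follows the exam almost exactly, because its variance is far larger than the quiz's; after standardization each component contributes on the same scale.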

Andersen, Erling B. – Psychometrika, 1985
A model for longitudinal latent structure analysis was proposed that combined the values of a latent variable at two time points in a two-dimensional latent density. The correlation coefficient between the two values of the latent variable can then be estimated. (NSF)
Descriptors: Correlation, Latent Trait Theory, Mathematical Models, Maximum Likelihood Statistics

Purves, Alan C. – Language Arts, 1986
Explores the three broad thrusts of literature curricula, noting that no test can cover all three. Discusses how the content and objectives of the literature curriculum can be specified and how test questions can be developed for the evaluation of literature comprehension. (HTH)
Descriptors: Curriculum Problems, Elementary Education, Language Arts, Literature Appreciation

Yen, Wendy M. – Psychometrika, 1983
Tau-equivalence means that two tests produce equal true scores for individuals but that the distribution of errors for the tests could be different. This paper examines the effect of performing equipercentile equating techniques on tau-equivalent tests. (JKS)
Descriptors: Equated Scores, Latent Trait Theory, Psychometrics, Scores