Haberman, Shelby J. – ETS Research Report Series, 2005
In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean-squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…
Descriptors: Scores, Test Items, Error of Measurement, Computation
Leonard, Tom; Novick, Melvin R. – 1985
This proposal attempts to follow in Allan Birnbaum's tradition by using Bayesian ideas to show that his mental test model possesses even broader applicability than previously realized. Birnbaum's two significant contributions to the theories of statistics and educational testing are: (1) the proof that the sufficiency and conditionality principles…
Descriptors: Bayesian Statistics, Cognitive Measurement, Estimation (Mathematics), Latent Trait Theory
Reckase, Mark D.; McKinley, Robert L. – 1984
The purpose of this paper is to present a generalization of the concept of item difficulty to test items that measure more than one dimension. Three common definitions of item difficulty were considered: the proportion of correct responses for a group of individuals; the probability of a correct response to an item for a specific person; and the…
Descriptors: Difficulty Level, Item Analysis, Latent Trait Theory, Mathematical Models
Cliff, Norman – 1984
In almost all applications of measurement there is some sort of response by a human subject. Almost always, the response scale is ordinal, but almost always it is treated as if it were an interval measure. Methods for treating data ordinally are currently being developed in three areas: ordinal analysis for questionnaire responses, ordinal…
Descriptors: Multiple Regression Analysis, Questionnaires, Research Problems, Scores
Downing, Steven M.; Mehrens, William A. – 1978
Four criterion-referenced reliability coefficients were compared to the Kuder-Richardson estimates and to each other. The Kuder-Richardson formulas 20 and 21, the Livingston, the Subkoviak and two Huynh coefficients were computed for a random sample of 33 criterion-referenced tests. The Subkoviak coefficient yielded the highest mean value;…
Descriptors: Career Development, Comparative Analysis, Criterion Referenced Tests, Factor Analysis
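For readers unfamiliar with the two Kuder-Richardson estimates named in this abstract, the following is a minimal sketch of the textbook KR-20 and KR-21 formulas for dichotomously scored items. It is not taken from the report itself: the function names, the use of the population variance of total scores, and the toy response matrix are all assumptions made for illustration.

```python
from statistics import mean, pvariance


def kr20(responses):
    """KR-20 for a list of examinee rows, each a list of 0/1 item scores."""
    k = len(responses[0])                      # number of items
    totals = [sum(row) for row in responses]   # total score per examinee
    var_total = pvariance(totals)              # assumption: population variance
    # p[i] is the proportion of examinees answering item i correctly.
    p = [mean(row[i] for row in responses) for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def kr21(responses):
    """KR-21: like KR-20 but assumes all items are equally difficult."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    m = mean(totals)
    var_total = pvariance(totals)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * var_total))


# Hypothetical 5 examinees x 4 items, invented for this example.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(kr20(data), kr21(data))
```

KR-21 needs only the mean and variance of total scores, which is why it never exceeds KR-20 on the same data; the two agree only when every item has the same difficulty.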
Huynh, Huynh – 1979
A general framework for making mastery/nonmastery decisions based on multivariate test data is described in this study. Overall, mastery is granted (or denied) if the posterior expected loss associated with such action is smaller than the one incurred by the denial (or grant) of mastery. An explicit form for the cutting contour which separates…
Descriptors: Bayesian Statistics, Cutting Scores, Error of Measurement, Mastery Tests
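The decision rule stated in this abstract (grant mastery when the posterior expected loss of granting is smaller than that of denying) can be sketched in a much simpler setting than the multivariate one the study addresses. The version below assumes a single posterior probability of true mastery and two hypothetical loss values; the function name and parameters are illustrative and do not come from the study.

```python
def grant_mastery(p_master, loss_false_grant=1.0, loss_false_deny=1.0):
    """Return True if granting mastery has the smaller posterior expected loss.

    p_master         -- posterior probability that the examinee is a true master
    loss_false_grant -- assumed loss when mastery is granted to a nonmaster
    loss_false_deny  -- assumed loss when mastery is denied to a true master
    Correct decisions are assumed to carry zero loss.
    """
    expected_loss_grant = (1.0 - p_master) * loss_false_grant
    expected_loss_deny = p_master * loss_false_deny
    return expected_loss_grant < expected_loss_deny


# With equal losses, a posterior of 0.7 favors granting mastery.
print(grant_mastery(0.7))
# Making a false grant three times as costly reverses the decision.
print(grant_mastery(0.7, loss_false_grant=3.0))
```

Under these assumptions the rule reduces to a cut on the posterior probability, at loss_false_grant / (loss_false_grant + loss_false_deny).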
Thorndike, Robert L. – 1980
In an invitational address to the Victorian Institute of Educational Research, the author discussed Bayesian theory and its relationship to the design and construction of tailored or adaptive tests. Bayesian thinking involves recognizing the role of prior probabilities and using these probabilities in combination with new data to arrive at future…
Descriptors: Adaptive Testing, Bayesian Statistics, Computer Assisted Testing, Error of Measurement
Peer reviewed: Zimmerman, Donald W. – Educational and Psychological Measurement, 1976
Using the concepts of conditional probability, conditional expectation, and conditional independence, the main results of the classical test theory model can be derived in a very few steps with minimal assumptions. The present effort explores the possibility that present classical test theories can be further condensed. (Author/RC)
Descriptors: Career Development, Correlation, Mathematical Models, Measurement
Peer reviewed: Hattie, John; Rogers, H. Jane – Journal of Educational Psychology, 1986
This article demonstrates that the usual first-order factor model is inappropriate for analyzing the factor structure of creativity and intelligence tests. An alternative model that allows for the estimation of unique covariance between the fluency and originality scores is proposed. (Author/JAZ)
Descriptors: Achievement Tests, Creativity Tests, Factor Analysis, Goodness of Fit
Peer reviewed: Yen, Wendy M. – Journal of Educational Measurement, 1986
Two methods of constructing equal-interval scales for educational achievement are discussed: Thurstone's absolute scaling method and Item Response Theory. Alternative criteria for choosing a scale are contrasted. It is argued that clearer criteria are needed for judging the appropriateness and usefulness of alternative scaling procedures.…
Descriptors: Achievement Tests, Latent Trait Theory, Mathematical Models, Scaling
Peer reviewed: Stevens, Joseph J.; Aleamoni, Lawrence M. – Educational and Psychological Measurement, 1986
Prior standardization of scores when an aggregate score is formed has been criticized. This article presents a demonstration of the effects of differential weighting of aggregate components that clarifies the need for prior standardization. The role of standardization in statistics and the use of aggregate scores in research are discussed.…
Descriptors: Correlation, Error of Measurement, Factor Analysis, Raw Scores
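The general point about differential weighting can be illustrated with a small sketch: when raw components are summed, the component with the larger spread dominates the aggregate, while standardizing first gives each component equal effective weight. This is not the article's own demonstration; the data, variable names, and helper functions below are invented for illustration.

```python
from statistics import mean, pstdev


def zscores(xs):
    """Standardize a list of scores to mean 0 and (population) SD 1."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]


def pearson(xs, ys):
    """Pearson correlation computed as the mean product of z-scores."""
    return mean(a * b for a, b in zip(zscores(xs), zscores(ys)))


# Hypothetical components: quiz scores vary little, exam scores vary a lot.
quiz = [5, 6, 7, 8, 9]
exam = [50, 70, 60, 90, 80]

raw_sum = [q + e for q, e in zip(quiz, exam)]
std_sum = [a + b for a, b in zip(zscores(quiz), zscores(exam))]

# The raw aggregate tracks the high-variance component far more closely.
print(pearson(raw_sum, quiz), pearson(raw_sum, exam))
# After standardization both components correlate equally with the aggregate.
print(pearson(std_sum, quiz), pearson(std_sum, exam))
```

In the raw sum the implicit weights are proportional to the component standard deviations, which is the effect prior standardization removes.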
Peer reviewed: Andersen, Erling B. – Psychometrika, 1985
A model for longitudinal latent structure analysis was proposed that combined the values of a latent variable at two time points in a two-dimensional latent density. The correlation coefficient between the two values of the latent variable can then be estimated. (NSF)
Descriptors: Correlation, Latent Trait Theory, Mathematical Models, Maximum Likelihood Statistics
Peer reviewed: Purves, Alan C. – Language Arts, 1986
Explores the three broad thrusts of literature curricula, noting that no test can cover all three. Discusses how the content and objectives of the literature curriculum can be specified and how test questions can be developed for the evaluation of literature comprehension. (HTH)
Descriptors: Curriculum Problems, Elementary Education, Language Arts, Literature Appreciation
Peer reviewed: Yen, Wendy M. – Psychometrika, 1983
Tau-equivalence means that two tests produce equal true scores for individuals but that the distribution of errors for the tests could be different. This paper examines the effect of performing equipercentile equating techniques on tau-equivalent tests. (JKS)
Descriptors: Equated Scores, Latent Trait Theory, Psychometrics, Scores
Peer reviewed: Bieliauskas, Vytautas J.; Farragher, John – Journal of Clinical Psychology, 1983
Administered the House-Tree-Person test to male college students (N=24) to examine the effects of varying the size of the drawing form on the scores. Results suggested that use of the drawing sheet did not have a significant influence upon the quantitative aspects of the drawing. (LLL)
Descriptors: College Students, Higher Education, Intelligence Tests, Males


