Egelston, Richard L.; Heinegg, Rosemarie – 1984
The stability of annual normal curve equivalent (NCE) scores on the Iowa Test of Basic Skills over a period of six years was established in a previous study of status (position within a reference group). This study investigated the relationship between status stability and student achievement level. Third-grade students in a suburban school district who…
Descriptors: Academic Achievement, Achievement Tests, Basic Skills, Cohort Analysis
Wilcox, Rand R. – 1977
Three statistical problems related to criterion-referenced testing are investigated: estimation of the likelihood of a false-positive or false-negative decision with a mastery test, estimation of true scores in the Compound Binomial Error Model, and comparison of the examinees to a control. Two methods for estimating the likelihood of…
Descriptors: Criterion Referenced Tests, Cutting Scores, Error Patterns, Item Sampling
Jovick, Thomas D. – 1979
The federally funded longitudinal field study called Management Implications of Team Teaching (MITT) required a search for an appropriate strategy for analyzing through-time relationships among selected variables. The MITT project used questionnaires and interviews to collect data concerning the work, governance, attitudes, and orientation of…
Descriptors: Analysis of Variance, Data Analysis, Elementary Education, Field Studies
Marco, Gary L.; And Others – 1979
Data from the verbal portion of the College Entrance Examination Board Scholastic Aptitude Tests were used in an experimental test of the accuracy of equating for a variety of models in three categories: linear equating, equipercentile equating, and item characteristic curve equating. The models were tested for both mean squared error and bias.…
Descriptors: Aptitude Tests, Equated Scores, Error of Measurement, High Schools
Morse, David T.; Morse, Linda W. – 1976
Performance testing often entails the use of expensive, time-consuming measures to determine the level of performance on some desired behavior. It is concluded that a generalizability theory approach to dealing with departures from reality in testing can aid in the establishment of empirically based choices of measurement…
Descriptors: Cost Effectiveness, Decision Making, Mathematical Models, Measurement Techniques
Moore, William E. – 1970
The previous theoretical development of the Poisson process as a strong model for the true-score theory of mental tests is discussed, and additional theoretical properties of the model from the standpoint of individual examinees are developed. The paper introduces the Erlang process as a family of test theory models and shows in the context of…
Descriptors: Arithmetic, Goodness of Fit, Grade 10, Mathematical Models
Li, Yuan H. – 2001
The primary objective of this study was to examine the construct validity of two multiple-content testing programs, the multiple-choice Comprehensive Tests of Basic Skills (CTBS/5) and the performance-based Maryland School Performance Assessment Program (MSPAP), by evaluating the true-score longitudinal associations among multiple-content scores…
Descriptors: Achievement Tests, Construct Validity, Correlation, Elementary Education
Kriewall, Thomas E. – Illinois School Research, 1972
The author discusses and defines criterion tests in the context of the classroom needs that have created much of the current interest in the theory. The primary source of interest is the growing implementation of individualized curricula. (Author/CB)
Descriptors: Criterion Referenced Tests, Difficulty Level, Individualized Instruction, Item Analysis
Peer reviewed: Werts, Charles E.; Linn, Robert L. – Educational and Psychological Measurement, 1971
Descriptors: Analysis of Covariance, Correlation, Educational Environment, Error of Measurement
Peer reviewed: Mellenbergh, Gideon J.; van der Linden, Wim J. – Applied Psychological Measurement, 1979
For six tests, coefficient delta as an index for internal optimality is computed. Internal optimality is defined as the magnitude of risk of the decision procedure with respect to the true score. Results are compared with an alternative index (coefficient kappa) for assessing the consistency of decisions. (Author/JKS)
Descriptors: Classification, Comparative Analysis, Decision Making, Error of Measurement
Peer reviewed: Brennan, Robert L. – Journal of Educational Statistics, 1991
The monograph by D. Rogosa and G. Ghandour represents a body of cohesive and comprehensive research that can be the basis of a new measurement theory combining features of generalizability theory and strong true-score theory. Principles, approaches, arguments, and conclusions are reviewed; and critical comments are offered. (SLD)
Descriptors: Behavior Patterns, Behavioral Science Research, Classroom Observation Techniques, Elementary Secondary Education
Peer reviewed: Rogosa, David; Ghandour, Ghassan – Journal of Educational Statistics, 1991
Issues raised with the statistical models developed are discussed point by point, restating the emphasis on finite observation time, and reiterating the criticism of traditional psychometric methods. It is noted that the language and technical formulation of psychometrics can be extremely awkward in dealing with biased estimates. (SLD)
Descriptors: Behavior Patterns, Behavioral Science Research, Classroom Observation Techniques, Elementary Secondary Education
Peer reviewed: Kopriva, Rebecca J.; Shaw, Dale G. – Educational and Psychological Measurement, 1991
The degree to which reliability affects the power of analysis of variance (ANOVA) tests involving one factor with two and three samples was quantified and tabulated by taking into account sample size, level of significance, and true score effect size. Results confirm a substantial effect on power. (SLD)
Descriptors: Analysis of Variance, Effect Size, Equations (Mathematics), Estimation (Mathematics)
Cizek, Gregory J.; Husband, Timothy H. – 1997
The contrasting groups method is one of many possible methods for setting passing scores. The most commonly used method is probably that developed by W. H. Angoff (1971), but it has been suggested that the Angoff method may not be appropriate for many standard setting applications in education. The contrasting groups method is explored as an…
Descriptors: Cutting Scores, Educational Research, Educational Testing, Judges
Young, Michael James; Yoon, Bokhee – 1998
An important feature of recent large-scale performance assessments has been the reporting of pupil and school performance in terms of performance or proficiency categories. When an assessment uses such ordered categories as the primary means of reporting results, the natural way of reporting on the quality of the assessment is through the…
Descriptors: Academic Achievement, Academic Standards, Classification, Criterion Referenced Tests


