Colton, Dean A. – 1993
Tables of specifications are used to guide test developers in sampling items and maintaining consistency from form to form. This paper is a generalizability study of the American College Testing Program (ACT) Achievement Program Mathematics Test (AAP), with the content areas of the table of specifications representing multiple dependent variables.…
Descriptors: Achievement Tests, Difficulty Level, Error of Measurement, Generalizability Theory
Bernstein, Lawrence; Burstein, Nancy – 1994
The inherent methodological problem in conducting research at multiple sites is how to best derive an overall estimate of program impact across multiple sites, best being the estimate that minimizes the mean square error, that is, the square of the difference between the observed and true values. An empirical example illustrates the use of the…
Descriptors: Bias, Comprehensive Programs, Data Analysis, Data Collection
Linacre, John M. – 1990
Advantages and disadvantages of standard Rasch analysis computer programs are discussed. The unconditional maximum likelihood algorithm allows all observations to participate equally in determining the measures and calibrations to be obtained quickly from a data set. On the advantage side, standard Rasch programs can be used immediately, are…
Descriptors: Algorithms, Computer Assisted Testing, Computer Graphics, Computer Simulation
Lockwood, Robert E.; And Others – 1986
Standards, passing scores, or cut scores have been seen as an element of criterion-referenced tests since their introduction. This paper discusses at least two issues surrounding the establishment of cut scores which appear to need clarification: (1) the theoretical definition of a cut score; and (2) decisions which must be made in selecting a…
Descriptors: Criterion Referenced Tests, Cutting Scores, Error of Measurement, High Schools
Shale, Doug – 1986
This study is an attempt at a cohesive characterization of the concept of essay reliability. As such, it takes as a basic premise that previous and current practices in reporting reliability estimates for essay tests have certain shortcomings. The study provides an analysis of these shortcomings--partly to encourage a fuller understanding of the…
Descriptors: Analysis of Variance, Correlation, Error of Measurement, Essay Tests
Wise, Lauress L. – 1986
A primary goal of this study was to determine the extent to which item difficulty was related to item position and, if a significant relationship was found, to suggest adjustments to predicted item difficulty that reflect differences in item position. Item response data from the Medical College Admission Test (MCAT) were analyzed. A data set was…
Descriptors: College Entrance Examinations, Difficulty Level, Educational Research, Error of Measurement
Goldberg, Gail Lynn; Walker-Bartnick, Leslie – 1988
A scoring rubric transition study is described. It was designed to evaluate possible drift in scoring the Maryland Writing Test from year to year (when using a modified holistic scoring method), to evaluate strategies for revising scoring rubrics for narrative and explanatory writing while maintaining original scoring standards, and to establish…
Descriptors: Educational Assessment, Elementary Secondary Education, Error of Measurement, Grading
Lord, Frederic M. – 1983
If a loss function is available specifying the social cost of an error of measurement in the score on a unidimensional test, an asymptotic method, based on item response theory, is developed for optimal test design for a specified target population of examinees. Since in the real world such loss functions are not available, it is more useful to…
Descriptors: Cutting Scores, Decision Making, Error of Measurement, Estimation (Mathematics)
Wingersky, Marilyn S.; Lord, Frederic M. – 1983
The sampling errors of maximum likelihood estimates of item-response theory parameters are studied in the case where both people and item parameters are estimated simultaneously. A check on the validity of the standard error formulas is carried out. The effect of varying sample size, test length, and the shape of the ability distribution is…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Banks, Latent Trait Theory
Jones, Eric D.; And Others – 1983
The purpose of this study was to evaluate the utility of out-of-level testing (OLT) when it is applied to the assessment of special education students with mild learning handicaps. This evaluation of OLT involved testing hypotheses related to: (1) the adequacy of vertical scaling, (2) the reliability and (3) the validity of OLT scores. Fifty-eight…
Descriptors: Educational Diagnosis, Error of Measurement, Guessing (Tests), Intermediate Grades
Cuttance, Peter F. – 1982
Covariance structure modelling is applied to the problem of estimating reliability and measurement error in survey data. To provide a basis for grouping certain question or variable types (data from questions), a simple typology based on the formal characteristics of the questions is outlined. From this classification, models for the different…
Descriptors: Analysis of Covariance, Correlation, Educational Research, Error of Measurement
Jones, Douglas H.; And Others – 1984
How accurately ability is estimated when the test model does not fit the data is considered. To address this question, this study investigated the accuracy of the maximum likelihood estimator of ability for the one-, two- and three-parameter logistic (PL) models. The models were fitted to generated item characteristic curves derived from the…
Descriptors: Ability, Aptitude Tests, Error of Measurement, Estimation (Mathematics)
van der Linden, Wim J. – 1982
A latent trait method is presented to investigate the possibility that Angoff or Nedelsky judges specify inconsistent probabilities in standard setting techniques for objectives-based instructional programs. It is suggested that judges frequently specify a low probability of success for an easy item but a large probability for a hard item. The…
Descriptors: Criterion Referenced Tests, Cutting Scores, Error of Measurement, Interrater Reliability
Mandeville, Garrett K. – 1978
The RMC Research Corporation evaluation model C1--the special regression model (SRM)--was evaluated through a series of computer simulations and compared with an alternative model, the norm referenced model (NRM). Using local data and national norm data to determine reasonable values for sample size and pretest posttest correlation parameters, the…
Descriptors: Analysis of Covariance, Error of Measurement, Intermediate Grades, Mathematical Models
Smith, Philip L. – 1980
Accurate estimation of variance components used in generalizability theory is essential for the theory to be viewed as an efficacious mechanism for studying the reliability and validity of a measurement procedure. This paper explores two alternatives for dealing with the apparent instability of small sample size used in determining the accuracy of…
Descriptors: Analysis of Variance, Error of Measurement, High Schools, Measurement Techniques