Publication Date
In 2025 | 39 |
Since 2024 | 192 |
Since 2021 (last 5 years) | 495 |
Since 2016 (last 10 years) | 996 |
Since 2006 (last 20 years) | 2028 |
Descriptor
Source
Author
Publication Type
Education Level
Audience
Researchers | 93 |
Practitioners | 23 |
Teachers | 22 |
Policymakers | 10 |
Administrators | 5 |
Students | 4 |
Counselors | 2 |
Parents | 2 |
Community | 1 |
Location
United States | 47 |
Germany | 42 |
Australia | 34 |
Canada | 27 |
Turkey | 27 |
California | 22 |
United Kingdom (England) | 20 |
Netherlands | 18 |
China | 16 |
New York | 15 |
United Kingdom | 15 |
More ▼ |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Does not meet standards | 1 |
Guhn, Martin; Gadermann, Anne; Zumbo, Bruno D. – Early Education and Development, 2007
The present study investigates whether the Early Development Instrument (Offord & Janus, 1999) measures school readiness similarly across different groups of children. We employ ordinal logistic regression to investigate differential item functioning, a method of examining measurement bias. For 40,000 children, our analysis compares groups…
Descriptors: School Readiness, Kindergarten, Child Development, Program Validation
Chris G. Richardson; Pamela A. Ratner; Bruno D. Zumbo – Educational and Psychological Measurement, 2007
The purpose of this investigation was to test the age-related measurement invariance and temporal stability of the 13-item version of Antonovsky's Sense of Coherence Scale (SOC). Multigroup structural equation modeling of longitudinal data from the Canadian National Population Health Survey was used to examine the measurement invariance across 3…
Descriptors: Foreign Countries, Measures (Individuals), Structural Equation Models, Investigations
Livingston, Samuel A.; Lewis, Charles – 1993
This paper presents a method for estimating the accuracy and consistency of classifications based on test scores. The scores can be produced by any scoring method, including the formation of a weighted composite. The estimates use data from a single form. The reliability of the score is used to estimate its effective test length in terms of…
Descriptors: Classification, Error of Measurement, Estimation (Mathematics), Reliability
Sheehan, Kathleen M.; Mislevy, Robert J. – 1988
In many practical applications of item response theory, the parameters of overlapping subsets of test items are estimated from different samples of examinees. A linking procedure is then employed to place the resulting item parameter estimates onto a common scale. It is standard practice to ignore the uncertainty associated with the linking step…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Response Theory, Measurement Techniques
De Ayala, R. J.; And Others – 1995
Expected a posteriori has a number of advantages over maximum likelihood estimation or maximum a posteriori (MAP) estimation methods. These include ability estimates (thetas) for all response patterns, less regression towards the mean than MAP ability estimates, and a lower average squared error. R. D. Bock and R. J. Mislevy (1982) state that the…
Descriptors: Adaptive Testing, Bayesian Statistics, Error of Measurement, Estimation (Mathematics)
Wingersky, Marilyn S. – 1989
In a variable-length adaptive test with a stopping rule that relied on the asymptotic standard error of measurement of the examinee's estimated true score, M. S. Stocking (1987) discovered that it was sufficient to know the examinee's true score and the number of items administered to predict with some accuracy whether an examinee's true score was…
Descriptors: Adaptive Testing, Bayesian Statistics, Error of Measurement, Estimation (Mathematics)
Lockridge, Jewel – 1997
Researchers persist in using stepwise regression in spite of problems with this approach. As noted by B. Thompson (1995), three problems accompany the use of stepwise applications. The first is that computer packages may use incorrect degrees of freedom in their computations, resulting in a greater likelihood of obtaining a spurious statistical…
Descriptors: Computer Oriented Programs, Error of Measurement, Predictor Variables, Research Methodology
Betebenner, Damian W. – 1998
The zeitgeist for reform in education precipitated a number of changes in assessment. Among these are performance assessments, sometimes linked to "high stakes" accountability decisions. In some instances, the trustworthiness of these decisions is based on variance components and error variances derived through generalizability theory.…
Descriptors: Accountability, Educational Change, Error of Measurement, Generalizability Theory
Schumacker, Randall E. – 1998
In comparing measurement theories, it is evident that the awareness of the concept of measurement error during the time of Galileo has lead to the formulation of observed scores comprising a true score and error (classical theory), universe score and various random error components (generalizability theory), or individual latent ability and error…
Descriptors: Comparative Analysis, Computer Software, Error of Measurement, Generalizability Theory
van der Linden, Wim J.; Glas, Cees A. W. – 1998
In adaptive testing, item selection is sequentially optimized during the test. Since the optimization takes place over a pool of items calibrated with estimation error, capitalization on these errors is likely to occur. How serious the consequences of this phenomenon are depends not only on the distribution of the estimation errors in the pool or…
Descriptors: Ability, Adaptive Testing, Computer Assisted Testing, Error of Measurement

Sanders, Steven G. – Journal of College Science Teaching, 1975
Several techniques to use in evaluation and grading are presented. Some grading problems are discussed briefly. (PEB)
Descriptors: Error of Measurement, Evaluation, Evaluation Methods, Grading
Haberman, Shelby J. – ETS Research Report Series, 2005
In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean-squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…
Descriptors: Scores, Test Items, Error of Measurement, Computation
Gardner, Eric – 1989
Five of the common misuses of tests are reviewed: (1) acceptance of the test title as an accurate and complete description of the variable being measured (failure to examine the manual and the items carefully to know the specific aspects to be tested can result in misuse through selection of an inappropriate test for a particular purpose or…
Descriptors: Error of Measurement, Evaluation Problems, Examiners, Scoring
Baldwin, Beatrice; Lomax, Richard – 1990
This LISREL study examines the robustness of the maximum likelihood estimates under varying degrees of measurement model misspecification. A true model containing five latent variables (two endogenous and three exogenous) and two indicator variables per latent variable was used. Measurement model misspecification considered included errors of…
Descriptors: Computer Software, Error of Measurement, Item Response Theory, Mathematical Models
Zwick, Rebecca – 1986
Most currently used measures of inter-rater agreement for the nominal case incorporate a correction for "chance agreement." The definition of chance agreement is not the same for all coefficients, however. Three chance-corrected coefficients are Cohen's Kappa; Scott's Pi; and the S index of Bennett, Goldstein, and Alpert, which has…
Descriptors: Error of Measurement, Interrater Reliability, Mathematical Models, Measurement Techniques