Longford, Nicholas Tibor – Journal of Educational and Behavioral Statistics, 2016
We address the problem of selecting the best of a set of units based on a criterion variable, when its value is recorded for every unit subject to estimation, measurement, or another source of error. The solution is constructed in a decision-theoretical framework, incorporating the consequences (ramifications) of the various kinds of error that…
Descriptors: Decision Making, Classification, Guidelines, Undergraduate Students
Sekercioglu, Güçlü – International Online Journal of Education and Teaching, 2018
Empirical evidence of measurement invariance across independent samples of a population implies that the factor structure of a measurement tool is equal across those samples; in other words, the tool measures the intended psychological trait with the same structure in each sample. In this case, the evidence of construct validity would be strengthened within the…
Descriptors: Factor Analysis, Error of Measurement, Factor Structure, Construct Validity
Tijmstra, Jesper; Hessen, David J.; van der Heijden, Peter G. M.; Sijtsma, Klaas – Psychometrika, 2013
Most dichotomous item response models share the assumption of latent monotonicity, which states that the probability of a positive response to an item is a nondecreasing function of a latent variable intended to be measured. Latent monotonicity cannot be evaluated directly, but it implies manifest monotonicity across a variety of observed scores,…
Descriptors: Item Response Theory, Statistical Inference, Probability, Psychometrics
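The implication the abstract describes can be illustrated with a naive check of manifest monotonicity: under latent monotonicity, the proportion of positive responses to an item should be nondecreasing in the rest score (the total on the remaining items). The function and data below are invented for illustration; the paper develops formal order-constrained inference, not this simple comparison.

```python
def manifest_monotonicity_violations(responses, item):
    """Count adjacent rest-score groups in which the observed proportion
    of positive responses to `item` decreases.

    `responses` is a list of equal-length 0/1 response vectors.
    """
    groups = {}  # rest score -> (group size, positives on `item`)
    for resp in responses:
        rest = sum(resp) - resp[item]  # total score minus the studied item
        n, k = groups.get(rest, (0, 0))
        groups[rest] = (n + 1, k + resp[item])
    # Proportions correct, ordered by rest score; count decreases.
    props = [groups[r][1] / groups[r][0] for r in sorted(groups)]
    return sum(1 for a, b in zip(props, props[1:]) if b < a)
```

A zero count is consistent with (but does not prove) latent monotonicity; any decrease flags a potential violation worth formal testing.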
Köhler, Carmen; Pohl, Steffi; Carstensen, Claus H. – Educational and Psychological Measurement, 2015
When competence tests are administered, subjects frequently omit items. These missing responses pose a threat to correctly estimating the proficiency level. Newer model-based approaches aim to take nonignorable missing data processes into account by incorporating a latent missing propensity into the measurement model. Two assumptions are typically…
Descriptors: Competence, Tests, Evaluation Methods, Adults
Pelanek, Radek – Journal of Educational Data Mining, 2015
Researchers use many different metrics for evaluation of performance of student models. The aim of this paper is to provide an overview of commonly used metrics, to discuss properties, advantages, and disadvantages of different metrics, to summarize current practice in educational data mining, and to provide guidance for evaluation of student…
Descriptors: Models, Data Analysis, Data Processing, Evaluation Criteria
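Among the metric families such overviews cover are error-based and classification-based measures of a student model's predicted probabilities of correct responses. A minimal sketch of two common ones (function names and toy data invented here, not taken from the paper):

```python
import math

def rmse(preds, truths):
    """Root mean squared error between predicted probabilities and 0/1 outcomes."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, truths)) / len(preds))

def accuracy(preds, truths, threshold=0.5):
    """Fraction of outcomes matched after thresholding the predictions."""
    return sum((p >= threshold) == bool(t) for p, t in zip(preds, truths)) / len(preds)

preds = [0.9, 0.7, 0.4, 0.2]   # predicted probabilities of a correct answer
truths = [1, 1, 0, 0]          # observed correctness
```

The two metrics reward different things: accuracy ignores calibration entirely, while RMSE penalizes confident wrong probabilities, which is one reason the choice of metric matters.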
Cai, Li; Monroe, Scott – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2014
We propose a new limited-information goodness of fit test statistic C₂ for ordinal IRT models. The construction of the new statistic lies formally between the M₂ statistic of Maydeu-Olivares and Joe (2006), which utilizes first and second order marginal probabilities, and the M*₂ statistic of Cai and Hansen…
Descriptors: Item Response Theory, Models, Goodness of Fit, Probability
Warrens, Matthijs J. – Psychometrika, 2008
We discuss properties that association coefficients may have in general, e.g., zero value under statistical independence, and we examine coefficients for 2×2 tables with respect to these properties. Furthermore, we study a family of coefficients that are linear transformations of the observed proportion of agreement given the marginal…
Descriptors: Probability, Error of Measurement, Psychometrics, Measurement Techniques
Hutchison, Dougal – Oxford Review of Education, 2008
There is a degree of instability in any measurement, so that if it is repeated, a different result may be obtained. Such instability, generally described as "measurement error", may affect the conclusions drawn from an investigation, and methods exist for allowing for it. It is less widely known that different disciplines, and…
Descriptors: Measurement Techniques, Data Analysis, Error of Measurement, Test Reliability
Gundersen, Craig; Kreider, Brent – Journal of Human Resources, 2008
Policymakers have been puzzled to observe that food stamp households appear more likely to be food insecure than observationally similar eligible nonparticipating households. We reexamine this issue allowing for nonclassical reporting errors in food stamp participation and food insecurity. Extending the literature on partially identified…
Descriptors: Security (Psychology), Poverty, Family (Sociological Unit), Measurement Techniques

Whitely, Susan E.; Dawis, Rene V. – Journal of Educational Measurement, 1974
Descriptors: Error of Measurement, Item Analysis, Matrices, Measurement Techniques

Westermann, Rainer; Hager, Willi – Journal of Educational Statistics, 1986
The well-known problem of cumulating error probabilities is reconsidered from a general epistemological perspective, namely, the concepts of severity and of fairness of tests. It is shown that not only Type 1 but also Type 2 errors can cumulate. A new adjustment strategy is proposed and applied. (Author/JAZ)
Descriptors: Educational Research, Error of Measurement, Hypothesis Testing, Measurement Techniques
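The cumulation at issue is easiest to see for Type 1 error: with k independent tests each run at level α, the probability of at least one false rejection is 1 − (1 − α)^k. A small sketch of that arithmetic, together with the familiar Bonferroni-style per-test adjustment (shown only for illustration; the paper proposes a different adjustment strategy):

```python
def familywise_type1(alpha, k):
    """Probability of at least one Type 1 error across k independent
    tests, each conducted at significance level alpha."""
    return 1 - (1 - alpha) ** k

def bonferroni_level(alpha, k):
    """Per-test level that keeps the familywise Type 1 rate at or below alpha."""
    return alpha / k
```

With ten tests at α = 0.05, the familywise rate already exceeds 0.40, which is why uncorrected multiple testing is considered unfair to the null hypothesis; the abstract's point is that analogous cumulation afflicts Type 2 errors as well.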
DeMars, Christine E. – Applied Psychological Measurement, 2004
Type I error rates were examined for several fit indices available in GGUM2000: extensions of Infit, Outfit, Andrich's χ², and the log-likelihood ratio χ². Infit and Outfit had Type I error rates much lower than nominal alpha. Andrich's χ² had Type I error rates much higher than nominal alpha, particularly for shorter tests or larger sample…
Descriptors: Likert Scales, Error of Measurement, Goodness of Fit, Psychological Studies
Brigman, S. Leellen; Bashaw, W. L. – 1976
Procedures are presented for equating simultaneously several tests which have been calibrated by the Rasch Model. Three multiple test equating designs are described. A Full Matrix Design equates each test to all others. A Chain Design links tests sequentially. A Vector Design equates one test to each of the other tests. For each design, the Rasch…
Descriptors: Ability, Achievement Tests, Computer Programs, Equated Scores
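In Rasch equating of the kind these designs organize, two tests sharing common items can be placed on one scale by shifting one test's item difficulties by the mean difficulty difference on the shared items; a Chain Design applies such links sequentially. A toy sketch of that mean-shift link (invented numbers and names, simple mean-sigma-style linking only, not the paper's procedures):

```python
def link_constant(difficulties_a, difficulties_b, common_items):
    """Mean-shift constant mapping test B's Rasch difficulty scale onto
    test A's, computed from items the two calibrations share."""
    diffs = [difficulties_a[i] - difficulties_b[i] for i in common_items]
    return sum(diffs) / len(diffs)

def equate_to_a(difficulties_b, constant):
    """Express all of test B's item difficulties on test A's scale."""
    return {item: b + constant for item, b in difficulties_b.items()}
```

Chaining repeats this pairwise link along a sequence of tests, whereas a Full Matrix Design links every pair directly; the designs trade off computation against the accumulation of linking error.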
Dirkzwager, Arie – International Journal of Testing, 2003
The crux in psychometrics is how to estimate the probability that a respondent answers an item correctly on one occasion out of many. Under the current testing paradigm this probability is estimated using all kinds of statistical techniques and mathematical modeling. Multiple evaluation is a new testing paradigm using the person's own personal…
Descriptors: Psychometrics, Probability, Models, Measurement
Kifer, Edward; Bramble, William – 1974
A latent trait model, the Rasch, was fitted to a criterion-referenced test. Approximately 90 percent of the items fit the model. Those items which fit the model were then calibrated. Based on the item calibration, individual ability estimates and the standard errors of those estimates were calculated. Using the ability estimates, it was possible,…
Descriptors: Academic Ability, Achievement Tests, Criterion Referenced Tests, Decision Making