Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 1 |
Since 2006 (last 20 years) | 8 |
Descriptor
Error of Measurement | 12 |
Probability | 12 |
Reliability | 12 |
Measurement | 4 |
Classification | 3 |
Test Items | 3 |
True Scores | 3 |
Academic Achievement | 2 |
Correlation | 2 |
Cutting Scores | 2 |
Evaluation | 2 |
More ▼ |
Source
Author
Blai, Boris, Jr. | 1 |
Bramley, Tom | 1 |
Emons, Wilco H. M. | 1 |
Kannan, Priya | 1 |
Katz, Irvin R. | 1 |
Kruyen, Peter M. | 1 |
Li, Tenglong | 1 |
Livingston, Samuel A. | 1 |
Marcoulides, George A. | 1 |
Munyofu, Paul | 1 |
Pelanek, Radek | 1 |
More ▼ |
Publication Type
Journal Articles | 7 |
Reports - Evaluative | 4 |
Reports - Research | 4 |
Dissertations/Theses -… | 1 |
Reports - Descriptive | 1 |
Speeches/Meeting Papers | 1 |
Education Level
Audience
Location
Pennsylvania | 1 |
United Kingdom (England) | 1 |
Laws, Policies, & Programs
Assessments and Surveys
California Learning… | 1 |
Work Keys (ACT) | 1 |
What Works Clearinghouse Rating
Raykov, Tenko; Marcoulides, George A.; Li, Tenglong – Educational and Psychological Measurement, 2018
This note extends the results in the 2016 article by Raykov, Marcoulides, and Li to the case of correlated errors in a set of observed measures subjected to principal component analysis. It is shown that when at least two measures are fallible, the probability is zero for any principal component--and in particular for the first principal…
Descriptors: Factor Analysis, Error of Measurement, Correlation, Reliability
Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015
The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…
Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items
Pelanek, Radek – Journal of Educational Data Mining, 2015
Researchers use many different metrics for evaluation of performance of student models. The aim of this paper is to provide an overview of commonly used metrics, to discuss properties, advantages, and disadvantages of different metrics, to summarize current practice in educational data mining, and to provide guidance for evaluation of student…
Descriptors: Models, Data Analysis, Data Processing, Evaluation Criteria
Zhang, Zuoyi – ProQuest LLC, 2011
This thesis consists of two parts. In Chapter 2, we focus on the spatial scan statistics with overdispersion and Chapter 3 is devoted to the randomized permutation test for identifying local patterns of spatial association. The spatial scan statistic has been widely used in spatial disease surveillance and spatial cluster detection. To apply it, a…
Descriptors: Statistical Distributions, Probability, Cluster Grouping, Multivariate Analysis
Kruyen, Peter M.; Emons, Wilco H. M.; Sijtsma, Klaas – International Journal of Testing, 2012
Personnel selection shows an enduring need for short stand-alone tests consisting of, say, 5 to 15 items. Despite their efficiency, short tests are more vulnerable to measurement error than longer test versions. Consequently, the question arises to what extent reducing test length deteriorates decision quality due to increased impact of…
Descriptors: Measurement, Personnel Selection, Decision Making, Error of Measurement
Munyofu, Paul – Performance Improvement Quarterly, 2010
The state of Pennsylvania, like many organizations interested in performance improvement, routinely engages in professional development activities. Educators in this hands-on activity engaged in setting meaningful criterion-referenced cut scores for career and technical education assessments using two methods. The main purposes of this study were…
Descriptors: Standard Setting, Cutting Scores, Professional Development, Vocational Education
Bramley, Tom – Educational Research, 2010
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
Descriptors: National Curriculum, Educational Research, Testing, Measurement
Solano-Flores, Guillermo – Educational Researcher, 2008
The testing of English language learners (ELLs) is, to a large extent, a random process because of poor implementation and factors that are uncertain or beyond control. Yet current testing practices and policies appear to be based on deterministic views of language and linguistic groups and erroneous assumptions about the capacity of assessment…
Descriptors: Generalizability Theory, Testing, Second Language Learning, Error of Measurement

Zimmerman, Donald W. – Psychological Reports, 1971
Descriptors: Error of Measurement, Mathematical Concepts, Measurement, Models
Livingston, Samuel A. – 1976
A distinction is made between reliability of measurement and reliability of classification; the "criterion-referenced reliability coefficient" describes the former. Application of this coefficient to the probability distribution of possible scores for a single student yields a meaningful way to describe the reliability of a single score. (Author)
Descriptors: Classification, Criterion Referenced Tests, Error of Measurement, Measurement
Blai, Boris, Jr. – 1971
Statistics are an essential tool for making proper judgement decisions. It is concerned with probability distribution models, testing of hypotheses, significance tests and other means of determining the correctness of deductions and the most likely outcome of decisions. Measures of central tendency include the mean, median and mode. A second…
Descriptors: Analysis of Variance, Correlation, Error of Measurement, Hypothesis Testing
Wang, Tianyou; And Others – 1996
M. J. Kolen, B. A. Hanson, and R. L. Brennan (1992) presented a procedure for assessing the conditional standard error of measurement (CSEM) of scale scores using a strong true-score model. They also investigated the ways of using nonlinear transformation from number-correct raw score to scale score to equalize the conditional standard error along…
Descriptors: Ability, Classification, Error of Measurement, Goodness of Fit