Monahan, Patrick O.; Ankenmann, Robert D. – Applied Psychological Measurement, 2010
When the matching score is either less than perfectly reliable or not a sufficient statistic for determining latent proficiency in data conforming to item response theory (IRT) models, Type I error (TIE) inflation may occur for the Mantel-Haenszel (MH) procedure or any differential item functioning (DIF) procedure that matches on summed-item…
Descriptors: Error of Measurement, Item Response Theory, Test Bias, Scores
Effects of Ventilation on Segmental Bioimpedance Spectroscopy Measures Using Generalizability Theory
Turner, A. Allan; Lozano-Nieto, Albert; Bouffard, Marcel – Measurement in Physical Education and Exercise Science, 2010
The purpose of this study was to examine the effect of three ventilation conditions (i.e., normal, regimented, and no-ventilation) on the reproducibility of bioimpedance scores in humans for the forearm and trunk segments. One hundred able-bodied North American men and women, from 18 to 71 years of age, volunteered as participants. The…
Descriptors: Ventilation, Generalizability Theory, Spectroscopy, Scores
Burns, Matthew K.; Scholin, Sarah E.; Kosciolek, Stacey; Livingston, Judy – Journal of Psychoeducational Assessment, 2010
The current study examines the consistency of two response-to-intervention (RTI) decision-making models. Weekly progress monitoring data for 30 students participating in a Tier II intervention were collected for 30 weeks. The data were examined by comparing them to an aimline with a yearly goal and by computing a dual discrepancy (DD) using…
Descriptors: Reading Achievement, Reading Tests, Data Collection, Responses
Wen, Zhonglin; Marsh, Herbert W.; Hau, Kit-Tai – Structural Equation Modeling: A Multidisciplinary Journal, 2010
Standardized parameter estimates are routinely used to summarize the results of multiple regression models of manifest variables and structural equation models of latent variables, because they facilitate interpretation. Although the typical standardization of interaction terms is not appropriate for multiple regression models, straightforward…
Descriptors: Structural Equation Models, Multiple Regression Analysis, Interaction, Computation
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2009
We derive an estimator of the standardized value which, under the standard assumptions of normality and homoscedasticity, is more efficient than the established (asymptotically efficient) estimator and discuss its gains for small samples. (Contains 1 table and 3 figures.)
Descriptors: Efficiency, Computation, Statistics, Sample Size
Sijtsma, Klaas – Psychometrika, 2009
This discussion paper argues that the use of Cronbach's alpha, both as a reliability estimate and as a measure of internal consistency, suffers from major problems. First, alpha always has a value, which cannot be equal to the test score's reliability given the inter-item covariance matrix and the usual assumptions about measurement error. Second, in…
Descriptors: Measurement, Error of Measurement, Scores, Computation
Wang, Lihui; Lawson, Michael J.; Curtis, David D. – Language Teaching Research, 2015
Imagery training has been shown to improve reading comprehension. Recent research has also shown that the quality of visual mental imagery used is important for reading comprehension. A review of literature shows that there has been relatively little detailed research on the quality of imagery used by learners, especially in the case of students…
Descriptors: Educational Quality, Teaching Methods, English (Second Language), Second Language Learning
Hung, Su-Pin; Chen, Po-Hsi; Chen, Hsueh-Chih – Creativity Research Journal, 2012
Product assessment is widely applied in creative studies, typically as an important dependent measure. Within this context, this study had 2 purposes. First, the focus of this research was on methods for investigating possible rater effects, an issue that has not received a great deal of attention in past creativity studies. Second, the…
Descriptors: Item Response Theory, Creativity, Interrater Reliability, Undergraduate Students
Liu, Qin – Association for Institutional Research, 2012
This discussion constructs a survey data quality strategy for institutional researchers in higher education in light of total survey error theory. It starts with describing the characteristics of institutional research and identifying the gaps in literature regarding survey data quality issues in institutional research and then introduces the…
Descriptors: Institutional Research, Higher Education, Quality Control, Researchers
Kolen, Michael J.; Lee, Won-Chan – Educational Measurement: Issues and Practice, 2011
This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…
Descriptors: Test Use, Test Format, Error of Measurement, Raw Scores
Lee, C. Matthew; Gorelick, Mark – Measurement in Physical Education and Exercise Science, 2011
The purpose of this study was to examine the validity of the Smarthealth watch (Salutron, Inc., Fremont, California, USA), a heart rate monitor that includes a wristwatch without an accompanying chest strap. Twenty-five individuals participated in 3-min periods of standing, 2.0 mph walking, 3.5 mph walking, 4.5 mph jogging, and 6.0 mph running.…
Descriptors: Metabolism, Intervals, Physical Activities, Validity
Steele, Joel S.; Ferrer, Emilio – Multivariate Behavioral Research, 2011
We examine emotion self-regulation and coregulation in romantic couples using daily self-reports of positive and negative affect. We fit these data using a damped linear oscillator model specified as a latent differential equation to investigate affect dynamics at the individual level and coupled influences for the 2 partners in each couple.…
Descriptors: Affective Behavior, Calculus, Models, Females
Wang, Binhong – English Language Teaching, 2010
This paper first analyzed two studies on rater factors and rating criteria to raise the problem of rater agreement. After that the author reveals the causes of discrepancies in rating administration by discussing rater variability and rater bias. The author argues that rater bias cannot be eliminated completely; we can only reduce the error to a…
Descriptors: Interrater Reliability, Examiners, Training, Bias
Haberman, Shelby J. – Educational Testing Service, 2010
Sampling errors limit the accuracy with which forms can be linked. Limitations on accuracy are especially important in testing programs in which a very large number of forms are employed. Standard inequalities in mathematical statistics may be used to establish lower bounds on the achievable linking accuracy. To illustrate results, a variety of…
Descriptors: Testing Programs, Equated Scores, Sampling, Accuracy
Dimitrov, Dimiter M. – Mid-Western Educational Researcher, 2010
The focus of this presidential address is on the contemporary treatment of reliability and validity in educational assessment. Highlights on reliability are provided under the classical true-score model using tools from latent trait modeling to clarify important assumptions and procedures for reliability estimation. In addition to reliability,…
Descriptors: Educational Assessment, Validity, Item Response Theory, Reliability

