Magis, David; Facon, Bruno – Educational and Psychological Measurement, 2013
Item purification is an iterative process that is often advocated as improving the identification of items affected by differential item functioning (DIF). With test-score-based DIF detection methods, item purification iteratively removes the items currently flagged as DIF from the test scores to get purified sets of items, unaffected by DIF. The…
Descriptors: Test Bias, Test Items, Statistical Analysis, Error of Measurement
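The Magis and Facon entry describes iterative item purification for DIF detection. A minimal sketch of that loop, assuming dichotomous responses and using a simple standardized p-difference (STD P-DIF) on a rest score as the flagging statistic; the detection methods studied in the paper may differ, and the threshold below is illustrative.

    import numpy as np

    def std_p_dif(responses, group, item, keep):
        """Standardized p-difference for one item, matching on the rest score
        built from the currently non-flagged ('kept') items."""
        rest = [j for j in keep if j != item]
        match = responses[:, rest].sum(axis=1)
        num, den = 0.0, 0.0
        for s in np.unique(match):
            ref = (group == 0) & (match == s)
            foc = (group == 1) & (match == s)
            if ref.any() and foc.any():
                w = foc.sum()  # weight strata by focal-group size
                num += w * (responses[foc, item].mean() - responses[ref, item].mean())
                den += w
        return num / den if den else 0.0

    def purify(responses, group, threshold=0.10, max_iter=20):
        """Re-flag DIF items iteratively until the flagged set stabilizes."""
        n_items = responses.shape[1]
        flagged = set()
        for _ in range(max_iter):
            keep = [j for j in range(n_items) if j not in flagged]
            new_flags = {j for j in range(n_items)
                         if abs(std_p_dif(responses, group, j, keep)) > threshold}
            if new_flags == flagged:
                break
            flagged = new_flags
        return flagged

    # Illustrative usage with synthetic, DIF-free data (should flag little or nothing).
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(500, 20))
    g = rng.integers(0, 2, size=500)
    print(purify(X, g))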
Woodruff, David; Traynor, Anne; Cui, Zhongmin; Fang, Yu – ACT, Inc., 2013
Professional standards for educational testing recommend that both the overall standard error of measurement and the conditional standard error of measurement (CSEM) be computed on the score scale used to report scores to examinees. Several methods have been developed to compute scale score CSEMs. This paper compares three methods, based on…
Descriptors: Comparative Analysis, Error of Measurement, Scores, Scaling
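The Woodruff et al. entry concerns conditional standard errors of measurement reported on the score scale. A hedged sketch of one classical ingredient: Lord's binomial-error CSEM on the raw-score metric, carried to a hypothetical raw-to-scale conversion via the local slope of the conversion (a delta-method-style approximation). The three methods the paper actually compares are not reproduced here.

    import numpy as np

    n_items = 40
    raw = np.arange(n_items + 1)
    csem_raw = np.sqrt(raw * (n_items - raw) / (n_items - 1))   # Lord's binomial-error CSEM

    # Hypothetical, illustrative raw-to-scale conversion (monotone lookup table).
    scale = np.round(np.linspace(1, 36, n_items + 1))

    # Approximate scale-score CSEM by scaling the raw CSEM with the local slope
    # of the conversion function.
    slope = np.gradient(scale, raw)
    csem_scale = csem_raw * slope
    print(dict(zip(raw.tolist()[:5], np.round(csem_scale, 2)[:5].tolist())))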
Schweig, Jonathan – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2013
Measuring school and classroom environments has become central in a nationwide effort to develop comprehensive programs that measure teacher quality and teacher effectiveness. Formulating successful programs necessitates accurate and reliable methods for measuring these environmental variables. This paper uses a generalizability theory framework…
Descriptors: Error of Measurement, Hierarchical Linear Modeling, Educational Environment, Classroom Environment
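The Schweig entry applies a generalizability theory framework to classroom environment measures. A minimal sketch, assuming the simplest one-facet design (students nested in classrooms) and synthetic data: estimate variance components from a one-way random-effects ANOVA and form a generalizability coefficient for classroom means. The paper's actual design and facets may be richer.

    import numpy as np

    rng = np.random.default_rng(1)
    n_class, n_stud = 30, 25
    class_eff = rng.normal(0, 0.5, n_class)                  # true classroom variance = 0.25
    y = class_eff[:, None] + rng.normal(0, 1.0, (n_class, n_stud))

    ms_between = n_stud * y.mean(axis=1).var(ddof=1)         # between-classroom mean square
    ms_within = y.var(axis=1, ddof=1).mean()                 # pooled within-classroom mean square
    var_class = max((ms_between - ms_within) / n_stud, 0.0)  # ANOVA variance-component estimate
    var_resid = ms_within

    g_coef = var_class / (var_class + var_resid / n_stud)    # generalizability of classroom means
    print(round(var_class, 3), round(var_resid, 3), round(g_coef, 3))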
Huang, Francis L. – Practical Assessment, Research & Evaluation, 2014
Clustered data (e.g., students within schools) are often analyzed in educational research where data are naturally nested. As a consequence, multilevel modeling (MLM) has commonly been used to study the contextual or group-level (e.g., school) effects on individual outcomes. The current study investigates the use of an alternative procedure to…
Descriptors: Hierarchical Linear Modeling, Regression (Statistics), Educational Research, Sampling
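The Huang entry compares multilevel modeling with an alternative procedure for clustered data. One common alternative is ordinary regression with cluster-robust (sandwich) standard errors; whether that is the specific procedure in the truncated abstract is an assumption, and the variable names below (school, pretest, outcome) are illustrative.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(2)
    schools = np.repeat(np.arange(50), 20)
    school_eff = rng.normal(0, 0.4, 50)[schools]             # shared school-level effect
    pretest = rng.normal(0, 1, schools.size)
    outcome = 0.5 * pretest + school_eff + rng.normal(0, 1, schools.size)
    df = pd.DataFrame({"school": schools, "pretest": pretest, "outcome": outcome})

    naive = smf.ols("outcome ~ pretest", data=df).fit()      # ignores clustering
    robust = smf.ols("outcome ~ pretest", data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["school"]})  # cluster-robust SEs
    print(naive.bse["pretest"], robust.bse["pretest"])

The robust standard error is typically larger here because the naive model treats the 1,000 students as independent observations.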
Min, Shangchao; He, Lianzhen – Language Testing, 2014
This study examined the relative effectiveness of the multidimensional bi-factor model and multidimensional testlet response theory (TRT) model in accommodating local dependence in testlet-based reading assessment with both dichotomously and polytomously scored items. The data used were 14,089 test-takers' item-level responses to the testlet-based…
Descriptors: Foreign Countries, Item Response Theory, Reading Tests, Test Items
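The Min and He entry contrasts the bi-factor and testlet response theory models for testlet-based items. A minimal sketch of the item response functions involved, with illustrative parameter values: the bi-factor 2PL gives the testlet factor its own slope, while the testlet model applies the item's single discrimination to the person-by-testlet effect, making it a constrained special case.

    import numpy as np

    def p_bifactor(theta_g, theta_t, a_g, a_t, d):
        # General slope a_g and a separate testlet slope a_t.
        return 1.0 / (1.0 + np.exp(-(a_g * theta_g + a_t * theta_t + d)))

    def p_testlet(theta_g, gamma_t, a, b):
        # Same discrimination a applies to both ability and the testlet effect.
        return 1.0 / (1.0 + np.exp(-a * (theta_g - gamma_t - b)))

    print(p_bifactor(0.5, 0.2, 1.2, 0.6, -0.3), p_testlet(0.5, 0.2, 1.2, 0.3))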
Citkowicz, Martyna; Polanin, Joshua R. – Society for Research on Educational Effectiveness, 2014
Meta-analyses are syntheses of effect-size estimates obtained from a collection of studies to summarize a particular field or topic (Hedges, 1992; Lipsey & Wilson, 2001). These reviews are used to integrate knowledge that can inform both scientific inquiry and public policy; it is therefore important to ensure that the estimates of the effect…
Descriptors: Meta Analysis, Accountability, Cluster Grouping, Effect Size
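The Citkowicz and Polanin entry concerns effect-size estimates in meta-analysis, with clustering among its descriptors. A hedged sketch of fixed-effect inverse-variance pooling, plus an illustrative design-effect correction for effect sizes drawn from clustered samples (variance inflated by 1 + (m - 1) x ICC); the specific adjustments in the paper's truncated text are not reproduced, and all numbers below are made up for illustration.

    import numpy as np

    d = np.array([0.20, 0.35, 0.10, 0.50])        # study effect sizes
    v = np.array([0.010, 0.020, 0.015, 0.030])    # sampling variances, ignoring clustering
    m = np.array([1, 25, 1, 30])                  # average cluster size per study
    icc = 0.10
    v_adj = v * (1 + (m - 1) * icc)               # design-effect inflation for clustered studies

    w = 1.0 / v_adj                               # inverse-variance weights
    pooled = np.sum(w * d) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    print(round(pooled, 3), round(se, 3))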
Aydin, Burak; Leite, Walter L.; Algina, James – Educational and Psychological Measurement, 2016
We investigated methods of including covariates in two-level models for cluster randomized trials to increase power to detect the treatment effect. We compared multilevel models that included either an observed cluster mean or a latent cluster mean as a covariate, as well as the effect of including Level 1 deviation scores in the model. A Monte…
Descriptors: Error of Measurement, Predictor Variables, Randomized Controlled Trials, Experimental Groups
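The Aydin, Leite, and Algina entry compares covariate specifications in two-level models. A minimal sketch of the two quantities in the abstract that can be computed directly from data, the observed cluster mean and the Level-1 deviation (group-mean-centered) score; the latent cluster mean requires a multilevel measurement model and is not shown. Variable names are illustrative.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(3)
    df = pd.DataFrame({
        "cluster": np.repeat(np.arange(40), 15),
        "x": rng.normal(0, 1, 600),
    })
    df["x_cluster_mean"] = df.groupby("cluster")["x"].transform("mean")  # observed cluster mean
    df["x_deviation"] = df["x"] - df["x_cluster_mean"]                   # Level-1 deviation score
    print(df.head())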
Raymond, Mark R.; Swygert, Kimberly A.; Kahraman, Nilufer – Journal of Educational Measurement, 2012
Although a few studies report sizable score gains for examinees who repeat performance-based assessments, research has not yet addressed the reliability and validity of inferences based on ratings of repeat examinees on such tests. This study analyzed scores for 8,457 single-take examinees and 4,030 repeat examinees who completed a 6-hour clinical…
Descriptors: Physicians, Licensing Examinations (Professions), Performance Based Assessment, Repetition
Zopluoglu, Cengiz; Davenport, Ernest C., Jr. – Educational and Psychological Measurement, 2012
The generalized binomial test (GBT) and ω indices are the most recent methods suggested in the literature to detect answer copying behavior on multiple-choice tests. The ω index is one of the most studied indices, but there has not yet been a systematic simulation study for the GBT index. In addition, the effect of the ability levels…
Descriptors: Statistical Analysis, Error of Measurement, Simulation, Multiple Choice Tests
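The Zopluoglu and Davenport entry concerns GBT-style answer-copying indices. A hedged sketch of the core computation: given item-by-item probabilities that the suspected copier would match the source by chance, the number of matches follows a Poisson-binomial distribution, and the index is the upper-tail probability of the observed match count. How those match probabilities are obtained in the study (from IRT ability estimates) is not reproduced here, and the probabilities below are illustrative.

    import numpy as np

    def poisson_binomial_tail(p_match, observed_matches):
        """P(X >= observed_matches) where X is a sum of independent Bernoulli(p_i)."""
        dist = np.array([1.0])                    # distribution of the match count, starting at 0 items
        for p in p_match:
            dist = np.convolve(dist, [1 - p, p])  # fold in one more item
        return dist[observed_matches:].sum()

    p_match = np.array([0.30, 0.25, 0.40, 0.20, 0.35, 0.30, 0.25, 0.45, 0.30, 0.20])
    print(round(poisson_binomial_tail(p_match, 8), 4))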
Zhang, Guangjian; Preacher, Kristopher J.; Jennrich, Robert I. – Psychometrika, 2012
The infinitesimal jackknife, a nonparametric method for estimating standard errors, has been used to obtain standard error estimates in covariance structure analysis. In this article, we adapt it for obtaining standard errors for rotated factor loadings and factor correlations in exploratory factor analysis with sample correlation matrices. Both…
Descriptors: Factor Analysis, Maximum Likelihood Statistics, Error of Measurement, Nonparametric Statistics
Taylor, Matthew A.; Skourides, Andreas; Alvero, Alicia M. – Journal of Organizational Behavior Management, 2012
Interval recording procedures are used by persons who collect data through observation to estimate the cumulative occurrence and nonoccurrence of behavior/events. Although interval recording procedures can increase the efficiency of observational data collection, they can also induce error from the observer. In the present study, 50 observers were…
Descriptors: Safety, Behavior, Error of Measurement, Observation
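The Taylor, Skourides, and Alvero entry concerns error induced by interval recording procedures. A minimal sketch on simulated data showing one well-known source of that error: partial-interval recording scores an interval as an occurrence if the behavior happens at any point within it, which overestimates the true proportion of time the behavior occurs. Parameters are illustrative, not from the study.

    import numpy as np

    rng = np.random.default_rng(4)
    seconds = rng.random(3600) < 0.15                # second-by-second behavior stream, true rate 15%
    true_rate = seconds.mean()

    interval_len = 10                                # 10-second observation intervals
    intervals = seconds.reshape(-1, interval_len)
    partial_interval_estimate = intervals.any(axis=1).mean()   # occurrence anywhere in the interval
    momentary_sample_estimate = intervals[:, 0].mean()         # score only the first second

    print(round(true_rate, 3), round(partial_interval_estimate, 3), round(momentary_sample_estimate, 3))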
Raykov, Tenko; Marcoulides, George A. – Structural Equation Modeling: A Multidisciplinary Journal, 2012
A latent variable modeling method is outlined, which accomplishes estimation of criterion validity and reliability for a multicomponent measuring instrument with hierarchical structure. The approach provides point and interval estimates for the scale criterion validity and reliability coefficients, and can also be used for testing composite or…
Descriptors: Predictive Validity, Reliability, Structural Equation Models, Measures (Individuals)
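The Raykov and Marcoulides entry estimates criterion validity and reliability for a multicomponent instrument via latent variable modeling. A hedged sketch of the kind of quantities involved, using the familiar single-factor composite case rather than the paper's hierarchical model: composite reliability from loadings and error variances, and the model-implied correlation between the composite and a criterion that correlates rho with the factor. All parameter values are illustrative.

    import numpy as np

    loadings = np.array([0.7, 0.8, 0.6, 0.75])     # standardized loadings
    errors = 1 - loadings**2                       # error variances for standardized items
    rho_factor_criterion = 0.5                     # factor-criterion correlation

    sum_lam = loadings.sum()
    var_composite = sum_lam**2 + errors.sum()
    reliability = sum_lam**2 / var_composite                              # composite reliability (omega)
    validity = sum_lam * rho_factor_criterion / np.sqrt(var_composite)    # composite-criterion correlation
    print(round(reliability, 3), round(validity, 3))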
Ip, Edward Hak-Sing; Chen, Shyh-Huei – Applied Psychological Measurement, 2012
The problem of fitting unidimensional item-response models to potentially multidimensional data has been extensively studied. The focus of this article is on response data that contains a major dimension of interest but that may also contain minor nuisance dimensions. Because fitting a unidimensional model to multidimensional data results in…
Descriptors: Measurement, Item Response Theory, Scores, Computation
Westfall, Peter H.; Henning, Kevin S. S.; Howell, Roy D. – Structural Equation Modeling: A Multidisciplinary Journal, 2012
This article shows how interfactor correlation is affected by error correlations. Theoretical and practical justifications for error correlations are given, and a new equivalence class of models is presented to explain the relationship between interfactor correlation and error correlations. The class allows simple, parsimonious modeling of error…
Descriptors: Psychometrics, Correlation, Error of Measurement, Structural Equation Models
Pelanek, Radek – Journal of Educational Data Mining, 2015
Researchers use many different metrics for evaluation of performance of student models. The aim of this paper is to provide an overview of commonly used metrics, to discuss properties, advantages, and disadvantages of different metrics, to summarize current practice in educational data mining, and to provide guidance for evaluation of student…
Descriptors: Models, Data Analysis, Data Processing, Evaluation Criteria
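The Pelanek entry surveys metrics for evaluating student models. A minimal sketch of three metrics commonly reported for models that predict the probability of a correct response: RMSE, log loss, and AUC. The predictions and outcomes below are illustrative, not drawn from the paper.

    import numpy as np
    from sklearn.metrics import roc_auc_score, log_loss

    y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])                   # observed correctness
    y_pred = np.array([0.9, 0.3, 0.6, 0.8, 0.4, 0.7, 0.2, 0.5])   # predicted P(correct)

    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    ll = log_loss(y_true, y_pred)
    auc = roc_auc_score(y_true, y_pred)
    print(round(rmse, 3), round(ll, 3), round(auc, 3))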