Publication Date
In 2025 | 39 |
Since 2024 | 192 |
Since 2021 (last 5 years) | 495 |
Since 2016 (last 10 years) | 996 |
Since 2006 (last 20 years) | 2028 |
Descriptor
Error of Measurement | 3295 |
Statistical Analysis | 599 |
Scores | 504 |
Item Response Theory | 445 |
Correlation | 434 |
Comparative Analysis | 422 |
Foreign Countries | 415 |
Test Reliability | 408 |
Computation | 404 |
Simulation | 370 |
Reliability | 355 |
Audience
Researchers | 93 |
Practitioners | 23 |
Teachers | 22 |
Policymakers | 10 |
Administrators | 5 |
Students | 4 |
Counselors | 2 |
Parents | 2 |
Community | 1 |
Location
United States | 47 |
Germany | 42 |
Australia | 34 |
Canada | 27 |
Turkey | 27 |
California | 22 |
United Kingdom (England) | 20 |
Netherlands | 18 |
China | 16 |
New York | 15 |
United Kingdom | 15 |
What Works Clearinghouse Rating
Does not meet standards | 1 |
Turner, Kyle T.; Engelhard, George, Jr. – Measurement: Interdisciplinary Research and Perspectives, 2023
The purpose of this study is to illustrate the use of functional data analysis (FDA) as a general methodology for analyzing person response functions (PRFs). Applications of FDA to psychometrics have included the estimation of item response functions and latent distributions, as well as differential item functioning. Although FDA has been…
Descriptors: Data Analysis, Item Response Theory, Psychometrics, Statistical Distributions
Lockwood, Adam B.; Klatka, Kelsey; Parker, Brandon; Benson, Nicholas – Journal of Psychoeducational Assessment, 2023
Eighty Woodcock-Johnson IV Tests of Achievement protocols from 40 test administrators were examined to determine the types and frequencies of administration and scoring errors made. Non-critical errors (e.g., failure to record verbatim) were found on every protocol (M = 37.2). Critical (e.g., standard score, start point) errors were found on 98.8%…
Descriptors: Achievement Tests, Testing, Scoring, Error of Measurement
Mohsen Dolatabadi – Australian Journal of Applied Linguistics, 2023
Many datasets resulting from participant ratings for word norms and also concreteness ratios are available. However, the concreteness information of infrequent words and non-words is rare. This work aims to propose a model for estimating the concreteness of infrequent and new lexicons. Here, we used Lancaster sensory-motor word norms to predict…
Descriptors: Prediction, Validity, Models, Computational Linguistics
Wu, Tong – ProQuest LLC, 2023
This three-article dissertation aims to address three methodological challenges to ensure comparability in educational research, including scale linking, test equating, and propensity score (PS) weighting. The first study intends to improve test scale comparability by evaluating the effect of six missing data handling approaches, including…
Descriptors: Educational Research, Comparative Analysis, Equated Scores, Weighted Scores
van Rensburg, Clarisse; Mostert, Karina – Journal of Student Affairs in Africa, 2023
Student well-being has gradually become a topic of interest in higher education, and the accurate, valid, and reliable measure of well-being constructs is crucial in the South African context. This study examined item bias and configural, metric and scalar invariance of the Satisfaction with Life Scale (SWLS) for South African first-year…
Descriptors: Life Satisfaction, Measures (Individuals), Foreign Countries, College Freshmen
Reimers, Jennifer; Turner, Ronna C.; Tendeiro, Jorge N.; Lo, Wen-Juo; Keiffer, Elizabeth – Measurement: Interdisciplinary Research and Perspectives, 2023
Person-fit analyses are commonly used to detect aberrant responding in self-report data. Nonparametric person fit statistics do not require fitting a parametric test theory model and have performed well compared to other person-fit statistics. However, detection of aberrant responding has primarily focused on dominance response data, thus the…
Descriptors: Goodness of Fit, Nonparametric Statistics, Error of Measurement, Comparative Analysis
Martí, Mónica; Ródenas, Carmen – International Journal of Social Research Methodology, 2021
This paper analyses the reliability and accuracy of the relationships between migration and employment status when estimated using a linked data set. The analysis will be carried out using a new source, the "Labour and Geographical Mobility Statistics," which is provided by the Spanish Statistical Office. This statistic is constructed by…
Descriptors: Foreign Countries, Error of Measurement, Occupational Mobility, Migration
Jamshidi, Laleh; Declercq, Lies; Fernández-Castilla, Belén; Ferron, John M.; Moeyaert, Mariola; Beretvas, S. Natasha; Van den Noortgate, Wim – Journal of Experimental Education, 2021
Previous research found bias in the estimate of the overall fixed effects and variance components using multilevel meta-analyses of standardized single-case data. Therefore, we evaluate two adjustments in an attempt to reduce the bias and improve the statistical properties of the parameter estimates. The results confirm the existence of bias when…
Descriptors: Statistical Bias, Multivariate Analysis, Meta Analysis, Research Design
Arribas, E.; Escobar, I.; Ramirez-Vazquez, R. – International Journal of Mathematical Education in Science and Technology, 2021
In the article 'How Long Is My Toilet Roll--A Simple Exercise in Mathematical Modelling' several models of increasing complexity are introduced and solved to calculate indirectly the length of paper on a toilet-roll. All these results are presented without errors. The authors of this comment believe the error analysis of measurements made in a…
Descriptors: Mathematics Instruction, Teaching Methods, Mathematical Models, Computation
Casabianca, Jodi M. – Educational Measurement: Issues and Practice, 2021
Module Overview: In this digital ITEMS module, Dr. Jodi M. Casabianca provides a primer on the "hierarchical rater model" (HRM) framework and the recent expansions to the model for analyzing raters and ratings of constructed responses. In the first part of the module, she establishes an understanding of the nature of constructed…
Descriptors: Hierarchical Linear Modeling, Rating Scales, Error of Measurement, Item Response Theory
Cross, Rod – Physics Teacher, 2021
A common procedure when conducting physics experiments is to repeat a measurement several times to calculate the mean and standard deviation. That might be the only instruction we give to students as a means to minimize random errors. However, that technique does not guarantee that the answer will be correct. It might give the same wrong answer…
Descriptors: Physics, Science Experiments, Computation, Error of Measurement
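The procedure this abstract describes (repeating a measurement, then computing the mean and standard deviation) can be sketched as below. The readings, the true value, and the constant 0.5 offset are made-up illustrative numbers, not data from the article. The sketch only illustrates the general point that averaging narrows random scatter but leaves a constant offset in place.

```python
import statistics

def summarize(readings):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    n = len(readings)
    mean = statistics.mean(readings)
    sd = statistics.stdev(readings)   # sample SD, uses the n - 1 denominator
    sem = sd / n ** 0.5               # standard error of the mean
    return mean, sd, sem

# Hypothetical setup: every reading shares the same systematic offset.
true_value = 10.0
offset = 0.5                          # assumed constant instrument bias
noise = [-0.2, 0.1, 0.0, 0.2, -0.1]   # assumed random scatter
readings = [true_value + offset + e for e in noise]

mean, sd, sem = summarize(readings)
print(f"mean = {mean:.3f}, sd = {sd:.3f}, sem = {sem:.3f}")
print(f"distance from true value = {abs(mean - true_value):.3f}")
```

In this made-up case the mean sits several standard errors away from the true value, so a small standard deviation on its own says nothing about whether the result is correct.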
Fernández-Castilla, Belén; Declercq, Lies; Jamshidi, Laleh; Beretvas, S. Natasha; Onghena, Patrick; Van den Noortgate, Wim – Journal of Experimental Education, 2021
This study explores the performance of classical methods for detecting publication bias--namely, Egger's regression test, Funnel Plot test, Begg's Rank Correlation and Trim and Fill method--in meta-analysis of studies that report multiple effects. Publication bias, outcome reporting bias, and a combination of these were generated. Egger's…
Descriptors: Statistical Bias, Meta Analysis, Publications, Regression (Statistics)
Höhne, Jan Karem; Krebs, Dagmar – International Journal of Social Research Methodology, 2021
Measuring respondents' attitudes is a crucial task in numerous social science disciplines. A popular way to measure attitudes is to use survey questions with rating scales. However, research has shown that especially the design of rating scales can have a profound impact on respondents' answer behavior. While some scale design aspects, such as…
Descriptors: Attitude Measures, Rating Scales, Telephone Surveys, Response Style (Tests)
Rank-Normalization, Folding, and Localization: An Improved [R-Hat] for Assessing Convergence of MCMC
Aki Vehtari; Andrew Gelman; Daniel Simpson; Bob Carpenter; Paul-Christian Burkner – Grantee Submission, 2021
Markov chain Monte Carlo is a key computational tool in Bayesian statistics, but it can be challenging to monitor the convergence of an iterative stochastic algorithm. In this paper we show that the convergence diagnostic [R-hat] of Gelman and Rubin (1992) has serious flaws. Traditional [R-hat] will fail to correctly diagnose convergence failures…
Descriptors: Markov Processes, Monte Carlo Methods, Bayesian Statistics, Efficiency
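For orientation, a minimal sketch of the traditional [R-hat] statistic that this abstract criticizes, in its commonly cited textbook form: the square root of the ratio of a pooled variance estimate to the mean within-chain variance. This is not the rank-normalized, folded version the paper proposes, and it omits the degrees-of-freedom correction in the original 1992 formulation. The function name and the toy chains are assumptions made for illustration.

```python
import statistics

def traditional_rhat(chains):
    """Potential scale reduction factor for equal-length chains.

    chains: list of m sequences, each holding n draws of one scalar quantity.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")

    chain_means = [statistics.mean(c) for c in chains]
    # W: mean of the within-chain sample variances
    w = statistics.mean([statistics.variance(c) for c in chains])
    # B: between-chain variance, scaled by the chain length
    b = n * statistics.variance(chain_means)
    # Pooled estimate of the marginal posterior variance
    var_plus = (n - 1) / n * w + b / n
    return (var_plus / w) ** 0.5

# Toy chains (hypothetical): two that agree and two with different locations.
agreeing = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
shifted = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]

print(traditional_rhat(agreeing))
print(traditional_rhat(shifted))
```

Values well above 1 indicate that the chains disagree about location. Because the statistic is built only from means and variances, it can miss other kinds of disagreement between chains, which is the sort of failure the abstract refers to.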
Sophie Lilit Litschwartz – ProQuest LLC, 2021
In education research test scores are a common object of analysis. Across studies test scores can be an important outcome, a highly predictive covariate, or a means of assigning treatment. However, test scores are a measure of an underlying proficiency we can't observe directly and so contain error. This measurement error has implications for how…
Descriptors: Scores, Inferences, Educational Research, Evaluation Methods