Publication Date
In 2025 | 5 |
Since 2024 | 19 |
Since 2021 (last 5 years) | 40 |
Since 2016 (last 10 years) | 78 |
Since 2006 (last 20 years) | 217 |
Descriptor
Error of Measurement | 355 |
Reliability | 355 |
Scores | 104 |
Correlation | 67 |
Statistical Analysis | 64 |
Validity | 63 |
Psychometrics | 47 |
Generalizability Theory | 44 |
Measurement Techniques | 44 |
Computation | 39 |
Foreign Countries | 36 |
Author
Raykov, Tenko | 11 |
Henson, Robin K. | 7 |
Kolen, Michael J. | 5 |
Livingston, Samuel A. | 5 |
Sijtsma, Klaas | 5 |
Fan, Xitao | 4 |
Haberman, Shelby J. | 4 |
Marcoulides, George A. | 4 |
Capraro, Robert M. | 3 |
Feldt, Leonard S. | 3 |
Lee, Guemin | 3 |
Audience
Researchers | 9 |
Policymakers | 1 |
Practitioners | 1 |
Students | 1 |
Teachers | 1 |
Location
United States | 7 |
Canada | 6 |
North Carolina | 5 |
Pennsylvania | 5 |
Portugal | 4 |
Spain | 4 |
Turkey | 4 |
Australia | 3 |
California | 3 |
China | 3 |
Germany | 3 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 3 |
Elementary and Secondary… | 1 |
Elementary and Secondary… | 1 |
Guaranteed Student Loan… | 1 |
Race to the Top | 1 |
Hsin-Yun Lee; You-Lin Chen; Li-Jen Weng – Journal of Experimental Education, 2024
The second version of Kaiser's Measure of Sampling Adequacy (MSA₂) has been widely applied to assess the factorability of data in psychological research. The MSA₂ is developed in the population and little is known about its behavior in finite samples. If estimated MSA₂s are biased due to sampling errors,…
Descriptors: Error of Measurement, Reliability, Sampling, Statistical Bias
William C. M. Belzak; Daniel J. Bauer – Journal of Educational and Behavioral Statistics, 2024
Testing for differential item functioning (DIF) has undergone rapid statistical developments recently. Moderated nonlinear factor analysis (MNLFA) allows for simultaneous testing of DIF among multiple categorical and continuous covariates (e.g., sex, age, ethnicity, etc.), and regularization has shown promising results for identifying DIF among…
Descriptors: Test Bias, Algorithms, Factor Analysis, Error of Measurement
Jonas Flodén – British Educational Research Journal, 2025
This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…
Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring
Stephanie M. Bell; R. Philip Chalmers; David B. Flora – Educational and Psychological Measurement, 2024
Coefficient omega indices are model-based composite reliability estimates that have become increasingly popular. A coefficient omega index estimates how reliably an observed composite score measures a target construct as represented by a factor in a factor-analysis model; as such, the accuracy of omega estimates is likely to depend on correct…
Descriptors: Influences, Models, Measurement Techniques, Reliability
Pornphan Sureeyatanapas; Panitas Sureeyatanapas; Uthumporn Panitanarak; Jittima Kraisriwattana; Patchanan Sarootyanapat; Daniel O'Connell – Language Testing in Asia, 2024
Ensuring consistent and reliable scoring is paramount in education, especially in performance-based assessments. This study delves into the critical issue of marking consistency, focusing on speaking proficiency tests in English language learning, which often face greater reliability challenges. While existing literature has explored various…
Descriptors: Foreign Countries, Students, English Language Learners, Speech
A. E. Ades; Nicky J. Welton; Sofia Dias; David M. Phillippo; Deborah M. Caldwell – Research Synthesis Methods, 2024
Network meta-analysis (NMA) is an extension of pairwise meta-analysis (PMA) which combines evidence from trials on multiple treatments in connected networks. NMA delivers internally consistent estimates of relative treatment efficacy, needed for rational decision making. Over its first 20 years NMA's use has grown exponentially, with applications…
Descriptors: Network Analysis, Meta Analysis, Medicine, Clinical Experience
Phillip K. Wood – Structural Equation Modeling: A Multidisciplinary Journal, 2024
The logistic and confined exponential curves are frequently used in studies of growth and learning. These models, which are nonlinear in their parameters, can be estimated using structural equation modeling software. This paper proposes a single combined model, a weighted combination of both models. Mplus, Proc Calis, and lavaan code for the model…
Descriptors: Structural Equation Models, Computation, Computer Software, Weighted Scores
Tenko Raykov – Educational and Psychological Measurement, 2024
This note is concerned with the benefits that can result from the use of the maximal reliability and optimal linear combination concepts in educational and psychological research. Within the widely used framework of unidimensional multi-component measuring instruments, it is demonstrated that the linear combination of their components that…
Descriptors: Educational Research, Behavioral Science Research, Reliability, Error of Measurement
Yan Xia; Selim Havan – Educational and Psychological Measurement, 2024
Although parallel analysis has been found to be an accurate method for determining the number of factors in many conditions with complete data, its application under missing data is limited. The existing literature recommends that, after using an appropriate multiple imputation method, researchers either apply parallel analysis to every imputed…
Descriptors: Data Interpretation, Factor Analysis, Statistical Inference, Research Problems
John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring constructed response items (i.e. rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
Teck Kiang Tan – Practical Assessment, Research & Evaluation, 2024
The procedures of carrying out factorial invariance to validate a construct were well developed to ensure the reliability of the construct that can be used across groups for comparison and analysis, yet mainly restricted to the frequentist approach. This motivates an update to incorporate the growing Bayesian approach for carrying out the Bayesian…
Descriptors: Bayesian Statistics, Factor Analysis, Programming Languages, Reliability
Vispoel, Walter P.; Lee, Hyeryung; Xu, Guanlan; Hong, Hyeri – Journal of Experimental Education, 2023
Although generalizability theory (GT) designs have traditionally been analyzed within an ANOVA framework, identical results can be obtained with structural equation models (SEMs) but extended to represent multiple sources of both systematic and measurement error variance, include estimation methods less likely to produce negative variance…
Descriptors: Generalizability Theory, Structural Equation Models, Programming Languages, Scores
Mohammad Mehdi Latifi; Dariush Tahmasebi Aghbelaghi; Sajad Khani Pordanjani – European Journal of Education, 2025
The present study sought to assess the psychometric properties of the Iranian adaptation of the Vietnam Teacher Resilience Scale for Asia (VITRS), referred to as the Iranian Teachers' Resilience Scale (ITRS) and to examine its measurement invariance across middle and high school teachers in Iran. In total, 700 participants completed the…
Descriptors: Resilience (Psychology), Error of Measurement, Factor Analysis, Teacher Attitudes
Najera, Hector – Measurement: Interdisciplinary Research and Perspectives, 2023
Measurement error affects the quality of population orderings of an index and, hence, increases the misclassification of the poor and the non-poor groups and affects statistical inferences from binary regression models. Hence, the conclusions about the extent, profile, and distribution of poverty are likely to be misleading. However, the size and…
Descriptors: Poverty, Error of Measurement, Classification, Statistical Inference
Zachary del Rosario – Journal of Statistics and Data Science Education, 2024
Variability is underemphasized in domains such as engineering. Statistics and data science education research offers a variety of frameworks for understanding variability, but new frameworks for domain applications are necessary. This study investigated the professional practices of working engineers to develop such a framework. The Neglected,…
Descriptors: Foreign Countries, Engineering Education, Engineering, Technical Occupations