Publication Date
| Period | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 59 |
| Since 2022 (last 5 years) | 416 |
| Since 2017 (last 10 years) | 919 |
| Since 2007 (last 20 years) | 1970 |
Audience
| Group | Records |
| --- | --- |
| Researchers | 93 |
| Practitioners | 23 |
| Teachers | 22 |
| Policymakers | 10 |
| Administrators | 5 |
| Students | 4 |
| Counselors | 2 |
| Parents | 2 |
| Community | 1 |
Location
| Place | Records |
| --- | --- |
| United States | 47 |
| Germany | 42 |
| Australia | 34 |
| Canada | 27 |
| Turkey | 27 |
| California | 22 |
| United Kingdom (England) | 20 |
| Netherlands | 18 |
| China | 17 |
| New York | 15 |
| United Kingdom | 15 |
What Works Clearinghouse Rating
| Rating | Records |
| --- | --- |
| Does not meet standards | 1 |
Augustin Mutak; Robert Krause; Esther Ulitzsch; Sören Much; Jochen Ranger; Steffi Pohl – Journal of Educational Measurement, 2024
Understanding the intraindividual relation between an individual's speed and ability in testing scenarios is essential to ensure fair assessment. Different approaches exist for estimating this relationship, relying either on specific study designs or on specific assumptions. This paper aims to add to the toolbox of approaches for estimating…
Descriptors: Testing, Academic Ability, Time on Task, Correlation
Ben Kelcey; Fangxing Bai; Amota Ataneka; Yanli Xie; Kyle Cox – Society for Research on Educational Effectiveness, 2024
We consider a class of multiple-group individually-randomized group trials (IRGTs) that introduces a (partially) cross-classified structure in the treatment condition (only). The novel feature of this design is that the nature of the treatment induces a clustering structure that involves two or more non-nested groups among individuals in the…
Descriptors: Randomized Controlled Trials, Research Design, Statistical Analysis, Error of Measurement
Timothy R. Konold; Elizabeth A. Sanders; Kelvin Afolabi – Structural Equation Modeling: A Multidisciplinary Journal, 2025
Measurement invariance (MI) is an essential part of validity evidence concerned with ensuring that tests function similarly across groups, contexts, and time. Most evaluations of MI involve multigroup confirmatory factor analyses (MGCFA) that assume simple structure. However, recent research has shown that constraining non-target indicators to…
Descriptors: Evaluation Methods, Error of Measurement, Validity, Monte Carlo Methods
Joyce M. W. Moonen-van Loon; Jeroen Donkers – Practical Assessment, Research & Evaluation, 2025
The reliability of assessment tools is critical for accurately monitoring student performance in various educational contexts. When multiple assessments are combined to form an overall evaluation, each assessment serves as a data point contributing to the student's performance within a broader educational framework. Determining composite…
Descriptors: Programming Languages, Reliability, Evaluation Methods, Student Evaluation
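The composite-reliability idea this abstract describes can be sketched in classical test theory: the reliability of a weighted sum of assessments depends on each component's weight, spread, and reliability, and on the covariances among components. A minimal illustration follows; all weights, standard deviations, reliabilities, and correlations below are made-up values, not figures from the paper, and errors are assumed uncorrelated across assessments.

```python
import numpy as np

# Hypothetical inputs for three assessments combined into one overall grade.
w = np.array([0.5, 0.3, 0.2])        # weights of the three assessments
sd = np.array([10.0, 8.0, 12.0])     # observed-score standard deviations
rel = np.array([0.85, 0.75, 0.80])   # per-assessment reliability coefficients
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])      # observed-score correlations

cov = np.outer(sd, sd) * R           # observed covariance matrix
var_comp = w @ cov @ w               # variance of the weighted composite
err_var = np.sum(w**2 * sd**2 * (1 - rel))  # composite error variance
rho_comp = 1 - err_var / var_comp    # composite reliability
print(round(rho_comp, 3))            # → 0.906
```

Note that the covariances among assessments raise the composite's true-score variance, which is why the composite can be more reliable than any single component.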
Yongyun Shin; Stephen W. Raudenbush – Grantee Submission, 2025
Consider the conventional multilevel model Y = Cγ + Zu + e, where γ represents fixed effects and (u, e) are multivariate normal random effects. The continuous outcomes Y and covariates C are fully observed, with Z a subset of C. The parameters are θ = (γ, var(u), var(e)). Dempster, Rubin and Tsutakawa (1981) framed the estimation as a…
Descriptors: Hierarchical Linear Modeling, Maximum Likelihood Statistics, Sampling, Error of Measurement
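A model of the form Y = Cγ + Zu + e can be sketched numerically. The following is a minimal illustration, not the authors' estimator: a random-intercept model where, with the variance components treated as known, γ is recovered by generalized least squares against the marginal covariance V = var(u)·ZZᵀ + var(e)·I.

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_per = 30, 10
N = n_groups * n_per

# Design: intercept + one covariate; a random intercept per group.
x = rng.normal(size=N)
C = np.column_stack([np.ones(N), x])
group = np.repeat(np.arange(n_groups), n_per)

gamma_true = np.array([1.0, 2.0])
var_u, var_e = 0.5, 1.0
u = rng.normal(scale=np.sqrt(var_u), size=n_groups)
y = C @ gamma_true + u[group] + rng.normal(scale=np.sqrt(var_e), size=N)

# GLS estimate of gamma given the variance components:
Z = np.zeros((N, n_groups))
Z[np.arange(N), group] = 1.0
V = var_u * Z @ Z.T + var_e * np.eye(N)   # marginal covariance of Y
Vinv = np.linalg.inv(V)
gamma_hat = np.linalg.solve(C.T @ Vinv @ C, C.T @ Vinv @ y)
print(gamma_hat)  # close to [1.0, 2.0]
```

In full maximum-likelihood estimation the variance components are unknown and this GLS step becomes one piece of an iterative scheme (e.g., EM), which is the setting the abstract refers to.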
Charles Darr – set: Research Information for Teachers, 2025
This Assessment News article examines the role of measurement error in standardised tests and what it means for progress measures. Charles Darr explains that standardised test scores and the progress trajectories we infer from them do not necessarily reflect a student's true achievement level. Measurement error is the normal day-to-day variation…
Descriptors: Standardized Tests, Scores, Test Interpretation, Error of Measurement
Seyma Erbay Mermer – Pegem Journal of Education and Instruction, 2024
This study aims to compare item and student parameters of dichotomously scored multidimensional constructs estimated based on unidimensional and multidimensional Item Response Theory (IRT) under different conditions of sample size, interdimensional correlation and number of dimensions. This research, conducted with simulations, is of a basic…
Descriptors: Item Response Theory, Correlation, Error of Measurement, Comparative Analysis
Julian Schuessler; Peter Selb – Sociological Methods & Research, 2025
Directed acyclic graphs (DAGs) are now a popular tool to inform causal inferences. We discuss how DAGs can also be used to encode theoretical assumptions about nonprobability samples and survey nonresponse and to determine whether population quantities including conditional distributions and regressions can be identified. We describe sources of…
Descriptors: Data Collection, Graphs, Error of Measurement, Statistical Bias
Oliver Lüdtke; Alexander Robitzsch – Journal of Experimental Education, 2025
There is a longstanding debate on whether the analysis of covariance (ANCOVA) or the change score approach is more appropriate when analyzing non-experimental longitudinal data. In this article, we use a structural modeling perspective to clarify that the ANCOVA approach is based on the assumption that all relevant covariates are measured (i.e.,…
Descriptors: Statistical Analysis, Longitudinal Studies, Error of Measurement, Hierarchical Linear Modeling
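The ANCOVA-versus-change-score contrast in this entry can be shown on simulated pre/post data. This is a generic sketch, not the authors' analysis: with a randomized treatment both estimators recover the same effect; the debate the abstract addresses concerns non-experimental data, where pretest differences between groups make them diverge (Lord's paradox).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
pre = rng.normal(size=n)
treat = rng.integers(0, 2, size=n).astype(float)  # randomized assignment
effect = 0.5
post = 0.6 * pre + effect * treat + rng.normal(scale=0.8, size=n)

# ANCOVA: regress post on treatment, adjusting for the pretest.
X = np.column_stack([np.ones(n), treat, pre])
b_ancova = np.linalg.lstsq(X, post, rcond=None)[0]

# Change-score approach: regress (post - pre) on treatment.
X2 = np.column_stack([np.ones(n), treat])
b_change = np.linalg.lstsq(X2, post - pre, rcond=None)[0]

print(b_ancova[1], b_change[1])  # both near the true effect of 0.5 here
```

If `treat` were instead correlated with `pre` (self-selected groups), the two treatment coefficients would no longer agree, which is the situation the structural-modeling perspective in the article clarifies.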
Alessandra Rister Portinari Maranca; Jihoon Chung; Musashi Hinck; Adam D. Wolsky; Naoki Egami; Brandon M. Stewart – Sociological Methods & Research, 2025
Generative artificial intelligence (AI) has shown remarkable leaps in performance across many data modalities, including text, images, audio, and video. This affords social scientists the ability to annotate variables of interest from unstructured media. While rapidly improving, these methods are far from perfect and, as we show, even…
Descriptors: Error of Measurement, Artificial Intelligence, Documentation, Visual Aids
Xiaowen Liu – International Journal of Testing, 2024
Differential item functioning (DIF) often arises from multiple sources. Within the context of multidimensional item response theory, this study examined DIF items with varying secondary dimensions using three DIF methods: SIBTEST, Mantel-Haenszel, and logistic regression. The effect of the number of secondary dimensions on DIF detection rates…
Descriptors: Item Analysis, Test Items, Item Response Theory, Correlation
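Of the three DIF methods this entry names, the logistic-regression approach is the simplest to sketch: regress the item response on a matching variable and a group indicator, and read uniform DIF off the group coefficient. The sketch below is illustrative only (it uses the latent ability directly as the matching variable, where real applications use the observed total score, and the Newton-Raphson fitter is a bare-bones stand-in for a proper GLM routine).

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson fit of a logistic regression; returns coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1 - p)                      # IRLS weights
        H = X.T @ (X * W[:, None])           # observed information
        beta += np.linalg.solve(H, X.T @ (y - p))
    return beta

rng = np.random.default_rng(2)
n = 4000
theta = rng.normal(size=n)            # latent ability
group = rng.integers(0, 2, size=n)    # reference (0) vs. focal (1) group
dif = 0.8                             # uniform DIF: item easier for group 1
logit = 1.2 * theta - 0.5 + dif * group
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

X = np.column_stack([np.ones(n), theta, group])
beta = fit_logistic(X, y)
print(beta[2])  # group coefficient recovers the DIF effect (near 0.8)
```

Adding a group-by-matching-variable interaction term to `X` would extend this to nonuniform DIF, the case where the item discriminates differently across groups.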
Gökhan Iskifoglu – Turkish Online Journal of Educational Technology - TOJET, 2024
This research paper investigated the importance of conducting measurement invariance analysis when developing measurement tools for assessing differences between and among study variables. Most studies that develop an inventory to assess the existence of an attitude, behavior, belief, IQ, or intuition in a person's…
Descriptors: Testing, Testing Problems, Error of Measurement, Attitude Measures
Qian Zhang; Qi Wang – Structural Equation Modeling: A Multidisciplinary Journal, 2024
In this article, we focused on the issues of measurement error and omitted confounders when conducting mediation analysis in experimental studies. Depending on the informativeness of the confounders between the mediator (M) and outcome (Y), we described two approaches. When researchers are confident that primary confounders are included (e.g.,…
Descriptors: Error of Measurement, Research and Development, Mediation Theory, Causal Models
Mark White; Matt Ronfeldt – Educational Assessment, 2024
Standardized observation systems seek to reliably measure a specific conceptualization of teaching quality, managing rater error through mechanisms such as certification, calibration, validation, and double-scoring. These mechanisms both support high quality scoring and generate the empirical evidence used to support the scoring inference (i.e.,…
Descriptors: Interrater Reliability, Quality Control, Teacher Effectiveness, Error Patterns
Hwanggyu Lim; Danqi Zhu; Edison M. Choe; Kyung T. Han – Journal of Educational Measurement, 2024
This study presents a generalized version of the residual differential item functioning (RDIF) detection framework in item response theory, named GRDIF, to analyze differential item functioning (DIF) in multiple groups. The GRDIF framework retains the advantages of the original RDIF framework, such as computational efficiency and ease of…
Descriptors: Item Response Theory, Test Bias, Test Reliability, Test Construction
